News
ATHENE papers at EMNLP 2025
Five papers by ATHENE researchers have been accepted for presentation at this year's Empirical Methods in Natural Language Processing (EMNLP) conference. EMNLP is one of the leading international scientific conferences in the fields of natural language processing and artificial intelligence. Organised annually by the Association for Computational Linguistics (ACL), it brings together researchers from academia and industry to present the latest results, methods and applications in NLP. EMNLP is considered one of the three most important conferences for research on empirical methods in machine language understanding, including machine learning, language generation and analysis, and deep learning methods for text data.

The following papers have been accepted:
CodeSSM: Towards State Space Models for Code Understanding
Authors: Shweta Verma, Abhinav Anand and Mira Mezini
In this paper, the researchers introduce CodeSSM, the first comprehensively evaluated AI model for understanding and analysing software that is based on state space models (SSMs). Unlike the widely used Transformer models, CodeSSM is highly efficient: it requires significantly less training data and memory, learns exceptionally quickly (high sample efficiency), and can reliably process even very long source code files. A particular success: CodeSSM has outperformed comparable Transformers in many classic tasks, especially in vulnerability detection. CodeSSM was developed as part of an ATHENE project.
More about the paper
Preemptive Detection and Correction of Misaligned Actions in LLM Agents
Authors: Haishuo Fang, Xiaodan Zhu and Iryna Gurevych
This paper presents InferAct, a system that detects when an AI agent is about to perform an incorrect or potentially dangerous action. Examples include completing an unwanted online transaction or executing an erroneous command in the system. It uses the ability of large language models to understand human intentions to check whether the AI's behaviour remains consistent with the user's goal. In this way, InferAct helps to prevent errors at an early stage, making AI agents safer and more trustworthy.
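The preemptive check described above can be sketched in a few lines. This is purely illustrative: InferAct uses a large language model to reason about the user's intention, whereas the `judge_action` heuristic below is a toy stand-in of our own, not the method from the paper.

```python
# Illustrative sketch of a preemptive action check: before an agent
# executes a proposed action, a "judge" inspects it against the user's
# stated goal and blocks actions that appear misaligned. The keyword
# rule below stands in for the LLM-based intent check described in the
# paper; it is NOT InferAct's actual method.

def judge_action(user_goal: str, proposed_action: str) -> bool:
    """Return True if the action appears consistent with the goal."""
    # Stand-in heuristic: flag irreversible actions that the goal
    # never mentions (a real system would query an LLM here).
    risky_verbs = {"purchase", "delete", "transfer"}
    goal_words = set(user_goal.lower().split())
    action_words = set(proposed_action.lower().split())
    risky = risky_verbs & action_words
    return not risky or bool(risky & goal_words)

def run_agent_step(user_goal: str, proposed_action: str) -> str:
    """Execute the action only if the judge approves it."""
    if judge_action(user_goal, proposed_action):
        return f"executed: {proposed_action}"
    return f"blocked for review: {proposed_action}"

# The goal never asked to buy anything, so the purchase is intercepted
# before it happens rather than corrected after the fact.
print(run_agent_step("compare laptop prices", "purchase laptop now"))
```

The key design point, mirrored in the sketch, is that the check runs *before* the action is executed, so errors such as an unwanted online transaction can be stopped rather than undone.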
This paper was written as part of the ATHENE project, "Safeguarding LLMs against Misleading Evidence Attacks (SafeLLMs)", which falls under the Reliable and Verifiable Information through Secure Media (REVISE) research area.
More about the paper
Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions
Authors: Rachneet Sachdeva, Rima Hazra, Iryna Gurevych
Large language models such as ChatGPT or GPT-4 can be led into giving problematic or even dangerous answers through carefully crafted prompts, even when they have been trained for safe behaviour. In the paper, the researchers show how such "logical deceptions" work and introduce POATE, a new method that exposes weaknesses in the reasoning logic of AI systems. Building on these findings, they also develop methods that make the models more robust against such manipulations.
This paper was written as part of the ATHENE project, "Safeguarding LLMs against Misleading Evidence Attacks (SafeLLMs)", which falls under the Reliable and Verifiable Information through Secure Media (REVISE) research area.
More about the paper
Droid: A Resource Suite for AI-Generated Code Detection
Authors: Daniil Orel, Indraneil Paul, Iryna Gurevych and Preslav Nakov
AI systems are increasingly writing program code, but it is often difficult to tell whether a given piece of code was written by a human or a machine. In their paper, the researchers present DroidCollection, the most comprehensive open data collection to date for detecting AI-generated code, and show how such detection systems can be deliberately deceived. Building on this, they develop DroidDetect, a robust detection method that reliably identifies even deliberately manipulated code variants, thereby contributing to cybersecurity in software development.
This paper was written as part of the ATHENE project, "Trustworthy and Explainable AI-generated Text Detection (TXAITD)", which falls under the Reliable and Verifiable Information through Secure Media (REVISE) research area.
More about the paper
Other papers by ATHENE researchers accepted at EMNLP include:
Judging Quality Across Languages
Authors: Mehdi Ali, Manuel Brack, Max Lübbering, Elias Wendt, Abbas Goher Khan, Richard Rutmann, Alex Jude, Maurice Kraus, Alexander Arno Weber, David Kaczér, Florian Mai, Lucie Flek, Rafet Sifa, Nicolas Flores-Herr, Joachim Köhler, Patrick Schramowski, Michael Fromm, Kristian Kersting
In their paper, the researchers present a new LLM-as-a-judge approach for selecting high-quality training data for AI systems: a language model evaluates the quality of text in 35 languages, keeping the best material and filtering out the rest. This matters because the quality and balance of training data directly determine how fair, accurate, and multilingual future AI models can be.
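The filtering idea can be sketched minimally. In the sketch below, `score_quality` is a toy stand-in for the multilingual judge model from the paper; the length-and-punctuation heuristic and the threshold are our own illustrative assumptions, not the authors' scoring criteria.

```python
# Illustrative sketch of LLM-as-a-judge data filtering: every candidate
# training document receives a quality score, and only documents above
# a threshold are kept for training. In the real pipeline the score
# comes from a language model judging text in 35 languages; here a toy
# heuristic stands in for that judge.

def score_quality(text: str) -> float:
    """Toy proxy score: longer, properly terminated text scores higher."""
    words = text.split()
    if not words:
        return 0.0
    score = min(len(words) / 20.0, 1.0)  # reward substance, capped at 1
    if text.rstrip().endswith((".", "!", "?")):
        score += 0.5  # reward well-formed sentence endings
    return score

def filter_corpus(docs: list[str], threshold: float = 0.6) -> list[str]:
    """Keep only documents whose judge score meets the threshold."""
    return [d for d in docs if score_quality(d) >= threshold]

corpus = [
    "asdf qwer zxcv",
    "A well-formed paragraph with enough substance to be useful for training a model.",
]
kept = filter_corpus(corpus)  # only the second document survives
```

The design choice the paper exploits is that judging quality is much easier than generating it, so a single judge model can curate training data across many languages at once.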
More about the paper
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
Authors: Jakub Macina, Nico Daheim, Ido Hakimi, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan
From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning
Authors: David Dinucu-Jianu, Jakub Macina, Nico Daheim, Ido Hakimi, Iryna Gurevych, Mrinmaya Sachan
From 4 to 9 November, the researchers will present their papers at EMNLP 2025 in Suzhou, China.
