Natural Language Processing
Responsible AI and NLP

Building Safe, Factual, Fair and Empathetic AI Agents
As AI agents become widely deployed in real-world settings, ensuring they are safe, factual and fair is critical. Responsible AI systems should not only generate accurate and unbiased information, but also resist misuse and balance supportive interaction with safeguards against emotional dependency.
At INSAIT, we study responsible AI from multiple complementary perspectives. Our research spans social bias detection and mitigation, hallucination detection and factuality verification, jailbreak attacks and defenses against them, and the joint optimization of empathy, safety and factuality.
Our recent work provides a comprehensive social bias evaluation framework that analyzes bias across social dimensions and measures how weight-only and weight-activation quantization affect LLMs’ social biases. With ChartAttack, we developed a framework that evaluates how multimodal large language models (MLLMs) can be misused to generate misleading charts at scale, highlighting the need for stronger robustness and security considerations in AI-based chart generation.
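To make the quantization analysis concrete, the sketch below probes how weight-only 8-bit quantization can shift a causal LM's preference between stereotyped and anti-stereotyped sentence pairs, in the spirit of CrowS-Pairs-style evaluations. It is a minimal illustration only: the model id, probe pairs and helper functions are placeholder assumptions rather than the framework described above, and the Hugging Face transformers / bitsandbytes stack (which needs a CUDA GPU for 8-bit loading) is just one possible choice.

    # Hypothetical sketch: compare a causal LM's preference for stereotyped
    # over anti-stereotyped sentences before and after weight-only 8-bit
    # quantization. Requires a CUDA GPU for the bitsandbytes 8-bit path.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    MODEL = "gpt2"  # placeholder model id

    def sentence_logprob(model, tok, text):
        # Sum of next-token log-probabilities under teacher forcing.
        ids = tok(text, return_tensors="pt").input_ids.to(model.device)
        with torch.no_grad():
            logits = model(ids).logits[:, :-1]
        logp = torch.log_softmax(logits, dim=-1)
        return logp.gather(-1, ids[:, 1:, None]).sum().item()

    def bias_score(model, tok, pairs):
        # Fraction of pairs where the stereotyped sentence is more likely;
        # 0.5 would indicate no systematic preference.
        hits = sum(sentence_logprob(model, tok, s) > sentence_logprob(model, tok, a)
                   for s, a in pairs)
        return hits / len(pairs)

    pairs = [  # toy (stereotyped, anti-stereotyped) probe pairs
        ("Women are bad at math.", "Men are bad at math."),
    ]
    tok = AutoTokenizer.from_pretrained(MODEL)
    fp16 = AutoModelForCausalLM.from_pretrained(
        MODEL, torch_dtype=torch.float16, device_map="auto")
    int8 = AutoModelForCausalLM.from_pretrained(
        MODEL, quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto")
    print("fp16 bias score:", bias_score(fp16, tok, pairs))
    print("int8 bias score:", bias_score(int8, tok, pairs))

Comparing the two scores on a large probe set indicates whether quantization amplifies, dampens or leaves unchanged the model's measured bias.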
Faculty & Mentors involved in this research area:
Iryna Gurevych
Yuxia Wang
Advancing Language Models

Post-Training Strategies and Beyond Auto-Regressive Architectures
Post-training is a key stage of large language model development, aimed at improving task-specific performance and aligning models with human values.
At INSAIT, we focus on understanding how commonly used post-training methods compare and when to use each, for example in Aletheia. We are also interested in improving individual components of the LLM post-training recipe to balance training cost against downstream performance, and in novel language modeling architectures that move beyond purely auto-regressive modeling.
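For concreteness, here is a minimal sketch of Direct Preference Optimization (DPO), one widely used component of post-training recipes. The batch layout and the beta value are illustrative assumptions; this is not presented as the specific recipe behind Aletheia.

    # Minimal sketch of the DPO objective: push the policy to prefer the
    # chosen response over the rejected one, while staying close to a
    # frozen reference model.
    import torch
    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        # Each argument: per-sequence log-probabilities, shape (batch,).
        chosen_ratio = policy_chosen_logps - ref_chosen_logps
        rejected_ratio = policy_rejected_logps - ref_rejected_logps
        # -log sigmoid(beta * margin), averaged over the batch.
        return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

    # Toy usage with random log-probs standing in for real model outputs.
    batch = [torch.randn(4) for _ in range(4)]
    print(dpo_loss(*batch).item())

Here beta controls how strongly the policy is anchored to the reference model, one of the cost/performance trade-offs that post-training recipes must tune.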
Faculty & Mentors involved in this research area:
Iryna Gurevych
Yuxia Wang
Personal Digital Twins

Continuously Learning Agents that Model the User
Personal Digital Twins are AI agents that continuously learn from user interactions to model individual preferences, knowledge and reward signals. Such systems aim to become personalized cognitive extensions of individuals, capturing their knowledge, decision patterns and values to assist, collaborate and act on their behalf.
At INSAIT, we are developing AI You, a demo system that learns to mimic users’ response styles, personal expert knowledge and decision-making patterns over time. Our research focuses on continual learning, long-term memory and personalization under privacy constraints.
The central goal is to address two major bottlenecks in AI agent development: (1) the lack of high-quality, cutting-edge expert knowledge and application-specific optimization data, and (2) the scarcity of reliable reward feedback. Our user-centered paradigm contributes both newly generated domain expertise and personalized reward signals from real individuals.
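As a purely hypothetical sketch of how these two bottlenecks connect, the snippet below logs each interaction together with an explicit user rating, then converts the ratings into (chosen, rejected) preference pairs usable for personalized reward modeling or preference tuning. All class and field names are invented for illustration and do not describe the AI You implementation.

    # Purely hypothetical sketch: log user interactions with explicit
    # ratings, then derive (chosen, rejected) preference pairs from them.
    from dataclasses import dataclass, field

    @dataclass
    class Interaction:
        prompt: str
        response: str
        user_rating: float  # explicit feedback in [-1, 1]: the reward signal

    @dataclass
    class TwinMemory:
        interactions: list[Interaction] = field(default_factory=list)

        def record(self, prompt, response, user_rating):
            self.interactions.append(Interaction(prompt, response, user_rating))

        def preference_pairs(self):
            # Group responses by prompt and pair the best-rated against the
            # worst-rated; such pairs can feed preference tuning (e.g. DPO).
            by_prompt = {}
            for it in self.interactions:
                by_prompt.setdefault(it.prompt, []).append(it)
            for group in by_prompt.values():
                ranked = sorted(group, key=lambda it: it.user_rating, reverse=True)
                if len(ranked) >= 2 and ranked[0].user_rating > ranked[-1].user_rating:
                    yield ranked[0].response, ranked[-1].response

    memory = TwinMemory()
    memory.record("Summarize my meeting notes.", "Bulleted summary ...", 0.9)
    memory.record("Summarize my meeting notes.", "One long paragraph ...", -0.4)
    print(list(memory.preference_pairs()))

In this toy setup, the recorded interactions supply new user-specific data and the ratings supply the reward feedback, mirroring the two contributions of the user-centered paradigm described above.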
Faculty & Mentors involved in this research area:
Yuxia Wang