Research

My research bridges philosophy and computer science, focusing on making AI systems trustworthy, fair, and aligned with human values.

Research Group: RAIME

Since October 2024, I have led the independent research group Responsible AI and Machine Ethics (RAIME) at DFKI, embedded within the Neuro-Mechanistic Modeling department.

RAIME develops approaches for navigating normative uncertainty in the context of AI through practical justifications. We integrate normative reasons and ethical considerations into the development and deployment of AI systems, and into the decision-making of AI agents and agentic AI itself: not merely as a reward-based training signal or as post-hoc constraints, but as a core component of decision-making and agentic architectures. We’re particularly interested in developing the next generation of non-brittle, controllable agents and agentic systems that are reason-sensitive and reason-responsive, i.e., systems that act for the right reasons rather than “just” producing the right outputs.


Research Themes

Effective Human Oversight

What does it mean for humans to genuinely, effectively, and meaningfully oversee AI systems? We argue that simply having a “human in (or on) the loop” is insufficient: overseers must have the capacity, the information, and the authority to actually intervene. Achieving and operationalizing this is far from trivial.

With colleagues at CERTAIN (Center for European Research in Trusted AI) and CPEC, we’ve developed interdisciplinary frameworks for understanding and operationalizing effective oversight, drawing on moral responsibility theory, signal detection theory, legal analysis, and empirical studies of human-AI interaction.

Key publications:


Machine Ethics & AI Alignment

How can we build AI agents that are sensitive and responsive to normative reasons? Our work on machine ethics and AI alignment moves beyond simple rule-following toward architectures in which moral considerations are genuinely integrated into decision-making, touching on questions of AI safety along the way.

The neuro-symbolic GRACE architecture (“Governor for Reason-Aligned Containment”, developed with many colleagues) implements reason-based decision-making in RL agents, addressing what we call the “flattening problem”: the tendency of standard approaches to collapse normative distinctions.
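
To give a flavor of the governor idea, here is a minimal, hypothetical sketch; the names, interfaces, and toy numbers are illustrative assumptions of mine, not the published GRACE implementation. The point is structural: reasons are kept as inspectable objects that a governor can act on, rather than being flattened into a single reward scalar.

```python
# Illustrative sketch only; not the actual GRACE architecture.
from dataclasses import dataclass

@dataclass(frozen=True)
class Reason:
    description: str
    polarity: int    # +1: speaks for the action; -1: speaks against it
    decisive: bool   # a decisive reason against an action vetoes it outright

def normative_reasons(state: str, action: str) -> list[Reason]:
    """Hypothetical reasoner stub; a real system would derive reasons
    from symbolic normative knowledge, not from a lookup table."""
    table = {
        ("door_blocked", "push_person"): [
            Reason("reaches the goal faster", +1, decisive=False),
            Reason("harms a bystander", -1, decisive=True),
        ],
        ("door_blocked", "wait"): [
            Reason("delays the goal", -1, decisive=False),
        ],
    }
    return table.get((state, action), [])

def govern(state: str, candidates: list[str]) -> list[str]:
    """Veto every action that some decisive reason speaks against;
    the learned policy then chooses only among the rest."""
    return [
        action for action in candidates
        if not any(r.decisive and r.polarity < 0
                   for r in normative_reasons(state, action))
    ]

print(govern("door_blocked", ["push_person", "wait"]))  # -> ['wait']
```

Because the governor sees structured reasons rather than a summed score, a weighty-but-not-decisive reason against an action (like the delay above) is not confused with a prohibiting one.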

We also work on the conceptual foundations of AI alignment and AI safety, distinguishing them from adjacent fields (AI ethics, governance) and analyzing different alignment strategies. For a longer exposition of the ideas behind this work, see my essay The Need for Normative World Models for Real Alignment (work in progress).

Key publications:


Algorithmic Fairness & Trustworthy AI

What does “fairness” mean in algorithmic systems, and how can we operationalize it? We take a philosophically informed approach, examining how different fairness metrics encode different normative commitments — and how practitioners should navigate these choices.
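
As a toy illustration of how such metrics can come apart, the self-contained sketch below (my own example, not code from our publications) computes a demographic-parity gap and an equal-opportunity gap on the same hypothetical predictions: equal selection rates satisfy parity while the groups’ true-positive rates still differ.

```python
# Self-contained toy example; data and functions are hypothetical.

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction (selection) rates."""
    def rate(g):
        preds = [p for p, grp in zip(y_pred, group) if grp == g]
        return sum(preds) / len(preds)
    return abs(rate("A") - rate("B"))

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rates (per-group recall)."""
    def tpr(g):
        hits = [p for t, p, grp in zip(y_true, y_pred, group)
                if grp == g and t == 1]
        return sum(hits) / len(hits)
    return abs(tpr("A") - tpr("B"))

# Toy data: group A has a higher base rate of positives than group B.
group  = ["A"] * 4 + ["B"] * 4
y_true = [1, 1, 1, 0,  1, 0, 0, 0]
y_pred = [1, 1, 0, 0,  1, 1, 0, 0]  # both groups: 2 of 4 selected

print(demographic_parity_gap(y_pred, group))         # 0.0    (parity holds)
print(equal_opportunity_gap(y_true, y_pred, group))  # ~0.33  (opportunity violated)
```

Whether equal selection rates or equal error rates matter morally is exactly the kind of normative commitment a metric choice silently makes.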

More broadly, we’re interested in the cluster of properties gathered under the label “Trustworthy AI”. These properties are often in conflict, giving rise to trade-offs and thus to the need for positive normative choices; we ask how they can be meaningfully managed under conditions of normative uncertainty, and ultimately how they can be assessed and certified. With colleagues, I’ve developed the Trustworthiness Assessment Model (TrAM), which structures how the trustworthiness of AI systems can be assessed.

Key publications:


Explainability & Perspicuous Computing

Explainability isn’t just a technical challenge — it’s a normative one. What explanations are owed, to whom, and why? We’ve argued that XAI should be understood through the lens of reason-giving: explanations serve to provide the reasons that justify a system’s outputs.

I’m a member of the Transregional Collaborative Research Centre 248 “Foundations of Perspicuous Software Systems” (CPEC) and remain closely involved with the Explainable Intelligent Systems (EIS) project.

Key publications:


Major Projects

Current:

Past:


PhD Supervision

As a Saarland University Associate Fellow (since December 2025), I have full PhD supervision rights in Computer Science. I welcome inquiries from prospective doctoral students interested in:

  • Machine ethics and normative reasoning in AI
  • AI alignment (technical and philosophical perspectives)
  • Human oversight and human-AI interaction
  • Algorithmic fairness (conceptual and technical)
  • Philosophy of AI

Contact: academia@kevinbaum.de