Scalable Trustworthy AI

Creating scalable and trustworthy AI with human guidance

Overview

AI is no longer a research curiosity. It is reshaping how we live and work. To fully exploit its benefits, we must address critical gaps in trustworthiness.

Current foundation models such as LLMs have critical trustworthiness problems: they hallucinate false information, fail at continual learning, resist knowledge editing (making GDPR compliance impractical), leak private information embedded in their parameters, and require prohibitive compute for training and personalisation. These issues block the widespread adoption of AI and the productivity revolution it promises.

Our approach: Knowledge-Intelligence Separation. Just as the separation of code and data in the 1960s enabled the modern software industry, we believe this separation is the key to unlocking AI’s full potential. When knowledge is stored in interpretable, editable external modules while intelligence (reasoning, generalisation) remains in the model, we enable faster customisation, training data attribution by design, and knowledge editing and unlearning.
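To make the idea concrete, here is a purely illustrative sketch of knowledge-intelligence separation. All names (`KnowledgeStore`, `Fact`, `answer`) are hypothetical and do not correspond to any system in our papers; the point is only that when facts live in an editable external store with provenance, editing, unlearning, and attribution become simple operations rather than retraining problems.

```python
# Toy sketch: knowledge kept outside a (frozen) model.
# Hypothetical names throughout; not an implementation of any published system.
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Fact:
    text: str
    source: str  # provenance, enabling training data attribution by design


@dataclass
class KnowledgeStore:
    """Interpretable, editable knowledge held in an external module."""
    facts: Dict[str, Fact] = field(default_factory=dict)

    def add(self, key: str, text: str, source: str) -> None:
        self.facts[key] = Fact(text, source)

    def edit(self, key: str, new_text: str) -> None:
        # Knowledge editing: update a fact without touching model weights.
        self.facts[key].text = new_text

    def unlearn(self, key: str) -> None:
        # Unlearning (e.g. a GDPR deletion request): remove the fact entirely.
        self.facts.pop(key, None)


def answer(store: KnowledgeStore, key: str) -> Optional[Tuple[str, str]]:
    """Stand-in for the frozen reasoner: retrieve a fact plus its provenance."""
    fact = store.facts.get(key)
    return (fact.text, fact.source) if fact else None


store = KnowledgeStore()
store.add("capital_fr", "Paris is the capital of France.", "wiki:France")
store.edit("capital_fr", "Paris has been the capital of France since 987.")
store.unlearn("capital_fr")  # afterwards, answer() can no longer recall it
```

In a parameter-only model, each of these three operations would instead require gradient-based intervention on the weights; the sketch shows why externalising knowledge makes them cheap and auditable.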

Our work spans a range of interconnected areas.

We are not alone in this effort: many research labs worldwide contribute to Trustworthy AI. Our group distinguishes itself by striving for working solutions that are widely applicable and can be deployed at scale; hence the name Scalable Trustworthy AI. For impact at scale, we commit ourselves to a set of guiding principles.

For prospective students: You might be interested in our internal curriculum and guidelines for a PhD program: Principles for a PhD Program.

Affiliations: KAIST, Tübingen AI Center, University of Tübingen, IMPRS-IS, ELLIS

Members

Seong Joon Oh

Associate Professor

Elisa Nguyen

PhD Student

Arnas Uselis

PhD Student

Sohyung Kim

PhD Student

Stefano Woerner

PhD Student

Yejin Kim

Research Intern

Ankit Sonthalia

PhD Student

Bryan Truong

PhD Student

Lennart Bramlage

Collaborating PhD Student

Jihyeok Jung

MSc Student

Bora Kargi

MSc Student

Philipp Davydov

MSc Student

Seokwon Jung

MSc Student

Luca Füger

MSc Student

Fabian Morelli

MSc Student

Alumni

Elif Akata

PhD Student

Michael Kirchhof

Collaborating PhD Student

Evgenii Kortukov

MSc Student

Johannes Bertram

Research Assistant

Publications

Dynamics Reveals Structure: Challenging the Linear Propagation Assumption

arXiv 2026

CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

ICLR 2026

DISCO: Diversifying Sample Condensation for Efficient Model Evaluation

ICLR 2026

Dr.LLM: Dynamic Layer Routing for LLMs

ICLR 2026

Enhancing Multi-Image Understanding through Delimiter Token Scaling

ICLR 2026

SelfReflect: Can LLMs Communicate Their Internal Answer Distribution?

ICLR 2026

Un-Attributability: Computing Novelty From Retrieval & Semantic Similarity

arXiv 2025

Diffusion Classifiers Understand Compositionality, but Conditions Apply

NeurIPS D&B 2025

On the Rankability of Visual Embeddings

NeurIPS 2025

Overcoming Domain Limitations in Open-vocabulary Segmentation

NeurIPS 2025

Does Data Scaling Lead to Visual Compositional Generalization?

ICML 2025

Do Deep Neural Network Solutions Form a Star Domain?

ICLR 2025

Intermediate Layer Classifiers for OOD Generalization

ICLR 2025

Decoupled Finetuning for Domain Generalizable Semantic Segmentation

ICLR 2025

Are We Done with Object-Centric Learning?

SCSL @ ICLR 2025

DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

arXiv 2025

Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles

SCSL @ ICLR 2025

Playing repeated games with Large Language Models

Nature Human Behaviour 2025

Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks

NeurIPS D&B (Spotlight) 2024

Studying Large Language Model Behaviors Under Realistic Knowledge Conflicts

CoLM 2024

Towards User-Focused Research in Training Data Attribution for Human-Centered Explainable AI

arXiv 2024

Scalable Ensemble Diversification for OOD Generalization and Detection

arXiv 2024

Pretrained Visual Uncertainties

arXiv 2024

A Bayesian Perspective On Training Data Attribution

NeurIPS 2023

Exploring Practitioner Perspectives On Training Data Attribution Explanations

NeurIPS XAI in Action Workshop 2023

ID and OOD Performance Are Sometimes Inversely Correlated on Real-world Datasets

NeurIPS 2023

URL: A Representation Learning Benchmark for Transferable Uncertainty Estimates

NeurIPS D&B 2023

Neglected Free Lunch -- Learning Image Classifiers Using Annotation Byproducts

ICCV 2023

Scratching Visual Transformer's Back with Uniform Attention

ICCV 2023

Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs

ICML 2023

URL: A Representation Learning Benchmark for Transferable Uncertainty Estimates

UAI-EAI Best Student Paper 2023

ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO

ECCV 2022

Dataset Condensation via Efficient Synthetic-Data Parameterization

ICML 2022

Weakly Supervised Semantic Segmentation Using Out-of-Distribution Data

CVPR 2022

Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective

ICLR 2022

Openings

Postdoc

Expectations

We expect postdocs to be independent researchers who supervise multiple research threads in the lab while pursuing their own first-author agenda. We also expect a research vision broader than any single project.

Application process

  1. Send an email to Seong Joon Oh with your CV and research statement attached
  2. Coffee chat with Seong Joon to figure out initial fit
  3. On-site interview day (we support travel for on-site visits; remote interview possible if travel is not feasible)
    • Job talk: Present your prior work to the entire lab (30 minutes + discussion)
    • 1-on-1 interviews: Meet individually with 4+ lab members (on-site: scheduled throughout the day; remote: candidate reaches out to members and arranges meetings individually)
    • Future research discussion: Chat with Seong Joon about research directions at the intersection of your expertise and our vision
    • Lab lunch/dinner: Informal time to get to know the team
  4. Offer

PhD

Expectations

We expect PhD students to run their own first-author projects, with possible collaborations with both senior and junior members inside and outside the lab.

Application process

  1. Send an email to Seong Joon Oh with your CV and research statement attached
  2. Coffee chat with Seong Joon to figure out initial fit
  3. Half-day interview
    • Job talk: Present your prior work to the entire lab (30 minutes + discussion)
    • 1-on-1 interviews: Meet individually with 2 lab members (on-site: scheduled throughout the day; remote: candidate reaches out to members and arranges meetings individually)
  4. Apply to the grad school with Seong Joon Oh’s supervision intent via KAIST Graduate Admissions
  5. Offer

MSc

Expectations

We expect MSc students to run their own first-author projects, with possible collaborations with both senior and junior members inside and outside the lab.

Application process

  1. Send an email to Seong Joon Oh with your CV and research statement attached
  2. Coffee chat with Seong Joon to figure out initial fit
  3. Interview: 30 min + 30 min with Seong Joon
    • First half: Present your prior work (aim for 10 minutes, leaving 20 minutes for discussion)
    • Second half: Discuss future research ideas at the intersection of your expertise and our vision
  4. Apply to the grad school with Seong Joon Oh’s supervision intent via KAIST Graduate Admissions
  5. Offer

Internship (Pre-MSc)

Expectations

We expect interns to participate in a predefined research agenda as a co-author, working closely with their PhD student host.

Application process

  1. Send an email to the relevant PhD student (cc: Seong Joon Oh) with your CV and research statement attached
  2. Coffee chat with the PhD student
  3. Interview: 30 min + 30 min with the PhD student (Seong Joon participates optionally)
    • First half: Present your prior work (aim for 10 minutes, leaving 20 minutes for discussion)
    • Second half: Discuss future research ideas at the intersection of your expertise and our vision
  4. Offer