Mechanistic Interpretability Workshop

This is the archived NeurIPS 2025 site. See the current ICML 2026 workshop.

NeurIPS 2025

Sunday, December 7, 2025
San Diego Convention Center · Room 30A-E

Attended? Give feedback on the workshop!

Call for Papers

EDIT: The call for papers is now closed. Thanks to all for your submissions!

We are inviting submissions of short (max 4 pages) and long (max 9 pages) papers outlining new research, due August 22, 2025 (EDIT: the call for papers is now closed). We welcome all submissions that convincingly argue for why they further the field, i.e. our ability to use the internal states of neural networks to understand them. Submit here.

We are extremely grateful to all who volunteer as reviewers; you can express interest here. We request, but do not require, that (co-)first authors of submitted papers volunteer as reviewers.

Details:

Strong empirical works will either (i) clearly articulate specific falsifiable hypotheses and explain how the evidence provided does and does not support them, or (ii) convincingly demonstrate clear practical benefits over well-implemented baselines.

Works that clearly document the strengths and weaknesses of their evidence, and what we can learn from it, are welcome, even if this weakens the narrative or leaves the conclusions uncertain. Works that downplay or omit significant limitations will not be accepted.

Authors, especially those new to writing mechanistic interpretability papers, may find Neel Nanda’s advice on paper writing a helpful perspective.

Topics of Interest

The field is young, and there are many exciting open questions. We are particularly interested in, but not limited to, the following directions:

Keynote Speakers

Chris Olah

Interpretability Lead and Co-founder, Anthropic

Been Kim

Senior Staff Research Scientist, Google DeepMind

Sarah Schwettmann

Co-founder, Transluce

ICML 2024 Workshop · ICML 2024 Social

The first Mechanistic Interpretability Workshop (ICML 2024).

Organizing Committee

Neel Nanda

Senior Research Scientist, Google DeepMind

Andrew Lee

Postdoc, Harvard University

Andy Arditi

PhD Student, Northeastern University

Jemima Jones

Operations Lead

Stefan Heimersheim

Member of Technical Staff, FAR.AI

Anna Soligo

PhD Student, Imperial

Martin Wattenberg

Professor, Harvard University & Principal Research Scientist, Google DeepMind

Atticus Geiger

Lead, Pr(Ai)²R Group

Julius Adebayo

Founder and Researcher, Guide Labs

Kayo Yin

3rd-year PhD Student, UC Berkeley

Fazl Barez

Senior Research Fellow, Oxford Martin AI Governance Initiative

Lawrence Chan

Researcher, METR

Matthew Wearden

London Director, MATS

Questions? Email neurips2025@mechinterpworkshop.com

Curve detector visualization

What are those beautiful rainbow flower things?

These are visualizations of "curve detector" neurons from early mechanistic interpretability research. Learn more in the Curve Detectors article on Distill.