Call for Papers

We are inviting submissions of short (max 4 pages) and long (max 9 pages) papers outlining new research, due August 22, 2025. We welcome all submissions that convincingly argue that they further the field: i.e. that they advance our ability to use the internal states of neural networks to understand them.

We are extremely grateful to all who volunteer as reviewers; you can express interest here. We request, but do not require, that (co-)first authors of submitted papers volunteer as reviewers.

Details:

Strong empirical works will either (i) clearly articulate specific falsifiable hypotheses, and how the evidence provided does and does not support them; or (ii) convincingly demonstrate clear practical benefits over well-implemented baselines.

Works that clearly document the strengths and weaknesses of their evidence, and what we can learn from it, are welcome, even if this weakens the narrative or leaves the conclusions uncertain. Works that downplay or omit significant limitations will not be accepted.

Authors, especially those new to writing mechanistic interpretability papers, may find Neel Nanda’s advice on paper writing a helpful perspective.

Topics of Interest

The field is young, and there are many exciting open questions. We are particularly interested in, but not limited to, the following directions: