Anthropic open-sources its thought-tracing "AI microscope" tool for graphing internal layers and their meanings

Today, Anthropic open-sourced the method it uses to trace the "thoughts" of language models, so that anyone can build on its research.

From Anthropic's announcement:

Our approach is to generate attribution graphs, which (partially) reveal the steps a model took internally to decide on a particular output. The open-source library we're releasing supports the generation of attribution graphs on popular open-weights models, and a frontend hosted by Neuronpedia lets you explore the graphs interactively.
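To give a feel for what an attribution graph is, here's a minimal conceptual sketch in Python. Everything in it (the node names, the toy attribution weights) is made up for illustration; it is not the circuit-tracer API, just the shape of the data structure: tokens and features as nodes, with directed edges weighted by how much one node's activation contributes to another's.

```python
# Hypothetical sketch of an attribution graph: NOT the circuit-tracer API,
# just an illustration of the underlying data structure.
import networkx as nx

# Toy "features" that might activate on the prompt "The capital of Texas is"
nodes = [
    ("tok:Texas", {"kind": "input"}),
    ("feat:US-state", {"kind": "feature", "layer": 4}),
    ("feat:capital-city", {"kind": "feature", "layer": 9}),
    ("out:Austin", {"kind": "logit"}),
]

# Directed edges weighted by (made-up) attribution scores: how much the
# source node's activation contributes to the target node's activation.
edges = [
    ("tok:Texas", "feat:US-state", 0.82),
    ("feat:US-state", "feat:capital-city", 0.61),
    ("feat:capital-city", "out:Austin", 0.74),
]

g = nx.DiGraph()
g.add_nodes_from(nodes)
g.add_weighted_edges_from(edges)

# Trace the highest-weight path from input token to output logit.
path = nx.dag_longest_path(g, weight="weight")
print(" -> ".join(path))
```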

What’s really cool is that in the circuit tracer you can explore the activations within the layers of an AI model (Google’s Gemma or Anthropic’s Haiku) and see community annotations of the common patterns that seem to give a node its meaning or drive its activity.
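If you want to poke at raw per-layer activations yourself, independent of the circuit tracer, Hugging Face transformers exposes hidden states directly. A minimal sketch, assuming you have torch and transformers installed and access to the Gemma weights (any open-weights causal LM works the same way):

```python
# Minimal sketch: dump per-layer hidden states from an open-weights model.
# Model choice is an assumption; any Hugging Face causal LM behaves the same.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "google/gemma-2-2b"  # swap in any open-weights model you can load
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tok("The capital of Texas is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple: the embedding output plus one tensor per
# transformer layer, each of shape (batch, seq_len, hidden_dim).
for i, h in enumerate(out.hidden_states):
    print(f"layer {i:2d}: mean |activation| = {h.abs().mean():.4f}")
```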

Will you find new ways that AI works to generate the next token, with more planning and deeper reasoning than one might expect?

I guess it’s OpenAI’s turn now! :smirking_face:
