Mapping the Mind of a Large Language Model ( www.anthropic.com ) I often see a lot of people with an outdated understanding of modern LLMs....
Anthropic just published new research that successfully identified and mapped millions of human-interpretable concepts, called “features”, within the neural networks of Claude.