We applied GPT-4 to interpretability, automatically proposing explanations for GPT-2’s 300k neurons, and found neurons responding to concepts like similes, “things done correctly,” or expressions of certainty. We aim to use AI to help us understand AI: openai.com/research/langu… pic.twitter.com/knCUxnL5CY