Today we’re announcing a finding that breaks a core assumption in AI: that bigger models are harder to understand.
We show the opposite. When interpretability is built into training, models become MORE understandable as they become more capable.
00:00








