Saffron Huang (@saffronhuang) / X

Saffron Huang

@saffronhuang

how shall we live together? societal impacts researcher @AnthropicAI • ex @GoogleDeepMind @AISecurityInst • @collect_intel co-founder • views mine

Joined April 2013

Saffron Huang
@saffronhuang
Apr 24, 2023
our @deepmind paper is out in PNAS! we use john rawls' veil of ignorance as a method of choosing principles to align AI 👰‍♀️🤖 participants behind the veil are more likely to choose principles that prioritize the worst off, driven by fairness concerns pnas.org/doi/10.1073/pn…
Saffron Huang
@saffronhuang
Aug 16, 2021
I’m delighted to share Letters to a Young Technologist, an essay collection that has come out of many months of reading, discussing and writing with an incredible, thoughtful group of young technologists. letterstoayoungtechnologist.com
Saffron Huang
@saffronhuang
Dec 5, 2023
im obsessed with this little ~10 yr old boy quizzing his tiny ~5 yr old brother on the train: 10yo: "if you were the prime minister, would you use the country's wealth for yourself or for the country?" 5yo: "country" 10yo: "good... good... i'll make a world leader of you"
Saffron Huang
@saffronhuang
Aug 16, 2024
Life update! I'm joining Anthropic's Societal Impacts team as a research scientist in September. I'll be shifting to a part-time role at @collect_intel, with the amazing @zarinahagnew taking over as research director.
Saffron Huang
@saffronhuang
Apr 3, 2025
What irritates me about the approach taken by the AI 2027 report looking to "accurately" predict AI outcomes is that I think this is highly counterproductive for good outcomes. They say they don't want this scenario to come to pass, but their actions---trying to make scary
Saffron Huang
@saffronhuang
Oct 29, 2024
Fresh job ad for a research engineer on my team (Societal Impacts) at @AnthropicAI -- check this out if you love scalable systems and accelerating research. :) Feel free to ask me any questions.
job-boards.greenhouse.io
Anthropic
Saffron Huang
@saffronhuang
Jul 28, 2020
My piece in @palladiummag 👇 The mythos of attending Harvard involves grand narratives of important contributions and public responsibility. In reality students are largely trained in managerialism and marginal optimisation rather than as “citizen-leaders”
PALLADIUM Magazine
@palladiummag
Jul 27, 2020
Harvard prides itself as the training ground for American elites. But that goal has given way to striving managerialism, myopic career goals, and a stunted appetite for risk. By @saffronhuang palladiummag.com/2020/07/27/har…
Saffron Huang
@saffronhuang
Apr 21, 2025
Really proud and excited to release this work on empirically measuring AI values “in the wild” — understanding, analyzing and taxonomizing what values guide model outputs in real interactions with real users. There is a lot of work on training models to follow particular
Anthropic
@AnthropicAI
Apr 21, 2025
New Anthropic research: AI values in the wild. We want AI models to have well-aligned values. But how do we know what values they’re expressing in real-life conversations? We studied hundreds of thousands of anonymized conversations to find out.
Saffron Huang
@saffronhuang
Dec 8, 2020
Personal update: I just started my first full-time job as a research engineer at @DeepMind. Interning here on the multi-agent team last year was fascinating and taught me a great deal, and I am beyond excited to get back into things!
Saffron Huang
@saffronhuang
Jun 6, 2025
I updated my personal website! I felt like it was pretty hard to explore before, and I wanted to actually properly highlight the work/ideas that I want people to read and that I stand behind. Will keep tweaking, but have a look. :) saffronhuang.com
Saffron Huang
@saffronhuang
Sep 14, 2020
Two days ago I almost drowned in a sensory deprivation flotation tank when my hair was sucked into the filtration system. I wasn’t told about much of the actual float procedure nor an emergency button. No one meant to check in on me. saffronhuang.com/post/a-sensory…
Saffron Huang
@saffronhuang
Oct 17, 2025
this is a really cool role!!! if you're a full stack SWE who cares about HCI / human agency / education APPLY!!!!!!
job-boards.greenhouse.io
Anthropic
Saffron Huang
@saffronhuang
Sep 5, 2024
thrilled to be on the 2024 #Time100AI list! thanks @TIME for recognizing @collect_intel's work, and to @divyasiddarth for your partnership and brilliance <3
TIME100 AI 2024: Saffron Huang and Divya Siddarth
From time.com
Saffron Huang
@saffronhuang
Dec 12, 2021
she’s gorgeous @crypto_coven