our @deepmind paper is out in PNAS!
we use john rawls' veil of ignorance as a method of choosing principles to align AI 👰‍♀️🤖
participants behind the veil are more likely to choose principles that prioritize the worst off, driven by fairness concerns
pnas.org/doi/10.1073/pn…
Saffron Huang
how shall we live together?
societal impacts researcher @AnthropicAI • ex @GoogleDeepMind @AISecurityInst • @collect_intel co-founder • views mine
- I’m delighted to share Letters to a Young Technologist, an essay collection that has come out of many months of reading, discussing and writing with an incredible, thoughtful group of young technologists. letterstoayoungtechnologist.com
- im obsessed with this little ~10 yr old boy quizzing his tiny ~5 yr old brother on the train: 10yo: "if you were the prime minister, would you use the country's wealth for yourself or for the country?" 5yo: "country" 10yo: "good... good... i'll make a world leader of you"
- Life update! I'm joining Anthropic's Societal Impacts team as a research scientist in September. I'll be shifting to a part-time role at @collect_intel, with the amazing @zarinahagnew taking over as research director.
- What irritates me about the approach taken by the AI 2027 report looking to "accurately" predict AI outcomes is that I think this is highly counterproductive for good outcomes. They say they don't want this scenario to come to pass, but their actions---trying to make scary
- Fresh job ad for a research engineer on my team (Societal Impacts) at @AnthropicAI -- check this out if you love scalable systems and accelerating research. :) Feel free to ask me any questions.
- My piece in @palladiummag 👇 The mythos of attending Harvard involves grand narratives of important contributions and public responsibility. In reality students are largely trained in managerialism and marginal optimisation rather than as “citizen-leaders”.
  Harvard prides itself as the training ground for American elites. But that goal has given way to striving managerialism, myopic career goals, and a stunted appetite for risk. By @saffronhuang palladiummag.com/2020/07/27/har…
- Really proud and excited to release this work on empirically measuring AI values “in the wild”: understanding, analyzing and taxonomizing what values guide model outputs in real interactions with real users. There is a lot of work on training models to follow particular
  New Anthropic research: AI values in the wild. We want AI models to have well-aligned values. But how do we know what values they’re expressing in real-life conversations? We studied hundreds of thousands of anonymized conversations to find out.
- Personal update: I just started my first full-time job as a research engineer at @DeepMind. Interning here on the multi-agent team last year was fascinating and taught me a great deal, and I am beyond excited to get back into things!
- I updated my personal website! I felt like it was pretty hard to explore before, and I wanted to actually properly highlight the work/ideas that I want people to read and that I stand behind. Will keep tweaking, but have a look. :) saffronhuang.com
- Two days ago I almost drowned in a sensory deprivation flotation tank when my hair was sucked into the filtration system. I wasn’t told about much of the actual float procedure, nor about an emergency button. No one was meant to check in on me. saffronhuang.com/post/a-sensory…
- this is a really cool role!!! if you're a full stack SWE who cares about HCI / human agency / education APPLY!!!!!!
- thrilled to be on the 2024 #Time100AI list! thanks @TIME for recognizing @collect_intel's work, and to @divyasiddarth for your partnership and brilliance <3








