Hannah Rose Kirk (@hannahrosekirk) / X

Hannah Rose Kirk

648 posts

@hannahrosekirk

AI researcher trying to make sense of the cyberspace 🤖 Workstream Lead @AISecurityInst. Uni of Ox PhD @oiioxford & Prev @Cambridge_Uni.

Joined June 2012

Pinned
Hannah Rose Kirk
@hannahrosekirk
Dec 11, 2024
A real honour and career dream that PRISM has won a @NeurIPSConf best paper award! 🌈 One year ago I was sat in a 13,000+ person audience of NeurIPS '23 having just finished data collection. Safe to say I've gone from feeling #stressed to very #blessed 😁
NeurIPS Conference
@NeurIPSConf
Dec 11, 2024
Announcing the NeurIPS 2024 Best Paper Awards: blog.neurips.cc/2024/12/10/ann…
80K
Hannah Rose Kirk
@hannahrosekirk
Nov 23, 2022
Gutted @DeepMind has paused internship cycles for this year (espesh after making it to final round🥲) Internships are a great way for early stage academics to get a feel for industry - the tech hiring freeze is hitting hard😑 Buuut I'm now back on the market so hmu with ideas👀😅
Hannah Rose Kirk
@hannahrosekirk
Apr 25, 2024
Today we're launching PRISM, a new resource to diversify the voices contributing to alignment. We asked 1500 people around the world for their stated preferences over LLM behaviours, then we observed their contextual preferences in 8000 convos with 21 LLMs arxiv.org/abs/2404.16019
120K
Hannah Rose Kirk
@hannahrosekirk
Apr 21, 2024
Determined to write my entire PhD in this sunny corner in my slippers 🤘😎
58K
Hannah Rose Kirk
@hannahrosekirk
Aug 23, 2025
Listen up all talented early-stage researchers! 👂🤖 We're hiring for a 6-month residency in my team at @AISecurityInst to assist cutting-edge research on how frontier AI influences humans! It's an exciting & well-paid role for MSc/PhD students in ML/AI/Psych/CogSci/CompSci 🧵
32K
Hannah Rose Kirk
@hannahrosekirk
May 28, 2025
Why do human–AI relationships need socioaffective alignment? As AI evolves from tools to companions, we must seek systems that enhance rather than exploit our nature as social & emotional beings. Published today in @Nature Humanities & Social Sciences! nature.com/articles/s4159…
20K
Hannah Rose Kirk
@hannahrosekirk
Apr 23, 2024
Published in Nature Machine Intelligence today, our new article explores the trade-offs of personalised alignment in large language models ⚖️ Personalisation has potential to democratise decisions over how LLMs behave, but brings its own set of risks... nature.com/articles/s4225…
34K
Hannah Rose Kirk
@hannahrosekirk
Nov 21, 2024
Research Question 1: Who's going to @NeurIPSConf ? Research Question 2: Who wants to come to the inaugural @aisafetyinst party? 👀
23K
Hannah Rose Kirk
@hannahrosekirk
Oct 21, 2024
3 Truths (& No Lies): 1. It's my bday today!🍰 2. I'm now a Research Scientist at @AISafetyInst evaluating psychological & social capabilities of frontier AI🤖 3. Still chippin away at my PhD (year 4!) but proud that last year I hit 1,000+ citations & 10,000+ dataset downloads😎
16K
Hannah Rose Kirk
@hannahrosekirk
Aug 2, 2022
🗣️Interested in football? 👀 We (@turinginst x @Ofcom) have analysed over 2 mil tweets from the 21-22 English Premier League season⚽ We created state-of-the-art AI to detect and track abuse towards players 🤖 Here's a 🧵on our methods and findings tinyurl.com/35n9enyn
Hannah Rose Kirk
@hannahrosekirk
Feb 17, 2023
✨New preprint (w/ Jakob Mökander, @jonasschuett and @floridi) ✨ In this paper, we propose a policy framework for auditing LLMs by breaking down responsibilities at the governance-, model- and application-level. arxiv.org/abs/2302.08500 🧵
29K
Hannah Rose Kirk
@hannahrosekirk
Sep 26, 2024
Wahoo PRISM will officially be taking a trip to @NeurIPSConf this year as an oral presentation 🤩 (and my first ever 10/10 in a conference review process 🤯)
29K
Hannah Rose Kirk
@hannahrosekirk
May 16, 2022
🚨 New paper and datasets! 🚨 After sitting on my hands for many months 😬 I'm delighted that our #Hatemoji paper is going to @naaclmeeting! 😍🤩😎🆒 In a nutshell 🥜 it uses human-and-model-in-the-loop learning 🤖🤝🙆 to tackle emoji-based hate. A 🧵 on all our new resources 1/
Hannah Rose Kirk
@hannahrosekirk
Jun 12, 2024
🌎Introducing LINGOLY, our new reasoning benchmark that stumps even top LLMs (best models only reach ~35% accuracy)🥴 In a collab between @UniofOxford, @Stanford and UK Linguistic Olympiad puzzle authors, we stress test LLMs on over 90 low-resource and extinct languages...
23K