Gabriel Mukobi (@gabemukobi) / X

512 posts

Gabriel Mukobi

@gabemukobi

AI security researcher. Opinions are my own.

San Francisco, CA

gabrielmukobi.com

Joined September 2017

Gabriel Mukobi
@gabemukobi
Jul 27, 2024
🦅I'm elated to join the technical team at the U.S. AI Safety Institute @NIST! AISI is a diverse team of experts in AI/ML, tech policy, and more, and I feel we have a fantastic opportunity to help the United States lead on the science, standards, and coordination of AI safety.
16K
Gabriel Mukobi
@gabemukobi
Apr 20, 2024
Proud to start this month as a research fellow at 🟪@RANDCorporation to advance technical AI governance and in the fall as a CS PhD student at 🐻@UCBerkeley advised by @JacobSteinhardt and @dawnsongtweets! 🏛️I'm also in Washington, DC, until late August if anyone wants to meet!
5.2K
Gabriel Mukobi
@gabemukobi
Nov 27, 2023
Replying to @DisruptiveBytes and @stats_feed
Was this written by a language model?
2.6K
Gabriel Mukobi
@gabemukobi
Jul 23, 2024
🧑🏽‍💻I started a personal blog (separate from my AI strategy blog)! The first post is on "ML Safety Research Advice," or advice for careers in empirical ML research that might help AI safety. Also aimed at AI governance researchers who want to learn more about ML safety. 1/3
1.7K
Gabriel Mukobi
@gabemukobi
Aug 5, 2024
📝New blog post, "Four Phases of AGI," on my personal AI strategy blog! In this post, I propose a framework for thinking about AGI progression and its implications for #AIGovernance. Check it out!⤵️ 1/15
1.6K
Gabriel Mukobi
@gabemukobi
Feb 5, 2024
Replying to @ToughSf
Uhhh does this detonate the bombs too?
11K
Gabriel Mukobi
@gabemukobi
May 22, 2024
🛡️AI risk management needs defense in depth--not just guardrails or controlling access to frontier models, but also societal adaptation. I'm excited for our new paper to contribute to this landscape and for follow-up research and policy to help society adapt to advanced AI!
Markus Anderljung
@Manderljung
May 22, 2024
Increasingly advanced AI systems will diffuse into society. How do we manage the accompanying risks? In our paper, we explore Societal Adaptation to Advanced AI: reducing harm from diffusion of AI capabilities by intervening to avoid, defend against, and remedy harmful use.
1.2K
Gabriel Mukobi
@gabemukobi
Aug 2, 2024
Happy to have contributed to the Safetywashing paper! We can and should scientifically assess how well safety benchmarks track safety goals 📈
Dan Hendrycks
@hendrycks
Aug 1, 2024
Do AI safety benchmarks actually measure safety progress? We find ~50% do not, showing safety research is fairly dysfunctional. We hope this work replaces vague arguments with scientific analysis to determine if a line of research makes DL systems safer. arxiv.org/abs/2407.21792
917
Gabriel Mukobi
@gabemukobi
Jan 10, 2024
⚔️📈 Super excited to finally release our paper "Escalation Risks from Language Models in Military and Diplomatic Decision-Making" on arXiv!
Max Lamparth
@MLamparth
Jan 10, 2024
Do LLMs lead to more escalation in high-stakes international and military decision-making? Our new paper studies five off-the-shelf models and their behavior as autonomous agents in real-world conflict scenarios! A🧵
arxiv.org
Escalation Risks from Language Models in Military and Diplomatic...
Governments are increasingly considering integrating autonomous AI agents in high-stakes military and foreign-policy decision-making, especially with the emergence of advanced generative AI models...
1.5K
Gabriel Mukobi
@gabemukobi
Dec 11, 2023
Ready for NeurIPS! 🔥🌱🌊 Lmk if you want to meet up this week!
848
Gabriel Mukobi
@gabemukobi
Jul 23, 2024
📜Delighted to have contributed to "Open Problems in Technical AI Governance!" Pure ML safety and alignment may look pretty doomed these days, but ML, hardware, and security researchers can instead contribute to hundreds of open technical questions that improve AI governance!
Anka Reuel | @ankareuel.bsky.social
@AnkaReuel
Jul 23, 2024
Our new paper "Open Problems in Technical AI Governance" led by @ben_s_bucknall & me is out! We outline 89 open technical issues in AI governance, plus resources and 100+ research questions that technical experts can tackle to help AI governance efforts🧵 t.ly/Y-mQ1
895
Gabriel Mukobi
@gabemukobi
May 15, 2024
Replying to @MichaelTrazzi
Lol I wish I was that good at prediction and not just unfortunate luck 🙃
1.7K
Gabriel Mukobi
@gabemukobi
Jun 10, 2024
Super excited for our paper to contribute to the emerging research landscape around a scientific, transparent, and predictable understanding of AI model evaluations! 📈
Rylan Schaeffer
@RylanSchaeffer
Jun 10, 2024
❤️‍🔥❤️‍🔥Excited to share our new paper ❤️‍🔥❤️‍🔥 **Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?** w/ @haileysch__ @BrandoHablando @gabemukobi @varunrmadan @herbiebradley @ai_phd @BlancheMinerva @sanmikoyejo arxiv.org/abs/2406.04391 1/N
1.5K
Gabriel Mukobi
@gabemukobi
Dec 23, 2018
I got an offer from Google, and I took it. I'm elated to be interning as a Googler in Engineering Practicum this summer! instagram.com/p/BruKVPjHoLox…