Cofounder and Chief Scientist at Sequent Research. Alignment will be solved, but not necessarily in time. Previously AISI, DeepMind, OpenAI, Google Brain, etc.
We are starting a new, nonprofit alignment organization, ⊢ Sequent Research, bringing together researchers previously on UK AISI’s Alignment Team, Timaeus, and elsewhere to research how to align superintelligence. We are hiring! 🧵
I have no details of OpenAI's Board’s reasons for firing Sam, and I am conflicted (lead of Scalable Alignment at Google DeepMind). But there is a large, very loud pile-on against people I respect, in particular Helen Toner and Ilya Sutskever, so I feel compelled to say a few things.
Third, my prior is strongly against Sam after working for him for two years at OpenAI:
1. He was always nice to me.
2. He lied to me on various occasions.
3. He was deceptive, manipulative, and worse to others, including my close friends (again, only nice to me, for reasons)
...but that seems inconsistent with the thoughtfulness of the people involved. At this point, mostly what feels right is withholding complete judgment until more information lands. I appreciate that will seem unreasonable to a lot of people, but it’s the best suggestion I have.
The “always nice to me” part puts me in a weird situation: most of the negative stuff I know about happened to others, and even when he lied to me it was about other people to try to drive wedges. So it feels rough for me to unilaterally share it.
Second, my prior is strongly in favor of Helen and Ilya (I know the other two less well). Ilya is a terrific ML researcher, I had a front-row seat for two years as his views on safety improved, and my sense from others is that they kept improving after I left in 2019.
Helen is a terrific strategic and generalist researcher, is extremely thoughtful, and my interactions since I met her in 2018 have been uniformly great. She took the board seat with reservations...
As I mentioned, I don’t have any details of the Board’s reasons for their action, besides what is publicly known. I am very confused about what happened on Friday and over the weekend: from public information it seems like big tactical and strategic mistakes were made...
...I advised against it at the time as I thought the governance structure was broken, but she and other people I trust felt it was important to have more independent members on the Board (I’m not yet sure who was right in hindsight).
I am happy to announce that I will be joining the UK AI Safety Institute (AISI) soon as a Research Director. Over 2023 I have been very impressed with the progress made by the UK via the AI Safety Institute and AI Safety Summit, and I am excited to join the team!
1/ The AI Safety Institute has been in operation for almost eight months and I'm excited to announce some huge new hires. We have begun pre-deployment testing for potentially harmful capabilities on advanced AI systems. This is our third progress report: gov.uk/government/pub…
New alignment theory paper! We present a new scalable oversight protocol (prover-estimator debate) and a proof that honesty is incentivised at equilibrium (with large assumptions, see 🧵), even when the AIs involved have similar available compute.
Here’s a thread about why I joined the UK AI Safety Institute (@aisafetyinst) as a Research Director for technical safety and why I think other technical folks should consider roles here (aisi.gov.uk/careers):
Yoshua Bengio is looking for theory folk to join him to work on Bayesian approaches to AGI safety. I think this is a great opportunity: I've quite enjoyed the theory discussions I've had with Yoshua so far, and would love more work in this direction.