benthamite🔸
557 posts
push-pin = poetry. Effective Altruist.
Joined January 2020
- If you can get a better score than our human subjects did on any of these evals, send it to me and we will fly you out for an onsite interview.
  - Quoted post: How close are current AI agents to automating AI R&D? Our new ML research engineering benchmark (RE-Bench) addresses this question by directly comparing frontier models such as Claude 3.5 Sonnet and o1-preview with 50+ human experts on 7 challenging research engineering tasks.
- Plant-based chicken nuggets now outperform animal chicken nuggets in blind taste tests
- On Humanity's Last Exam, the justification for the answer to the first question being correct is that an AI says it's correct
- I've written a post about waves of #EffectiveAltruism, and how we might be arguably entering a third wave
- The reports of my death have been greatly exaggerated.
  - Quoted post: It's too bad "effective altruism" has been destroyed as a brand by SBF, because altruism is important and "effective" is probably the most important adjective you could qualify "altruism" with.
- I'm organizing a debate about whether we should push for a pause on AI development, with takes from the following people:
- YES I KNOW THAT HE'S MY EX BUT CAN'T TWO PEOPLE RECONNECT I ONLY SEE HIM AS A FRIEND i tripped and fell into his corporate restructuring
  - Quoted post: Breaking: OpenAI board in discussions with Sam Altman to return as CEO trib.al/DTxl5tC
- As the head of the Center for Effective Altruism, I was dismayed to learn that some EAs go by their initials. I am hereby requiring everyone who cares about their impact to adopt a style of first name, last initial, all lowercase.
- This is also the only reason people disagree with me.
  - Quoted post: Decel counter-arguments so far: "Beff Jezos uses big words and has big muscles, ergo, he must be wrong!"















