Tim Hua 🇺🇦
9,065 posts
AI safety, Econ, new liberalism, math, and a bit of art history (as a treat)
Behavioral evaluations @TransluceAI. Prev Astra, MATS & Walmart's Econ Team
- Problem: AIs can detect when they are being tested and fake good behavior. Can we suppress the “I’m being tested” concept & make them act normally? Yes! In a new paper, we show that subtracting this concept vector can elicit real-world behavior even when normal prompting fails.
- Replying to @saulmunn: e.g., I know I'm going to London but I might still be coordinating with others on specifics, pretty sure one flight would be good but fare lock just in case. Although sometimes I'm just anxious that I might fuck it up somehow and get the fare lock to reduce the friction of buying
- If you liked my 50 page Neel Nanda MATS application, you'll love my 59 page Neel Nanda MATS paper. Quoted post: My fifty page neel nanda mats application will fr be one in the history records (I did not have time to format some super simple graphs and tables and just stacked them one after another. There's 16 pages of appendix that's just simple graphs and screenshots)
- Wait, I thought it was actually less than 0.5% globally, but it's more! 3% of the world earn more than 100k a year! That's 240 million rich people, pretty good. Quoted post: If you’re making 100k at 27, you’re: > doing better than the median Ivy League graduate > In the top 10% of income nationwide > In the top 0.5% of income globally
- Everything the light touches is economics. Quoted post: When so many ask, “How is this economics?”, it’s nice when the QJE answers, “yes this is economics.” Economics, like early hominids, thrives by aggressively expanding its territory.
- Replying to @elymitra_: Oh my god it's actually published doi.org/10.1111/ecin.1…
- Happy to announce that I've just started MATS working with Neel Nanda and Sam Marks as my mentors :) Quoted post: My fifty page neel nanda mats application will fr be one in the history records (I did not have time to format some super simple graphs and tables and just stacked them one after another. There's 16 pages of appendix that's just simple graphs and screenshots)
- Replying to @chromakode and @xkcd: Come on I had to click that and you know it
- Replying to @AndyMasley: But gender polarization started way before short form videos???
- If Eliezer is serious about getting people to preorder his book he should include the HPMOR Epilogue as a preorder-only perk. Quoted post: Nate Soares and I are publishing a traditional book: _If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All_. Coming in Sep 2025. You should probably read it! Given that, we'd like you to preorder it! Nowish!












