Follow-up to this post, wherein I was shocked to find that Claude Code failed to do a low-context task which took me 4 hours and involved some skills at which I expected it to have significant advantages [1].
I kept going to see if Claude Code could eventually succeed. What happened instead was that it built a very impressive-looking 4000 LOC system to extract type and dependency injection information from my entire codebase and dump it into a SQLite database.
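To give a sense of what a tool like that involves, here's a minimal sketch of the kind of SQLite schema and loader it could be built around. Everything here (the table names, columns, and the `record_binding` helper) is my own invention for illustration, not Claude's actual code.

```python
import sqlite3

# Illustrative schema only: the real tool's tables and columns were Claude's
# own design and almost certainly differ from this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS types (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    module TEXT NOT NULL,
    UNIQUE(name, module)
);
CREATE TABLE IF NOT EXISTS di_bindings (
    id INTEGER PRIMARY KEY,
    consumer_id INTEGER NOT NULL REFERENCES types(id),
    dependency_id INTEGER NOT NULL REFERENCES types(id),
    injection_site TEXT  -- e.g. a constructor parameter or provider name
);
"""

def record_binding(conn: sqlite3.Connection,
                   consumer: tuple[str, str],
                   dependency: tuple[str, str],
                   site: str) -> None:
    """Insert one 'consumer depends on dependency' edge, creating type rows as needed."""
    def type_id(name: str, module: str) -> int:
        conn.execute(
            "INSERT OR IGNORE INTO types (name, module) VALUES (?, ?)", (name, module)
        )
        row = conn.execute(
            "SELECT id FROM types WHERE name = ? AND module = ?", (name, module)
        ).fetchone()
        return row[0]

    conn.execute(
        "INSERT INTO di_bindings (consumer_id, dependency_id, injection_site) "
        "VALUES (?, ?, ?)",
        (type_id(*consumer), type_id(*dependency), site),
    )

if __name__ == "__main__":
    conn = sqlite3.connect("di_graph.db")
    conn.executescript(SCHEMA)
    # Hypothetical example edge: OrderService takes a PaymentClient in its constructor.
    record_binding(conn, ("OrderService", "app.orders"),
                   ("PaymentClient", "app.payments"), "constructor")
    conn.commit()
```

Once the whole dependency graph lives in a database like this, "what can reach what" questions become plain SQL queries, which is presumably what made the tool fun to poke at for two days.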
To my shock, the tool Claude built [2] actually worked. I ended up playing with it for two days, uncovering and ticketing all sorts of bugs in the codebase I hadn't been aware of. And then I realized that the bugs I was uncovering weren't of the type I was actually looking for in the task I was immediately trying to do, and that if I wanted bugs to fix, we already have a backlog that we're not going to get through any time soon no matter how much AI help we have.
So anyway, Claude was able to do a reasonable job of figuring out what endpoint sequences could cause an issue. It struggled to figure out how to invoke the framework to make a mock HTTP request [3], but once it had a template to work off of (roughly the shape of the sketch below), it was able to make good progress.
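My actual framework and endpoints look nothing like this, but as an illustration of the kind of template involved, here's roughly what "a mock HTTP request exercising an endpoint sequence" looks like in a Python/FastAPI setup. The app, routes, and assertion are all invented for the example.

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient

# Stand-in app with two endpoints whose ordering matters; the real codebase,
# framework, and routes are different -- this only shows the shape of the template.
app = FastAPI()
_state: dict[str, str] = {}

@app.post("/orders/{order_id}/reserve")
def reserve(order_id: str):
    _state[order_id] = "reserved"
    return {"status": "reserved"}

@app.post("/orders/{order_id}/cancel")
def cancel(order_id: str):
    # Hypothetical bug to expose: cancel never checks that a reservation exists.
    _state[order_id] = "cancelled"
    return {"status": "cancelled"}

def test_cancel_before_reserve_is_rejected():
    """A 'successfully failing' test: it encodes the behavior we want and fails today."""
    client = TestClient(app)
    resp = client.post("/orders/abc/cancel")  # cancel with no prior reserve
    assert resp.status_code == 409  # fails against the buggy stand-in above
```

The value of the template is that once the test-client boilerplate exists, each new case is just a different sequence of requests plus an assertion that the bad outcome is rejected.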
On what I expected to be the hardest part of the task, Claude actually did quite well once it had a template to work off of. It was able (with some re-prompting when it declared the task finished early) to write successfully-failing tests for 5 of the 7 cases I had managed to write a test for myself, as well as for one of the four "I'm pretty sure this is an issue but I can't figure out how to expose it" cases. I also picked up a few tricks of my own from seeing how Claude tackled a couple of the cases.
All that said, I definitely came away from this experiment with a strong intuition for exactly how it could take 20% longer to do things when you have LLM coding agents assisting you.
1. The specific task was programmatically checking through an ent