2,376 posts

Shalev

@Shalev_lif

do androids dream of electric sheep? building something new, prev @VectorInst @UofT | co-creator of STEVE-1, Multi-Agent Verification

Joined September 2017

Shalev
@Shalev_lif
Dec 15, 2024
Best poster moment at #NeurIPS2024
377K
Shalev
@Shalev_lif
Jun 28, 2025
The neural network objective function is a very complicated objective function. It's very non convex, and there are no mathematical guarantees whatsoever about its success. And so if you were to speak to somebody who studies optimization from a theoretical point of view, they
204K
Shalev
@Shalev_lif
Jul 19, 2025
Replying to @NTFabiano
This research was funded by the international organization of first born children
17K
Shalev
@Shalev_lif
Jan 25, 2025
A new replication of DeepSeek's RL results! Here are my notes and some quick thoughts:
Method:
- Uses PPO instead of GRPO (DeepSeek-R1), still works
- Data is 8K (query, final answer) examples from MATH
- Rule-based reward modelling (no neural reward)
- Initialize model to
Junxian He
@junxian_he
Jan 25, 2025
We replicated the DeepSeek-R1-Zero and DeepSeek-R1 training on 7B model with only 8K examples, the results are surprisingly strong. 🚀 Starting from Qwen2.5-Math-7B (base model), we perform RL on it directly. No SFT, no reward model, just 8K MATH examples for verification, the
44K
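A rough illustration of what "rule-based reward modelling (no neural reward)" on (query, final answer) pairs could look like. This is a minimal sketch, not the replication's actual code: the post does not state the reward rule, so the exact-match comparison, the `\boxed{...}` answer convention, and the function names `extract_final_answer` and `rule_based_reward` are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in a completion, if any.

    Assumption: the model marks its final answer with \\boxed{...}.
    Nested braces are not handled in this sketch.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def rule_based_reward(completion: str, reference: str) -> float:
    """Hypothetical reward: 1.0 on an exact string match with the reference
    final answer, 0.0 otherwise. No learned reward model is involved."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0
```

A reward like this needs only the reference final answer for each query, which fits the (query, final answer) data format mentioned in the notes.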
Shalev
@Shalev_lif
Oct 23, 2024
Great to see @geoffreyhinton at the @VectorInst office today. Here @michaelrzhang is presenting his work on qualitative eval of LLMs! Very cool to have a Nobel Laureate + Turing Award winner around the campus.
31K
Shalev
@Shalev_lif
Dec 13, 2024
My question to @ilyasut at NeurIPS 2024: Do LLMs generalize multi-hop reasoning out-of-distribution?
55K
Shalev
@Shalev_lif
Feb 28, 2025
Hot off the Servers 🔥💻 --- we’ve found a new approach for scaling test-time compute! Multi-Agent Verification (MAV) scales the number of verifier models at test-time, which boosts LLM performance without any additional training. Now we can scale along two dimensions: by
46K
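To make the idea of scaling the number of verifiers at test time concrete, here is a minimal sketch under stated assumptions. The post is cut off before it finishes describing the method, so this is not the MAV algorithm itself: it assumes each verifier returns a binary approve/reject decision and that the candidate with the most approvals is selected. The names `Verifier` and `select_by_verifier_votes` are hypothetical.

```python
from typing import Callable, Sequence

# Assumption: a verifier takes (question, candidate answer) and approves or rejects.
Verifier = Callable[[str, str], bool]


def select_by_verifier_votes(
    question: str,
    candidates: Sequence[str],
    verifiers: Sequence[Verifier],
) -> str:
    """Return the candidate approved by the most verifiers.

    Ties go to the earliest candidate. Nothing is trained here: adding
    verifiers only adds inference-time calls.
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    def approvals(candidate: str) -> int:
        return sum(1 for verify in verifiers if verify(question, candidate))

    return max(candidates, key=approvals)
```

In this sketch the verifiers are plain functions; in practice each one would presumably be a call to a separate verifier model.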
Shalev
@Shalev_lif
Dec 16, 2024
Absolutely stacked panel at the System-2 Reasoning at Scale workshop at NeurIPS with Josh Tenenbaum, @MelMitchell1, @fchollet, @jaseweston, @DBahdanau, @dawnsongtweets, and @Yoshua_Bengio (with @nouhadziri moderating). An amazing end to the conference. Will add notes below.
39K
Shalev
@Shalev_lif
Jun 8, 2025
Replying to @ns123abc
Uh, she’s a third-year PhD. Many of the most influential papers in AI have been written by PhD students… Also, a paper is written by a team. The first author usually did most of the actual work during the project (i.e., writing the code, running the experiments, etc.).
5.8K
Shalev
@Shalev_lif
Feb 20, 2024
Replying to @karpathy
This reminds me of a meme @_jasonwei posted a while back! That is, once you play with these models so much you kind of develop your own mini test suite to gain intuition of its performance.
23K
Shalev
@Shalev_lif
Dec 13, 2024
@ilyasut giving a talk at the NeurIPS 2024 Test of Time awards! Will add more photos below, throughout the talk.
9.9K
Shalev
@Shalev_lif
Dec 16, 2024
Replying to @Shalev_lif
See the full paper by @shreyaskapur, @jenner_erik, and Stuart Russell!
arxiv.org
Diffusion On Syntax Trees For Program Synthesis
Large language models generate code one token at a time. Their autoregressive generation process lacks the feedback of observing the program's output. Training LLMs to suggest edits directly can...
11K
Shalev
@Shalev_lif
Dec 22, 2024
In a few years PhDs won’t be coding much. They’ll have a fleet of agents coding up, running, and tuning their experiments. At that time, the most valuable skill will be deep expertise, as suggested by @RogerGrosse.
Dan Roy
@roydanroy
Dec 21, 2024
Replying to @roydanroy and @tunguz
In all seriousness, PhDs today will have tools so powerful that previous generations won’t know what to think of them. I think it is the most exciting time to be working. Just don’t work in an old way.
21K
Shalev
@Shalev_lif
Sep 21, 2023
🥳 Great news! Our paper STEVE-1 has been accepted at #NeurIPS 2023 as a spotlight! I'm so proud to have worked on this project with my amazing collaborators @keirp1 @SirrahChan @jimmybajimmyba @SheilaMcIlraith ✈️ Very excited to present our work in New Orleans! ✈️ Project
19K