Zack Ankner
365 posts
Zack Ankner
@ZackAnkner
Prev @MIT.
zackankner.com
Joined September 2019
472
Following
1,401
Followers
  • Pinned
    Zack Ankner
    @ZackAnkner
    Aug 22, 2024
    Excited to announce our new work: Critique-out-Loud (CLoud) reward models. CLoud reward models first produce a chain of thought critique of the input before predicting a scalar reward, allowing reward models to reason explicitly instead of implicitly! arxiv.org/abs/2408.11791
    71K
  • Zack Ankner
    @ZackAnkner
    Jun 3, 2024
    New paper where we explore using a small LM’s perplexity to prune the pretraining data for larger LMs. We find that small LMs can prune data for up to 30x larger LMs, data pruning works in the overtrained and data-constrained regimes, and more!
    arxiv.org
    Perplexed by Perplexity: Perplexity-Based Data Pruning With Small...
    In this work, we investigate whether small language models can determine high-quality subsets of large-scale text datasets that improve the performance of larger language models. While existing...
    73K
  • Zack Ankner
    @ZackAnkner
    Sep 4, 2023
    My EMNLP paper got desk-rejected post-rebuttal because I posted it to arxiv 25 minutes after the anonymity deadline. I was optimistic about our reviews, so I spent a whole week while visiting my family writing rebuttals and coding experiments to respond.
    Naomi Saphra
    @nsaphra
    Sep 4, 2023
    Just got a desk reject, post-rebuttals, for a paper being submitted to arxiv <30 min late for the anonymity deadline. I talk about how the ACL embargo policy hurts junior researchers and makes ACL venues less desirable for NLP work. I don’t talk about the pointless NOISE it adds.
    105K
  • Zack Ankner
    @ZackAnkner
    Feb 9, 2024
    Excited to announce Hydra decoding! 🚀 We introduce sequential dependence in Medusa decoding and achieve up to a 1.31x and 2.71x improvement in throughput as compared to Medusa and baseline decoding! Paper: arxiv.org/abs/2402.05109 Github: github.com/zankner/Hydra
    21K
  • Zack Ankner
    @ZackAnkner
    Nov 7, 2023
    arxiv.org/abs/2310.01889 "Ring Attention with Blockwise Transformers for Near-Infinite Context" TLDR: Ring Attention: A distributed algorithm to efficiently compute exact attention over arbitrarily long sequence lengths. Details in thread 👇
    arxiv.org
    Ring Attention with Blockwise Transformers for Near-Infinite Context
    Transformers have emerged as the architecture of choice for many state-of-the-art AI models, showcasing exceptional performance across a wide range of AI applications. However, the memory demands...
    12K
  • Zack Ankner
    @ZackAnkner
    May 24, 2023
    Should we really use constant masking rates for BERT pretraining? We introduce dynamic masking rate schedules, a simple but effective method for improving masked language modeling (MLM) pretraining. Paper: zackankner.com/mlm-schedule.p…
    24K
  • Zack Ankner
    @ZackAnkner
    Nov 11, 2024
    There have been a lot of anecdotes about the Llama3 series of models being harder to post-training quantize (PTQ) than Llama2. As part of this paper, we investigated the hypothesis that the degradation from PTQ grows with the token-to-parameter ratio (TPR), i.e., as you overtrain.
    Tanishq Kumar
    @tanishqkumar07
    Nov 11, 2024
    [1/7] New paper alert! Heard about the BitNet hype or that Llama-3 is harder to quantize? Our new work studies both! We formulate scaling laws for precision, across both pre and post-training arxiv.org/pdf/2411.04330. TLDR; - Models become harder to post-train quantize as they
    11K
  • Zack Ankner
    @ZackAnkner
    Sep 25, 2024
    SCoRe paper was a fun read (arxiv.org/abs/2409.12917)! One surprising result was that you don't effectively learn self-correction with naive multi-turn RL. Instead, they do a two-stage approach, where the first stage maximizes the reward while fixing the first-turn's ...
    arxiv.org
    Training Language Models to Self-Correct via Reinforcement Learning
    Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Current methods for training...
    7.9K
  • Zack Ankner
    @ZackAnkner
    Jul 12, 2024
    Hydra was accepted to COLM! Going to be dropping some new perf improvements and batched decoding support as well soon 😁
    Zack Ankner
    @ZackAnkner
    Feb 9, 2024
    Excited to announce Hydra decoding! 🚀 We introduce sequential dependence in Medusa decoding and achieve up to a 1.31x and 2.71x improvement in throughput as compared to Medusa and baseline decoding! Paper: arxiv.org/abs/2402.05109 Github: github.com/zankner/Hydra
    6.3K
  • Zack Ankner
    @ZackAnkner
    Oct 9, 2024
    Agreed ;) But in all seriousness, it's cool to see everyone converging on reward models that perform explicit reasoning by critiquing out loud. Super excited to see how people build on top of these works.
    Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Oct 8, 2024
    Imitation is the best form of flattery ;) Great to see more work on generative verifiers and reward models.
    12K
  • Zack Ankner
    @ZackAnkner
    Sep 5, 2024
    Code and models for Critique-out-Loud (CLoud) reward models are finally public! The repo comes with a gradio demo you can run, so hopefully people can mess around with the models 😃 Code: github.com/zankner/CLoud
    Zack Ankner
    @ZackAnkner
    Aug 22, 2024
    Excited to announce our new work: Critique-out-Loud (CLoud) reward models. CLoud reward models first produce a chain of thought critique of the input before predicting a scalar reward, allowing reward models to reason explicitly instead of implicitly! arxiv.org/abs/2408.11791
    8.9K
  • Zack Ankner
    @ZackAnkner
    Sep 4, 2023
    Replying to @ZackAnkner
    It’s especially hard being an undergraduate researcher who already has to balance a full-time class schedule while trying to fit in research. But instead of just complaining I would like to highlight the paper because I am proud of the work we did.
    3.5K
  • Zack Ankner
    @ZackAnkner
    Oct 6, 2024
    Heading to @COLM_conf to present our work Hydra! Would love to meet people there so please DM me if you want to chat about reward models, verifiers, ML sys, inference time compute ... and honestly anything else.
    2K
  • Zack Ankner
    @ZackAnkner
    Jun 3, 2024
    Super cool to see our work picked up by @_akhaliq as someone who has been reading the daily paper dumps for a while 😁! Can also find my summary of the paper here: x.com/ZackAnkner/sta…
    AK
    @_akhaliq
    Jun 3, 2024
    Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models. In this work, we investigate whether small language models can determine high-quality subsets of large-scale text datasets that improve the performance of larger language
    15K