Jesse Dodge
841 posts
@JesseDodge
Research Scientist at Meta. 10-yr test-of-time ACL 22, Best Demo ACL 25, Best Resource Paper ACL 24, Best Theme Paper ACL 24, Best Student Paper NAACL 15 🏳️‍🌈
jessedodge.ai
Joined March 2009
1,745 Following
3,421 Followers
  • Jesse Dodge
    @JesseDodge
    Dec 6, 2023
    Today Google released Gemini with a 60-page report in which they repeatedly say the training data is key ("We find that data quality is critical to a highly-performing model"), while providing almost no information about how it was made, how it was filtered, or its contents.
  • Jesse Dodge
    @JesseDodge
    Dec 9, 2020
    GPT-3 won a best paper award at #NeurIPS2020! Congratulations to that team, it truly is an incredible piece of work, and has changed the way many of us think about what massive LMs can do. But we should also talk about inequality in the research community -- that work couldn't...
  • Jesse Dodge
    @JesseDodge
    May 10, 2023
    Today Google announced PaLM 2. In their 91 page paper they repeatedly say the training data is key ("we find that the data mixture is a critical component of the final model") while providing almost no information about how it was constructed, how it was sourced, or its contents.
  • Jesse Dodge
    @JesseDodge
    Feb 18, 2020
    Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping arxiv.org/abs/2002.06305 We found surprisingly large variance just from random seeds when fine-tuning BERT. Both weight inits and the order of the training data have big impact. 1/n
  • Jesse Dodge
    @JesseDodge
    Aug 15, 2025
    Personal update: I'm excited to be joining @Meta! I'm deeply grateful for the opportunities I've had at @allen_ai over the past 6 years (including three paper awards in the last two years). Onward to the next chapter! 🥳
  • Jesse Dodge
    @JesseDodge
    May 25, 2022
    WE WON THE ACL 10-YEAR TEST-OF-TIME AWARD!! Ten thousand congratulations to @mmitchell_ai and our co-authors Amit Goyal, @kotymg, @karlstratos, Xufeng Han, Alyssa Mensch, Alex Berg, Tamara Berg, and @haldaume3!
    ACL 2026
    @aclmeeting
    May 25, 2022
    The second of the #acl2022nlp 10-year test of time awards goes to @mmitchell_ai et al. for their work on generating image descriptions published at EACL 2012 #NLProc aclanthology.org/E12-1076/
  • Jesse Dodge
    @JesseDodge
    Dec 6, 2023
    Replying to @JesseDodge
    This follows the trend of white papers that are written to read like research papers which don't actually contain the necessary information for basic science. This is a product, and they are purposely obscuring the most important information that makes the models work.
  • Jesse Dodge
    @JesseDodge
    Jun 22, 2022
    How much CO2 is emitted from training common AI models? New FAccT paper! *Partially* training a 6 B. param. transformer emits about as much as the average US home in a year! Smaller models? Only as much as charging a phone. What can you do? A 🧵: arxiv.org/pdf/2206.05229…
  • Jesse Dodge
    @JesseDodge
    Aug 14, 2024
    Congrats to our team for winning two paper awards at #ACL2024! OLMo won the Best Theme Paper award, and Dolma won a Best Resource Paper award! All the credit goes to the whole team for the massive group effort 🎉🎉
  • Jesse Dodge
    @JesseDodge
    May 8, 2020
    Successfully defended my Ph.D. under quarantine!
  • Jesse Dodge
    @JesseDodge
    Apr 18, 2021
    Ever wonder about the web-scale data massive LMs train on? We wrote some docs for C4! cs.cmu.edu/~jessed/data_h… And we indexed it, and built an interactive demo, so you can search too: c4-search.apps.allenai.org find something cool? report it or discuss here: github.com/allenai/c4-doc…
  • Jesse Dodge
    @JesseDodge
    Apr 19, 2023
    The best way to understand large language models is to understand what they were trained on. Most pretraining datasets have *zero* documentation of their contents! We worked with @nitashatiku and the other WaPo journalists on this piece, check it out!
    Nitasha Tiku
    @nitashatiku
    Apr 19, 2023
    Replying to @nitashatiku
    Here's our analysis of the 15 million websites in just one highly-filtered CommonCrawl web scrape, used to train models like Google's T5 & Facebook's LLaMA
    - copyright symbol appears >200M times
    - pirated sites, 1 for e-books
    - half the top 10 = news sites
    washingtonpost.com/technology/int…
  • Jesse Dodge
    @JesseDodge
    May 10, 2023
    Replying to @JesseDodge
    Now that LLMs are products (not just research), we are at a turning point: for-profit companies will become less and less transparent *specifically* about the components that are most important. Only if the open source community can organize together can we keep up!
  • Jesse Dodge
    @JesseDodge
    Jul 1, 2021
    could not be more proud of @MaartenSap, who just *successfully defended* one of the best PhD theses I've seen. he's already had a successful career, and he's only getting started!
