Matthew Finlayson
169 posts
Matthew Finlayson
@mattf1n
@GoodfireAI fellow, PhD at @nlp_usc | Former predoc at @allen_ai on @ai2_aristo | Harvard 2021 CS & Linguistics
Los Angeles, CA
mattf1n.github.io
Joined October 2013
939
Following
1,109
Followers
  • Pinned
    Matthew Finlayson
    @mattf1n
    Oct 17, 2025
    We discovered that language models leave a natural "signature" on their API outputs that's extremely hard to fake. Here's how it works 🔍 📄 arxiv.org/abs/2510.14086 1/
    arxiv.org
    Every Language Model Has a Forgery-Resistant Signature
    The ubiquity of closed-weight language models with public-facing APIs has generated interest in forensic methods, both for extracting hidden model details (e.g., parameters) and for identifying...
    17K
  • Matthew Finlayson
    @mattf1n
    Mar 15, 2024
    Wanna know gpt-3.5-turbo's embed size? We find a way to extract info from LLM APIs and estimate gpt-3.5-turbo’s embed size to be 4096. With the same trick we also develop 25x faster logprob extraction, audits for LLM APIs, and more! 📄 arxiv.org/abs/2403.09539 Here’s how 1/🧵
    Image
    158K
  • Matthew Finlayson
    @mattf1n
    Nov 9, 2022
    Can a language model help you with your math homework? Not on its own, but maybe with the help of a Python interpreter! In our EMNLP paper we present 📜 Līla and 🤖 Bhāskara, a math reasoning benchmark and model. 📄: arxiv.org/abs/2210.17517 🔗: lila.apps.allenai.org 1/🧵
    The title and authors of Lila: A Unified Benchmark for Mathematical Reasoning. The authors are Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, and Ashwin Kalyan
    An example from the Lila data set. It includes a math question, instructions, two programs that print the answer, and the answer itself.
  • Matthew Finlayson
    @mattf1n
    Oct 4, 2023
    Nucleus and top-k sampling are ubiquitous, but why do they work? @johnhewtt, @alkoller, @swabhz, @Ashish_S_AI and I explain the theory and give a new method to address model errors at their source (the softmax bottleneck)! 📄 arxiv.org/abs/2310.01693 🧑‍💻 github.com/mattf1n/basis-…
    Image
    33K
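For readers unfamiliar with the two truncation rules named in the post above, here is a minimal sketch of standard top-k and nucleus (top-p) filtering over a next-token distribution. It is not the new method from the linked paper; the function names and the toy probabilities are assumptions made for illustration.

```python
import numpy as np

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    probs = np.asarray(probs, dtype=float)
    keep = np.argsort(probs)[::-1][:k]
    out = np.zeros_like(probs)
    out[keep] = probs[keep]
    return out / out.sum()

def nucleus_filter(probs, p):
    """Keep the smallest set of top tokens whose total mass is at least p."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    # Number of tokens needed to reach mass p (always at least one).
    cutoff = int(np.searchsorted(cumulative, p)) + 1
    keep = order[:cutoff]
    out = np.zeros_like(probs)
    out[keep] = probs[keep]
    return out / out.sum()

# Toy next-token distribution (assumed values, not from any model).
probs = np.array([0.5, 0.25, 0.15, 0.07, 0.03])
top2 = top_k_filter(probs, k=2)
nucleus = nucleus_filter(probs, p=0.8)
```

Both rules zero out the low-probability tail and renormalize what is left; they differ only in whether the cutoff is a fixed token count or a fixed probability mass.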
  • Matthew Finlayson
    @mattf1n
    May 24, 2022
    🎉 New pre-print 🎉 Teaching models to follow instructions is becoming popular, but what makes Instruction Learning hard in the first place? We investigate with synthetic data and build a challenge dataset! My first paper with @ai2_aristo at @allen_ai arxiv.org/abs/2204.09148
    Image
  • Matthew Finlayson
    @mattf1n
    Dec 15, 2022
    Applying to a UW PhD today? Did you write your SoP in LaTeX? Pro tip:
    Image
    8.3K
  • Matthew Finlayson
    @mattf1n
    Apr 2, 2024
    My high quality data annotations aren’t free so I will now select 1 wrong answer per captcha. It still works btw.
    Image
    2.1K
  • Matthew Finlayson
    @mattf1n
    Mar 15, 2024
    Replying to @mattf1n
    In a remarkable case of simultaneous discovery, a paper released earlier this week (arxiv.org/abs/2403.06634) also finds that LLM APIs leak information. We are excited for our colleagues and believe that our papers complement and strengthen one another. Amazing work! 8/8
    Aran Komatsuzaki
    @arankomatsuzaki
    Mar 12, 2024
    Google presents: Stealing Part of a Production Language Model - Extracts the projection matrix of OpenAI’s ada and babbage LMs for <$20 - Confirms that their hidden dim is 1024 and 2048, respectively - Also recovers the exact hidden dim size of gpt-3.5-turbo
    Image
    3.2K
  • Matthew Finlayson
    @mattf1n
    Dec 9, 2022
    📣 Today at EMNLP I will be giving my first-ever in-person oral presentation!!! Come hear about how we used formal languages to learn what kinds of instructions LMs can follow 🤖 Hall A-B at 2pm
    Matthew Finlayson
    @mattf1n
    May 24, 2022
    🎉 New pre-print 🎉 Teaching models to follow instructions is becoming popular, but what makes Instruction Learning hard in the first place? We investigate with synthetic data and build a challenge dataset! My first paper with @ai2_aristo at @allen_ai arxiv.org/abs/2204.09148
    Image
  • Matthew Finlayson
    @mattf1n
    Nov 9, 2022
    Replying to @mattf1n
    7/🧵 Curious about the name choices? Our benchmark is named after Līlāvatī, a treatise by the 12th-century Indian mathematician Bhāskara. Read more in our blog post. blog.allenai.org/lila-a-unified…
  • Matthew Finlayson
    @mattf1n
    Mar 15, 2024
    Replying to @mattf1n
    LLM outputs lie in a low-dimensional vector space (we call this the LLM’s image). By “low-dimensional”, we mean EXACTLY the embed size. To find the embed size we check the dimension of the space that the outputs span. Ok, but does this have any practical uses? 2/🧵
    Image
    2.1K
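The dimension check described in the post above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' code: the toy sizes, the random `unembed` matrix standing in for a model's final layer, and the use of raw logits (rather than whatever a real API returns) are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_size, num_queries = 200, 16, 64  # toy sizes, assumed

# Hypothetical stand-in for a model's final linear layer (vocab x embed).
unembed = rng.standard_normal((vocab_size, embed_size))

# One row of logits per simulated query: a hidden state projected to the vocab.
hidden = rng.standard_normal((num_queries, embed_size))
logits = hidden @ unembed.T  # shape (num_queries, vocab_size)

# The outputs span at most embed_size dimensions, so count the singular
# values that are not numerically zero to estimate that dimension.
singular_values = np.linalg.svd(logits, compute_uv=False)
estimated_embed_size = int(np.sum(singular_values > 1e-8 * singular_values[0]))
print(estimated_embed_size)
```

With more queries than the embed size, the count of non-negligible singular values matches the `embed_size` used to build the toy model, even though each output vector has `vocab_size` entries.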
  • Matthew Finlayson
    @mattf1n
    Dec 9, 2022
    Come see our #EMNLP2022 poster at 4pm in the Atrium! @Swarooprm7 and I will be there to chat about math reasoning models and evaluations 🎉
    Matthew Finlayson
    @mattf1n
    Nov 9, 2022
    Can a language model help you with your math homework? Not on its own, but maybe with the help of a Python interpreter! In our EMNLP paper we present 📜 Līla and 🤖 Bhāskara, a math reasoning benchmark and model. 📄: arxiv.org/abs/2210.17517 🔗: lila.apps.allenai.org 1/🧵
    The title and authors of Lila: A Unified Benchmark for Mathematical Reasoning. The authors are Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, and Ashwin Kalyan
    An example from the Lila data set. It includes a math question, instructions, two programs that print the answer, and the answer itself.
  • Matthew Finlayson
    @mattf1n
    Nov 9, 2022
    Replying to @mattf1n
    5/🧵 We find that models perform much better when they output a Python program that prints the answer, instead of directly generating the answer. This answer format has the added benefit that it doubles as a step-by-step solution.
    Some examples of questions where the model predicts the wrong answer, but also generates a program that correctly answers the question.
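To make the "program that prints the answer" format concrete, here is a minimal sketch of how such an output could be scored: run the generated program, capture what it prints, and compare that to the reference answer. The question, the generated program, and the `run_program` helper are hypothetical and are not taken from the Līla data set or its evaluation code.

```python
import contextlib
import io

def run_program(source):
    """Execute a program string and return what it printed, stripped."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})  # only run trusted code this way
    return buffer.getvalue().strip()

# Hypothetical model output for:
# "A train travels 60 km per hour for 3 hours. How far does it go?"
generated_program = """
speed_km_per_hour = 60
hours = 3
distance_km = speed_km_per_hour * hours
print(distance_km)
"""

reference_answer = "180"
predicted = run_program(generated_program)
is_correct = predicted == reference_answer
```

The intermediate variables in the program are what let it double as a step-by-step solution. Anything that executes untrusted model output should do so in a sandbox rather than with a bare `exec`.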
  • Matthew Finlayson
    @mattf1n
    Jan 12, 2024
    It was really fun to contribute to this bit of software, and it has some pretty useful applications! If you want logprobs beyond the top-5 that OpenAI (or any other API) gives you, check out our library 🎉
    Jack Morris
    Engram
    @jxmnop
    Jan 11, 2024
    fun research story about how we jailbroke the ChatGPT API: so every time you run inference with a language model like GPT-whatever, the model outputs full probabilities over its entire vocabulary (~50,000 tokens) but when you use their API, OpenAI hides all this info from
    1.7K
