Charlie O'Neill
Baseten
851 posts
@oneill_c
The sea is the sea The old man is an old man The boy is a boy and the fish is a fish The sharks are all sharks no better and no worse
San Francisco
charlesponeill.com
Joined September 2016
1,201
Following
3,795
Followers
  • Pinned
    Charlie O'Neill
    Baseten
    @oneill_c
    Aug 28, 2025
    Today, we’re launching Parsed. We are incredibly lucky to live in a world where we stand on the shoulders of giants, first in science and now in AI. Our heroes have gotten us to this point, where we have brilliant general intelligence in our pocket. But this is a local minimum. We
    107K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Jan 7, 2025
    New preprint! In transformers, we often describe the Q/K/V maps in an ad hoc way, but we show these linear self-attention components form a "parametric endofunctor" in a 2-category of linear maps. This takes cues from @bgavran3 et al.’s programme extended to transformers. (1/10)
    33K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Aug 5, 2024
    New paper with the fantastic @christinexye (+ @jwuphysics, @kartheikiyer): "Disentangling Dense Embeddings with Sparse Autoencoders". We present one of the first applications of SAEs to text embeddings 🧵 Paper: arxiv.org/pdf/2408.00657 Web app: huggingface.co/spaces/charlie…
    7.3K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Aug 28, 2025
    Replying to @oneill_c
    Parsed is the company I always wanted to build and the company I wish existed in the world. I’m so lucky to work with the smartest people I know, to be backed by some of the best investors and angels in the world (LocalGlobe, HuggingFace, DeepMind, NHS), and to have customers who
    Parsed | Custom, interpretable AI systems that continuously learn
    From parsed.com
    3.3K
  • Charlie O'Neill
    Baseten
    @oneill_c
    May 19, 2025
    With all the buzz around reasoning models, the questions I keep coming back to around RL for LLMs are: - how well does this do compared to SFTing on correct outputs of a specific task - (related to above) how much more data-efficient is RL than SFT (how much extra bang for buck do
    2.8K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Sep 2, 2024
    1/7 Why are we using such simple encoders in sparse autoencoders (SAEs) for LLM interpretability? The current approach uses a basic linearity-nonlinearity, but transformers are capable of much more complex computations
    4.2K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Jan 7, 2025
    Replying to @oneill_c
    @bgavran3 et al. argue in arxiv.org/abs/2402.15332 that deep learning frameworks either impose constraints (GDL) or specify tensor ops (RNNs/Transformers). We need a single theory bridging both views - hence category theory. (2/10)
    arxiv.org
    Position: Categorical Deep Learning is an Algebraic Theory of All...
    We present our position on the elusive quest for a general-purpose framework for specifying and studying deep learning architectures. Our opinion is that the key attempts made so far lack a...
    1.7K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Feb 3, 2025
    1/5 Chris Olah’s "What is a Linear Representation? What is a Multidimensional Feature?" led me to revisit the term "one-dimensional feature." At first glance I equated this with the activation dimensionality (e.g. GPT‑2's 768), but @ch402 obviously uses it in a different sense
    1K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Jan 17, 2025
    Replying to @thesephist
    Mathpix is pretty good, maybe slightly slow (10s per 150 pages maybe?), but really good with LaTeX and mathtex formatting
    8.2K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Dec 21, 2024
    Thoughts on nihilism arising from the general pace of LM+RL research now on LessWrong, a point bolstered by release of o3 today (link below) (thanks @christinexye for reminding me of the Vonnegut passage)
    3.4K
  • Charlie O'Neill
    Baseten
    @oneill_c
    May 5, 2025
    1/🧵 Thrilled to announce our ICML‑accepted work (done in my undergrad days with @david_klindt and @gumr4n) that questions the linear‑only dogma in sparse autoencoders👇
    1.4K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Mar 27, 2023
    1/ 🧵 I came across a paper shared by @jeremyphoward about parallelisable linear RNNs. The ICLR 2018 paper (openreview.net/pdf?id=HyUNwul…) raises an interesting question: Could RNNs challenge Transformers as the backbone for large language models?
    9.3K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Aug 28, 2025
    Replying to @oneill_c
    @parsedlabs
    2.7K
  • Charlie O'Neill
    Baseten
    @oneill_c
    Apr 25, 2025
    Thought experiment: clone a trillion copies of o3 and let them self-direct research/thinking for a decade. Would even one Einstein/Newton/Ramanujan emerge? I'd price this at ~0. Competence reaches parity; paradigm-creation still shows a human/LLM delta, either fundamental or a
    6.7K
