Ted Moskovitz
314 posts
Ted Moskovitz
@ted_moskovitz
science of scaling at @AnthropicAI
London, UK
tedmoskovitz.github.io
Joined October 2019
247 Following
1,107 Followers
  • Ted Moskovitz
    @ted_moskovitz
    Feb 6, 2023
    Tired of ChatGPT screenshots? Miss the old days of watching RL agents walking around doing weird stuff? Look no further–I'm excited to share my @DeepMind internship project, where we develop a method to stabilize optimization in constrained RL. Link: arxiv.org/abs/2302.01275 🧵
    120K
  • Ted Moskovitz
    @ted_moskovitz
    May 16, 2024
    I joined the multimodal team at @AnthropicAI this week—really excited to build some cool stuff!
    34K
  • Ted Moskovitz
    @ted_moskovitz
    May 18, 2020
    (1/2) I’ve uploaded notes I took for the Gatsby theoretical neuro course here: tedmoskovitz.github.io/tn_guide.pdf. I hope they can be useful for anyone interested in theoretical neuroscience (or anyone who’s very bored in quarantine)!
  • Ted Moskovitz
    @ted_moskovitz
    Oct 17, 2023
    Worried your LLM will produce too many paperclips? Simply tell it when to stop – excited to share our new preprint, where we introduce an approach based on constrained RL to avoid overoptimization for compound reward models: arxiv.org/abs/2310.04373 1/
    55K
  • Ted Moskovitz
    @ted_moskovitz
    Jun 10, 2022
    In honor of Twitter’s role as Research-LinkedIn, I’m excited to say that next week I’m starting an internship at @DeepMind working with @TZahavy along with @bodonoghue85 and the rest of the Discovery team—really looking forward to it!!
  • Ted Moskovitz
    @ted_moskovitz
    Jan 23, 2023
    Very excited that our work linking dual process cognition, multitask RL, and the minimum description length principle was accepted at #ICLR2023! OpenReview link: openreview.net/forum?id=oX3tG… Very grateful for my awesome coauthors @kao_calvin, Maneesh Sahani, & Matt Botvinick
    openreview.net
    Minimum Description Length Control
    We propose a novel framework for multitask reinforcement learning which seeks to distill shared structure among tasks into a low-complexity representation, which is then leveraged to accelerate...
    9.6K
  • Ted Moskovitz
    @ted_moskovitz
    Jan 16, 2024
    Really happy to say that constrained RLHF was accepted at #ICLR2024 as a spotlight! A million thanks to my coauthors @Aaditya6284, @djstrouse, Tuomas Sandholm, @rsalakhu, @ancadianadragan, and @McaleerStephen. Looking forward to seeing everyone in Vienna!
    14K
  • Ted Moskovitz
    @ted_moskovitz
    Oct 15, 2020
    (1/n) Ever wonder how trust regions connect to natural gradients in RL? Like the way ‘Wasserstein’ rolls off the tongue? You might like “Efficient Wasserstein Natural Gradients for Reinforcement Learning” — w/ @MichaelArbel, @fhuszar, & @ArthurGretton arxiv.org/abs/2010.05380
  • Ted Moskovitz
    @ted_moskovitz
    Jan 26, 2022
    The successor rep. (SR) *sums discounted state occupancies to speed up policy eval + learning. But in naturalistic tasks, reward may only be available upon *first access to a state: openreview.net/forum?id=JBAZe… In our #ICLR2022 paper, we introduce the first-occupancy rep. (FR) 1/
    openreview.net
    A First-Occupancy Representation for Reinforcement Learning
    Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and which enable them to efficiently traverse their environments to reach...
  • Ted Moskovitz
    @ted_moskovitz
    Jun 20, 2024
    Really excited this is out!
    Anthropic
    @AnthropicAI
    Jun 20, 2024
    Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: claude.ai
    Benchmark table showing Claude 3.5 Sonnet outperforming (as indicated by green highlights) other AI models on graduate level reasoning, code, multilingual math, reasoning over text, and more evaluations. Models compared include Claude 3 Opus, GPT-4o, Gemini 1.5 Pro, and Llama-400b.
    7.5K
  • Ted Moskovitz
    @ted_moskovitz
    Oct 23, 2019
    Since I’m now on Twitter, guess it’s time for an act of shameless self-promotion - have a new preprint on learning preconditioning matrices for SGD with @ruiwang2uiuc, @lanjanice, @activatedgeek, @ThomasMiconi, @jasonyo, and Aditya Rawal at @UberAILabs
    arxiv.org
    First-Order Preconditioning via Hypergradient Descent
    Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be...
  • Ted Moskovitz
    @ted_moskovitz
    Apr 25, 2023
    Happy to say ReLOAD hopped its way into #ICML2023—looking forward to seeing everyone in Hawaii!!
    Ted Moskovitz
    @ted_moskovitz
    Feb 6, 2023
    Tired of ChatGPT screenshots? Miss the old days of watching RL agents walking around doing weird stuff? Look no further–I'm excited to share my @DeepMind internship project, where we develop a method to stabilize optimization in constrained RL. Link: arxiv.org/abs/2302.01275 🧵
    5.3K
  • Ted Moskovitz
    @ted_moskovitz
    Jan 21, 2022
    Really happy to share that our paper “Towards an Understanding of Default Policies in Multitask Policy Optimization” was accepted as an oral at @AISTATS! Paper: arxiv.org/abs/2111.02994 Code: github.com/tedmoskovitz/t… w/ @MichaelArbel, @jparkerholder, @aldopacchiano 1/n
  • Ted Moskovitz
    @ted_moskovitz
    Jul 23, 2023
    Really excited to be at #ICML2023 in Hawaii, a great place to relax, "ReLOAD," and (ofc) stand inside talking about LLMs! If you like safe, efficient RL agents that obey constraints, come by Tues. @ 5 in Hall 1 + check out our demo: colab.research.google.com/drive/10EoV2nv… Hope to see you there!
    Ted Moskovitz
    @ted_moskovitz
    Feb 6, 2023
    Tired of ChatGPT screenshots? Miss the old days of watching RL agents walking around doing weird stuff? Look no further–I'm excited to share my @DeepMind internship project, where we develop a method to stabilize optimization in constrained RL. Link: arxiv.org/abs/2302.01275 🧵
    colab.research.google.com
    toy_reload.ipynb
    Colaboratory notebook
    5.8K