Ted Moskovitz (@ted_moskovitz) / X

Ted Moskovitz

314 posts

@ted_moskovitz

science of scaling at @AnthropicAI

London, UK

tedmoskovitz.github.io

Joined October 2019

Ted Moskovitz
@ted_moskovitz
Feb 6, 2023
Tired of ChatGPT screenshots? Miss the old days of watching RL agents walking around doing weird stuff? Look no further–I'm excited to share my @DeepMind internship project, where we develop a method to stabilize optimization in constrained RL. Link: arxiv.org/abs/2302.01275 🧵
120K
Ted Moskovitz
@ted_moskovitz
May 16, 2024
I joined the multimodal team at @AnthropicAI this week—really excited to build some cool stuff!
34K
Ted Moskovitz
@ted_moskovitz
May 18, 2020
(1/2) I’ve uploaded notes I took for the Gatsby theoretical neuro course here: tedmoskovitz.github.io/tn_guide.pdf. I hope they can be useful for anyone interested in theoretical neuroscience (or anyone who’s very bored in quarantine)!
Ted Moskovitz
@ted_moskovitz
Oct 17, 2023
Worried your LLM will produce too many paperclips? Simply tell it when to stop – excited to share our new preprint, where we introduce an approach based on constrained RL to avoid overoptimization for compound reward models: arxiv.org/abs/2310.04373 1/
55K
Ted Moskovitz
@ted_moskovitz
Jun 10, 2022
In honor of Twitter’s role as Research-LinkedIn, I’m excited to say that next week I’m starting an internship at @DeepMind working with @TZahavy along with @bodonoghue85 and the rest of the Discovery team—really looking forward to it!!
Ted Moskovitz
@ted_moskovitz
Jan 23, 2023
Very excited that our work linking dual process cognition, multitask RL, and the minimum description length principle was accepted at #ICLR2023! OpenReview link: openreview.net/forum?id=oX3tG… Very grateful for my awesome coauthors @kao_calvin, Maneesh Sahani, & Matt Botvinick
openreview.net
Minimum Description Length Control
We propose a novel framework for multitask reinforcement learning which seeks to distill shared structure among tasks into a low-complexity representation, which is then leveraged to accelerate...
9.6K
Ted Moskovitz
@ted_moskovitz
Jan 16, 2024
Really happy to say that constrained RLHF was accepted at #ICLR2024 as a spotlight! A million thanks to my coauthors @Aaditya6284, @djstrouse, Tuomas Sandholm, @rsalakhu, @ancadianadragan, and @McaleerStephen. Looking forward to seeing everyone in Vienna!
14K
Ted Moskovitz
@ted_moskovitz
Oct 15, 2020
(1/n) Ever wonder how trust regions connect to natural gradients in RL? Like the way ‘Wasserstein’ rolls off the tongue? You might like “Efficient Wasserstein Natural Gradients for Reinforcement Learning” — w/ @MichaelArbel, @fhuszar, & @ArthurGretton arxiv.org/abs/2010.05380
Ted Moskovitz
@ted_moskovitz
Jan 26, 2022
The successor rep. (SR) *sums* discounted state occupancies to speed up policy eval + learning. But in naturalistic tasks, reward may only be available upon *first* access to a state: openreview.net/forum?id=JBAZe… In our #ICLR2022 paper, we introduce the first-occupancy rep. (FR) 1/
openreview.net
A First-Occupancy Representation for Reinforcement Learning
Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and which enable them to efficiently traverse their environments to reach...
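A rough sketch of the SR/FR distinction described in the post above, computed for a single sampled trajectory. The function names and the toy trajectory are hypothetical and not taken from the paper or its code; the paper defines both quantities as expectations under a policy, whereas this only scores one rollout.

```python
def successor_rep(trajectory, n_states, gamma):
    """Sum the discount gamma**t over every visit to each state."""
    sr = [0.0] * n_states
    for t, s in enumerate(trajectory):
        sr[s] += gamma ** t
    return sr


def first_occupancy_rep(trajectory, n_states, gamma):
    """Keep the discount gamma**t of the first visit only; repeats add nothing."""
    fr = [0.0] * n_states
    seen = set()
    for t, s in enumerate(trajectory):
        if s not in seen:
            fr[s] = gamma ** t
            seen.add(s)
    return fr


# Hypothetical rollout over three states; state 0 is visited twice.
trajectory = [0, 1, 0, 2]
sr = successor_rep(trajectory, n_states=3, gamma=0.5)
fr = first_occupancy_rep(trajectory, n_states=3, gamma=0.5)
print(sr, fr)
```

The two vectors differ only for the revisited state: the SR keeps accumulating on the second visit, while the FR records just the first arrival.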
Ted Moskovitz
@ted_moskovitz
Jun 20, 2024
Really excited this is out!
Anthropic
@AnthropicAI
Jun 20, 2024
Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: claude.ai
7.5K
Ted Moskovitz
@ted_moskovitz
Oct 23, 2019
Since I’m now on Twitter, guess it’s time for an act of shameless self-promotion - have a new preprint on learning preconditioning matrices for SGD with @ruiwang2uiuc, @lanjanice, @activatedgeek, @ThomasMiconi, @jasonyo, and Aditya Rawal at @UberAILabs
arxiv.org
First-Order Preconditioning via Hypergradient Descent
Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be...
Ted Moskovitz
@ted_moskovitz
Apr 25, 2023
Happy to say ReLOAD hopped its way into #ICML2023—looking forward to seeing everyone in Hawaii!!
5.3K
Ted Moskovitz
@ted_moskovitz
Jan 21, 2022
Really happy to share that our paper “Towards an Understanding of Default Policies in Multitask Policy Optimization” was accepted as an oral at @AISTATS! Paper: arxiv.org/abs/2111.02994 Code: github.com/tedmoskovitz/t… w/ @MichaelArbel, @jparkerholder, @aldopacchiano 1/n
Ted Moskovitz
@ted_moskovitz
Jul 23, 2023
Really excited to be at #ICML2023 in Hawaii, a great place to relax, "ReLOAD," and (ofc) stand inside talking about LLMs! If you like safe, efficient RL agents that obey constraints, come by Tues. @ 5 in Hall 1 + check out our demo: colab.research.google.com/drive/10EoV2nv… Hope to see you there!
colab.research.google.com
toy_reload.ipynb
Colaboratory notebook
5.8K