Drew Breunig
15.9K posts
Drew Breunig
@dbreunig
Writing about and working on AI, DSPy, geo, and data.
Bay Area
dbreunig.com
Joined March 2008
1,148 Following
9,178 Followers
  • Drew Breunig
    @dbreunig
    Oct 21, 2025
    Google summarizes @NousResearch as a t-shirt company with a side business in AI.
    98K
  • Drew Breunig
    @dbreunig
    Jun 23, 2025
    As your context bloats, you hit different failure modes. These failures hit agents hardest because they operate in exactly the scenarios where contexts balloon: gathering information, making sequential tool calls, engaging in multi-turn reasoning, & accumulating histories.
    131K
  • Drew Breunig
    @dbreunig
    Oct 25, 2025
    For people wondering today, yes, Kimi K2 RL’ed writing quality.
    dbreunig.com
    How Kimi K2 RL’ed Qualitative Data to Write Better
    Our last post on Kimi K2 dives into how the Moonshot team used reinforcement learning (RL) on qualitative tasks. If you haven’t already, check out the last two explorations:
    72K
  • Drew Breunig
    @dbreunig
    Oct 6, 2025
    OpenAI's prompt optimizer in AgentKit is GEPA. It's gonna be everywhere...
    Flurin Laim
    @flurin17
    Oct 6, 2025
    Replying to @flurin17
    One of the developers actually confirmed. They use GEPA.
    90K
  • Drew Breunig
    @dbreunig
    Sep 22, 2025
    If the DSPy docs are confusing (I get it, no judgement!), check out my talk from Databricks' Summit. I keep getting notes from people who attended telling me it helped them finally "get" DSPy and they haven't stopped using it since.
    43K
  • Drew Breunig
    @dbreunig
    Nov 9, 2025
    Seems like every other week people find my writing about Kimi K2. Hats off to the @Kimi_Moonshot team for writing such a great technical paper and sharing so many details worth writing about. Here's the paper: github.com/MoonshotAI/Kim… Here's my writing on... How they rephrased
    Kimi-K2/tech_report.pdf at main · MoonshotAI/Kimi-K2
    From github.com
    32K
  • Drew Breunig
    @dbreunig
    Aug 23, 2025
    I got around to kicking the tires on GEPA prompt optimization in DSPy, seeing if it could match the reported gsm8k benchmark for Qwen3-4b-thinking. Started with the simplest signature: qa_bot = dspy.Predict('question -> answer') GEPA got it from 67.2% to 92.8%.
    25K
  • Drew Breunig
    @dbreunig
    Aug 20, 2025
    With tools like DSPy and techniques like GEPA, we're at the 1st level of prompt optimization: easily create effective prompts, for a given model, for a given task. I'm excited for the 2nd level of prompt optimization: constant eval collection via usage, w/ regular prompt optimization.
    19K
  • Drew Breunig
    @dbreunig
    Aug 21, 2025
    Here's your DeepSeek 3.1 headline: the same scores with 25-50% fewer tokens.
    20K
  • Drew Breunig
    @dbreunig
    Oct 18, 2024
    Simplifying the AI noise by segmenting everything into 3 big use cases: Gods, Interns, and Cogs.
    dbreunig.com
    The 3 AI Use Cases: Gods, Interns, and Cogs
    Simplifying and navigating the AI noise by segmenting everything into 3 big use cases.
    117K
  • Drew Breunig
    @dbreunig
    Jun 15, 2025
    Here's the write-up of my Data+AI Summit talk on the perils of prompts in code and how to mitigate them with DSPy. As prompts grow in complexity, they begin to resemble programming. Don't program your prompts. Program your program.
    dbreunig.com
    Let the Model Write the Prompt
    Notes from a talk I delivered at the 2025 Data + AI Summit, detailing the problem with prompts in your code and how DSPy can make everything better.
    49K
  • Drew Breunig
    @dbreunig
    Jan 9, 2025
    Before you pick a model or write a prompt, build your eval. If you’re building with LLMs, your eval is your most valuable asset. It lets you test new models, iterate faster on prompts or pipelines, and ensures your product is always moving forward.
    dbreunig.com
    Your Eval is More Important Than the Model
    A well-built custom eval lets you quickly test the newest models, iterate faster when developing prompts and pipelines, and ensure you’re always moving forward against your product’s specific goal....
    12K
  • Drew Breunig
    @dbreunig
    Jun 23, 2025
    Replying to @dbreunig
    How Long Contexts Fail (a post on how to mitigate these issues is due this week...)
    dbreunig.com
    How Long Contexts Fail
    Taking care of your context is the key to building successful agents. Just because there’s a 1 million token context window doesn’t mean you should fill it.
    45K
  • Drew Breunig
    @dbreunig
    Dec 12, 2024
    Really enjoy DSPy’s workflow for LLM work. Handing off the specifics of prompt generation and engineering back to the LLM makes a lot of sense:
    dbreunig.com
    Pipelines & Prompt Optimization with DSPy
    Writing about AI, geo, culture, media, data, and the ways they interact.
    18K
