hud
70 posts
hud
@hud_evals
RL environments + evals for agents | @ycombinator | we're hiring!
hud.ai
Joined January 2025
8
Following
2,626
Followers
  • Pinned
    hud
    @hud_evals
    Jul 2, 2025
    we're actively hiring for these roles btw👀
    atlas
    BLACKBOX AI
    @creatine_cycle
    Jul 1, 2025
    the jobs left after the singularity will be: - agentic workflow engineer - twink - chief of staff
    62K
  • hud reposted
    Aaron Epstein
    Y Combinator
    @aaron_epstein
    Jun 18
    HUD has been on fire, being used by some of the largest companies in the world to build RL environments. Congrats to @hud_evals @jayendra_ram @parth220 @seeklis on the series A from @Standard_Cap!
    hud
    @hud_evals
    Jun 18
    Today, HUD is excited to share our Series A funding! We are the platform for building high quality post training datasets. Over 50 businesses use HUD to build RL environments, sell them to AI labs, or train their own models from them. Our mission is to enable a generation of
    Image
    9.5K
  • hud reposted
    Dalton Caldwell
    Standard Capital
    @daltonc
    Jun 18
    HUD is a very interesting company, the big idea here is to provide entrepreneurs anywhere in the world the tools and infrastructure they need to get into the data business. HUD : ScaleAI :: Airbnb : Hilton
    hud
    @hud_evals
    Jun 18
    Today, HUD is excited to share our Series A funding! We are the platform for building high quality post training datasets. Over 50 businesses use HUD to build RL environments, sell them to AI labs, or train their own models from them. Our mission is to enable a generation of
    Image
    28K
  • hud
    @hud_evals
    Jun 18
    Today, HUD is excited to share our Series A funding! We are the platform for building high quality post training datasets. Over 50 businesses use HUD to build RL environments, sell them to AI labs, or train their own models from them. Our mission is to enable a generation of
    Image
    72K
    hud
    @hud_evals
    Jun 18
    Come join us and celebrate this weekend at HUD's Frontier/RSI RL Environments Hackathon @ the YC HQ in San Francisco. We have an all-star cast of cosponsors such as: @ycombinator, @modal, @GoogleDeepMind, @OpenAI, @AnthropicAI , @daytonaio, @FireworksAI_HQ, @MiniMax_AI,
    Image
    7.8K
    hud
    @hud_evals
    Jun 18
    Here's the link to the event:
    Image
    HUD Frontier/RSI RL Environments Hackathon
    From events.ycombinator.com
    1.4K
  • hud
    @hud_evals
    Jun 12
    Excited to welcome @OpenAI and @arcprize as co-sponsors! HUD RL for RSI hackathon, June 20th-21st @ YC HQ. Signups close tomorrow! 📢
    Image
    6.9K
    hud
    @hud_evals
    Jun 12
    apply here 👇
    Image
    HUD Frontier/RSI RL Environments Hackathon
    From events.ycombinator.com
    598
  • hud
    @hud_evals
    Jun 10
    btw, if you win our RL for RSI hackathon, u get a cool robot dog 🐕‍🦺 June 20th-21st @ YC HQ. Signups close in 3 days! 👇
    Image
    Image
    hud
    @hud_evals
    May 16
    Announcing HUD's RL environments for RSI hackathon! 🎉 Join us June 20–21 in SF if you're interested in RL and want to push the frontier forward! (w/$100,000+ in prizes and compute credits 👀)
    3.1K
    hud
    @hud_evals
    Jun 10
    apply here by 13th June (!) 👉
    Image
    HUD Frontier/RSI RL Environments Hackathon
    From events.ycombinator.com
    307
  • hud
    @hud_evals
    May 16
    Announcing HUD's RL environments for RSI hackathon! 🎉 Join us June 20–21 in SF if you're interested in RL and want to push the frontier forward! (w/$100,000+ in prizes and compute credits 👀)
    Image
    70K
    hud
    @hud_evals
    May 16
    You can improve models at anything you can verify. The only question left: what will you teach them? Imagine what 2040 looks like. Then work backwards. Build environments and agents to push the frontier in coding, ML research, robotics, manufacturing, autonomous businesses.
    2.5K
    hud
    @hud_evals
    May 16
    No prior RL experience required. Just ambition. Apply here → events.ycombinator.com/hud-frontier-j… Special thanks to our partners! @ycombinator, @AnthropicAI, @GoogleDeepMind, @modal, @daytonaio, @ExaAILabs, @FireworksAI_HQ, @sixtyfourai, @MiniMax_AI, @AntimLabs.
    Image
    HUD Frontier/RSI RL Environments Hackathon
    From events.ycombinator.com
    2.1K
  • hud
    @hud_evals
    May 9
    This Tuesday HUD is hosting Strange Evals. This session: if VLM reasoning benchmarks are saturated, why can't Claude make me a decent PPT? DM if you'd like to join!
    Vincent Koc
    OpenClaw🦞
    @vincent_koc
    May 4
    For my eval-maxxing nerds out there, good friends of mine are running a series called "strange evals", you can benchmaxx now on anything. If in SF swing by! luma.com/lvqbs1mo
    1.9K
  • hud
    @hud_evals
    Mar 18
    AI agents are deploying to prod, but can they autonomously find and patch unseen critical vulnerabilities? We introduce ZeroDayBench, a benchmark for evaluating LLM agents on proactive cyberdefense. Plus, a novel high-severity (CVSS 8.1) CVE we found partway through ... 👀
    Image
    37K
    hud
    @hud_evals
    Mar 18
    Replying to @hud_evals
    While creating ZeroDayBench, a member of our team discovered CVE-2025-14279, a high-severity DNS rebinding vulnerability in the MLflow REST server allowing full read/write access to a user’s endpoint w/o authentication. Read more on: huntr.com/bounties/ef478…
    1.3K
    hud
    @hud_evals
    Mar 18
    Check out the full paper by @unrelated333, @louis_sloot, @WinterCawfie as well as @Shark_Academia, @super_bavario and @jdchawla29 from @hud_evals !
    arXiv logo
    arxiv.org
    ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day...
    Large language models (LLMs) are increasingly being deployed as software engineering agents that autonomously contribute to repositories. A major benefit these agents present is their ability to...
    1.1K
