RELAI (@ReliableAI) / X

RELAI

41 posts

Self-Improving AI Agents.

Joined November 2021

RELAI
@ReliableAI
Oct 27, 2025
🚀 RELAI is live — a platform for building reliable AI agents 🔁 We complete the learning loop for agents: simulate → evaluate → optimize - Simulate with LLM personas, mocked MCP servers/tools and grounded synthetic data - Evaluate with code + LLM evaluators; turn human
12K
RELAI
@ReliableAI
Oct 29, 2024
Hi AI world ✋ We are committed to making AI reliability accessible and achievable for everyone! First up: RELAI agents for hallucination detection. Chat with popular LLMs, verify with RELAI. Try it for free today: relai.ai #AI #LLM #ReliableAI #Hallucination
relai.ai
RELAI — The continual learning engine for agents
Continual learning platform for AI agents. Turn failures and feedback into replayable environments that optimize agent performance without regressions.
37K
RELAI
@ReliableAI
Oct 31, 2024
👻 No tricks, just AI reliability! 🎃 See our hallucination detection agents in action this Halloween! Get your own access (no costume required): platform.relai.ai/auth/waitlist #HappyHalloween #AI
990
RELAI
@ReliableAI
Oct 31, 2024
Our agents are working 24/7, and more registration codes are rolling out! Try out our hallucination detection agents for free now! Request your registration code here: platform.relai.ai/auth/waitlist
1.4K
RELAI
@ReliableAI
Oct 31, 2025
🎃 Here’s a sweet Halloween treat from RELAI: We built an AI agent that maps the best trick-or-treat route for you—optimized for time, distance, candy variety, and real walking paths. 👉 Try it free: platform.relai.ai/halloween Built at RELAI.ai, where we ship
292
RELAI
@ReliableAI
Nov 15, 2024
We evaluated several hallucination detection methods on OpenAI's recently released SimpleQA benchmark. RELAI agents detected over 76% of GPT-4o's hallucinations with just a 5% false positive rate. Even more impressively, RELAI detected nearly 1/3 of GPT-4o's hallucinations with
RELAI
@ReliableAI
Nov 15, 2024
Article
RELAI Sets New State-of-the-Art for LLM Hallucination Detection
By: @Wenxiao__Wang, @siddhantbharti, @PKattakinda, @FeiziSoheil Try it out yourself: RELAI agents are accessible for individual and enterprise users at: relai.ai Summary SimpleQA...
1.4K
RELAI
@ReliableAI
Sep 8, 2025
Prompt Tuning ≠ System Tuning. Most AI agent failures are structural; we keep the agent graph frozen (modules & info flow), then wonder why agents hallucinate, misroute tools, or break guidelines. Meet Maestro: the first joint graph + config optimizer for AI agents. It
Soheil Feizi
@FeiziSoheil
Sep 8, 2025
Introducing Maestro: the holistic optimizer for AI agents. Maestro optimizes the agent graph and tunes prompts/models/tools, fixing agent failure modes that prompt-only or RL weight tuning can’t touch. Maestro outperforms leading prompt optimizers (e.g., MIPROv2, GEPA) on
2.4K
RELAI
@ReliableAI
Apr 22, 2025
Our leaderboard is now live! Check it out at: relai.ai Let us know if you want to add your model or data to our leaderboard: calendly.com/d/crx2-k7b-pcm
Soheil Feizi
@FeiziSoheil
Apr 22, 2025
🚀 RELAI Leaderboard is now live at relai.ai 🔍 Models: Currently our leaderboard shows performance for: o4-mini and GPT-4o @OpenAI @sama, Grok 2 @xai @elonmusk @ChrSzegedy, Gemini 2.0 Flash @Gemini @JeffDean, and Llama 3.3 70B @AIatMeta 📊 Benchmarks: we use
402
RELAI
@ReliableAI
Nov 21, 2024
We are hiring! Join us to make AI reliability achievable and accessible for everyone! Details below:
Soheil Feizi
@FeiziSoheil
Nov 21, 2024
Please spread the word! 🚀 Join RELAI (relai.ai) to work on cutting-edge AI projects! We have full-time and internship positions available in:
- AI/ML Research & Development: lnkd.in/dEbyQz-U
- Software Developers: lnkd.in/daa2jhRy
- Sales
546
RELAI
@ReliableAI
Apr 22, 2025
Vote! Quantitative results on RELAI leaderboard will be released soon! ⌛
Soheil Feizi
@FeiziSoheil
Apr 22, 2025
🤖 In your experience, LLMs struggle the most with understanding and generating code for:
1.1K
RELAI
@ReliableAI
Apr 21, 2025
🚀 Meet our Data Agents! 📅 Want your own custom benchmarks? Book a demo here: calendly.com/d/crx2-k7b-pcm
Soheil Feizi
@FeiziSoheil
Apr 21, 2025
🚀 Introducing Data Agents— generate accurate, reasoning-based AI benchmarks from your own data in minutes! ⚡ With Data Agents, we’ve created 100+ benchmarks with 100K+ samples using docs from tools like React, PyTorch, Kubernetes, LangChain, and more. 📂 All benchmarks are
375
RELAI
@ReliableAI
Jan 15, 2025
Congratulations to RELAI's founder & CEO for receiving the #PECASE award!
UMD Department of Computer Science
@umdcs
Jan 15, 2025
🎉🎉👏 Congrats to Associate Professor Soheil Feizi (@FeiziSoheil) on being honored by Pres. Biden (@POTUS) with the Presidential Early Career Award for Scientists and Engineers (PECASE)—the nation’s highest award for early-career researchers! Read more: go.umd.edu/Feizi-PECASE
960
RELAI
@ReliableAI
May 1, 2025
Our legal reasoning benchmark is live! Check it out! Let us know if you want to add your model or data to our leaderboard: calendly.com/d/crx2-k7b-pcm
Soheil Feizi
@FeiziSoheil
May 1, 2025
🚨 Releasing the SCOTUS 2024 Legal Scenarios Benchmark 🚨 We’re excited to launch a new benchmark with 200+ realistic legal dilemmas from 2024 Supreme Court slip opinions—built using RELAI Data Agents. We tested top LLMs on legal reasoning: 🥇 o4-mini — 76.4% @OpenAI @sama
440