If you're building with the AI SDK and want to add RAG to it, we wrote about that in the blog post below:
About us
The RAG platform built for scale: built-in citations, deep research, 22+ file formats, partitions, MCP server, and more.
- Website
- https://agentset.ai
- Industry
- Software Development
- Company size
- 2-10 employees
- Type
- Privately Held
- Founded
- 2025
Updates
-
ZeroEntropy recently released zembed-1. We added it to our embedding leaderboard. Key takeaways:
– Now #1 on our leaderboard
– Wins 55–80% of head-to-head matchups across 16 models (OpenAI, Voyage, Cohere, etc.)
– Strongest on general document retrieval and multilingual queries
Impressive work by the ZeroEntropy (YC W25) team! Read the full breakdown in the post below.
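For context, a head-to-head matchup in this style of eval works roughly like this (an illustrative sketch, not our exact harness): per query, the embedding model that ranks the gold document higher wins, and ties are dropped.

```python
def win_rate(ranks_a: list[int], ranks_b: list[int]) -> float:
    """Head-to-head win rate of model A over model B.

    ranks_a / ranks_b: rank of the gold document per query (1 = best),
    one entry per query, from each embedding model's retrieval.
    Ties are excluded from the denominator.
    """
    wins = sum(a < b for a, b in zip(ranks_a, ranks_b))
    ties = sum(a == b for a, b in zip(ranks_a, ranks_b))
    decided = len(ranks_a) - ties
    return wins / decided if decided else 0.5
```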
-
We looked into how to detect hallucinations in RAG. We tested LLM judges, atomic claim verification, and encoder-based NLI. Even with correct retrieval, models can still produce confident but unsupported answers. Each approach trades off accuracy, latency, and cost. Encoder-based NLI turned out to be the most practical option for production, with some important caveats. Full write-up in the link below:
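For illustration, here's a minimal sketch of the encoder-based NLI check (not our production code; the model name and the label order are assumptions to verify against the model card):

```python
from sentence_transformers import CrossEncoder

# Assumed model; many NLI cross-encoders output scores in the order
# [contradiction, entailment, neutral] — check the model card.
nli = CrossEncoder("cross-encoder/nli-deberta-v3-base")
ENTAILMENT = 1  # assumed index of the entailment label

def unsupported_claims(context: str, claims: list[str], threshold: float = 0.5):
    """Return answer claims whose entailment probability vs. the context is low."""
    # Score each (premise=context, hypothesis=claim) pair.
    scores = nli.predict([(context, c) for c in claims], apply_softmax=True)
    return [
        (claim, float(probs[ENTAILMENT]))
        for claim, probs in zip(claims, scores)
        if probs[ENTAILMENT] < threshold
    ]
```

Anything flagged this way can be routed to a slower, stricter check or surfaced to the user; the threshold is the accuracy/latency/cost dial mentioned above.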
-
A lot of RAG issues we keep seeing (hallucinations, weak citations, unclear answers) often come down to prompt constraints. So we collected a set of 𝗥𝗔𝗚 𝗣𝗿𝗼𝗺𝗽𝘁 𝗧𝗲𝗺𝗽𝗹𝗮𝘁𝗲𝘀 that reflect common patterns used to work through these problems. Templates are copy-pasteable, with upvote / downvote so what works surfaces over time. You can also contribute prompts that have worked well for you. Link in the comments 👇
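To give a flavor, a citation-constrained template might look like this (an illustrative sketch we wrote for this post, not one of the community-submitted templates):

```python
# Illustrative citation-constrained RAG template; placeholder names
# (context, question, chunk_id) are our own, not from the collection.
RAG_TEMPLATE = """You are a careful assistant. Answer using ONLY the context below.

Rules:
1. Support every claim with a citation like [chunk_id].
2. If the context does not contain the answer, say "I don't know."
3. Never use outside knowledge.

Context:
{context}

Question: {question}
Answer:"""

prompt = RAG_TEMPLATE.format(
    context="[c1] Revenue grew 12% in Q3 ...\n[c2] ...",
    question="How much did revenue grow in Q3?",
)
```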
-
We curated a list of 𝗮𝘄𝗲𝘀𝗼𝗺𝗲 𝗿𝗲𝗿𝗮𝗻𝗸𝗲𝗿𝘀. It includes reranking models, libraries, benchmarks, and integrations (plus some more useful resources). When we started working on reranking (and honestly, still today), the information was scattered across docs, papers, and blog posts. To save others some time, we pulled what we found into one place. Check out the link for the list below 🔗 👇
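If you just want to see what a reranker does, here's a minimal sketch with a cross-encoder (one common setup; the model name is illustrative, and the list covers many more options):

```python
from sentence_transformers import CrossEncoder

# Illustrative model choice; see the list for alternatives.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, docs: list[str], top_k: int = 5) -> list[tuple[str, float]]:
    """Score each (query, doc) pair jointly, then return the top_k docs."""
    scores = reranker.predict([(query, d) for d in docs])
    ranked = sorted(zip(docs, scores), key=lambda x: x[1], reverse=True)
    return [(d, float(s)) for d, s in ranked[:top_k]]
```

Unlike a bi-encoder, the cross-encoder reads the query and document together, which is why rerankers are slower but usually more accurate as a second stage.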
-
We benchmarked Cohere’s new Rerank 4 (Pro + Fast) against v3.5 and our top rerankers. Key takeaways:
– Pro jumped from the lower half of the stack to #2 overall (right behind zerank-2)
– It’s especially strong on business reports + finance Q&A
– Pro stays under 1s but is ~2× slower than zerank-2
– Pro improved everywhere; Fast regressed on argumentation + web search
Read the full breakdown in the post below.
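For anyone who wants to try these models, the call shape in Cohere's Python SDK looks roughly like this (a sketch, not our benchmark code; we show the v3.5 identifier here, so check Cohere's docs for the exact Rerank 4 model names):

```python
import cohere

co = cohere.ClientV2()  # assumes CO_API_KEY is set in the environment

docs = ["chunk one ...", "chunk two ...", "chunk three ..."]
resp = co.rerank(
    model="rerank-v3.5",  # swap in a Rerank 4 identifier from Cohere's docs
    query="What drove Q3 revenue growth?",
    documents=docs,
    top_n=2,
)
for r in resp.results:
    # Each result carries the original document index and a relevance score.
    print(r.index, round(r.relevance_score, 3))
```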
-
We plugged the newly dropped GPT-5.2 into our LLM RAG leaderboard next to GPT-5.1, Claude, Grok, Gemini, GLM, and the strongest open-source models. Here’s what stood out:
• ~70% fewer tokens per answer
• #1 on scientific claim verification
• much more stable performance across workloads
Full write-up and plots in the comments.