Excited to release AbstentionBench -- our paper and benchmark on evaluating LLMs’ *abstention*: the skill of knowing when NOT to answer!
Key finding: reasoning LLMs struggle with unanswerable questions and hallucinate!
Details and links to paper & open source code below!
🧵1/9