A StrongREJECT for Empty Jailbreaks
Alexandra Souly*, Qingyuan Lu*, Dillon Bowen*, Tu Trinh†, Elvis Hsieh†, Sana Pandey, Pieter Abbeel, Justin Svegliato, Scott Emmons, Olivia Watkins, Sam Toyer
NeurIPS Datasets and Benchmarks 2024
Code | Documentation | OpenAI o1 eval
Misc
I play tennis, basketball, table tennis, and poker.