
ICLR 2026 Acceptance Prediction: Benchmarking Decision Process with A Multi-Agent System


🔥 News

  • 2026.01.19 🌟 We’ve released ICLR 2026 predictions powered by the PaperDecision framework—check them out now!

👀 PaperDecision Overview

Academic peer review is central to research publishing, yet it remains difficult to model due to its subjectivity, dynamics, and multi-stage decision process. We introduce PaperDecision, an end-to-end framework for modeling and evaluating the peer review process with large language models (LLMs).

Our work distinguishes itself through the following key contributions:

  • An agent-based review system. We develop PaperDecision-Agent, a multi-agent framework that simulates authors, reviewers, and area chairs, enabling holistic and interpretable modeling of review dynamics.

  • A dynamic benchmark. We construct PaperDecision-Bench, a large-scale multimodal benchmark that links papers, reviews, rebuttals, and final decisions, and is continuously updated with newly released conference rounds to support forward-looking evaluation and avoid data leakage.

  • Empirical insights. We achieve up to ~82% accuracy in accept–reject prediction with frontier multimodal LLMs and identify key factors influencing acceptance outcomes, such as reviewer expertise and score changes.


🤖 PaperDecision-Agent

PaperDecision-Agent models key roles in the real-world peer review process through specialized agents. It simulates structured interactions among authors, reviewers, and area chairs to capture the full decision-making workflow.
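
For intuition, here is a minimal sketch of such a role-based review loop. All class, function, and prompt names below are illustrative assumptions, not the actual PaperDecision-Agent API; `llm` stands for any text-in/text-out model callable.

# Minimal sketch of a role-based review loop; names are assumptions,
# not the actual PaperDecision-Agent API.
from dataclasses import dataclass, field

@dataclass
class Paper:
    title: str
    abstract: str
    reviews: list = field(default_factory=list)
    rebuttal: str = ""

def reviewer_agent(paper, reviewer_id, llm):
    # Each reviewer agent drafts an independent review with a score.
    prompt = f"Review this paper and give a 1-10 score:\n{paper.title}\n{paper.abstract}"
    return {"reviewer": reviewer_id, "text": llm(prompt)}

def author_agent(paper, llm):
    # The author agent writes one rebuttal addressing all reviews.
    joined = "\n".join(r["text"] for r in paper.reviews)
    return llm(f"Write a rebuttal to these reviews:\n{joined}")

def area_chair_agent(paper, llm):
    # The area chair weighs reviews plus rebuttal and emits Accept/Reject.
    context = "\n".join(r["text"] for r in paper.reviews) + "\n" + paper.rebuttal
    return llm(f"Given these reviews and rebuttal, decide Accept or Reject:\n{context}")

def simulate_review(paper, llm, n_reviewers=3):
    paper.reviews = [reviewer_agent(paper, i, llm) for i in range(n_reviewers)]
    paper.rebuttal = author_agent(paper, llm)
    return area_chair_agent(paper, llm)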


📊 PaperDecision-Bench

PaperDecision-Bench is a dynamic, continually expanding evaluation framework explicitly aligned with the evolving ICLR peer review process, rather than a static dataset. By grounding evaluation in future decision prediction and cross-year extension, the benchmark is inherently resistant to benchmark-specific overfitting and better reflects real-world conference usage.

To balance accessibility and realism, PaperDecision-Bench adopts a three-tier evaluation design (a brief loader sketch follows the list):

  • B1: Future Prediction. Targets ICLR 2026 decision prediction, where models observe papers and reviews while final outcomes remain hidden, serving as a gold-standard test of cross-temporal generalization.

  • B2: Retrospective. Covers complete ICLR 2023–2025 data for robust retrospective evaluation, enabling reliable model comparison and systematic error analysis.

  • B3: MiniSet-1K. Provides a cost-efficient benchmark focusing on MLLM, 3D, and RL papers with ambiguous decision boundaries, supporting rapid iteration and analysis.
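
For concreteness, a hypothetical loader might expose the three tiers as follows; the dictionary schema and function name are our assumptions, not the repository's interface:

# Hypothetical tier selection for PaperDecision-Bench; schema is assumed.
TIERS = {
    "B1": {"years": [2026], "labels_hidden": True},               # future prediction
    "B2": {"years": [2023, 2024, 2025], "labels_hidden": False},  # retrospective
    "B3": {"subset": "MiniSet-1K", "labels_hidden": False},       # rapid iteration
}

def load_tier(name):
    if name not in TIERS:
        raise ValueError(f"Unknown tier {name!r}; expected one of {sorted(TIERS)}")
    return TIERS[name]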

All data in PaperDecision-Bench are sourced from OpenReview, and the benchmark will be continuously updated as new conference rounds are released.
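
Because the records are public on OpenReview, they can be pulled programmatically. The sketch below assumes the openreview-py v2 client and an ICLR 2025 venue id; verify both against the live API before relying on them:

# Sketch: fetching ICLR submissions plus their reviews via openreview-py (API v2).
# Requires `pip install openreview-py`; the venue id is an assumption to verify.
import openreview

client = openreview.api.OpenReviewClient(baseurl="https://api2.openreview.net")
submissions = client.get_all_notes(
    content={"venueid": "ICLR.cc/2025/Conference/Submission"},
    details="replies",  # pulls reviews, rebuttals, and decisions as replies
)
for note in submissions[:3]:
    print(note.content["title"]["value"], "| replies:", len(note.details["replies"]))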


🔮 Evaluation Pipeline

# Step 1: run the multi-agent review simulation
python multi_agent.py

# Step 2: compute decision-prediction metrics
python evaluation_metric.py

# Step 3: analyze the predictions and outcomes
python analysis.py
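
The script names suggest a simulate-score-analyze flow. As a minimal sketch of the scoring step, accept/reject accuracy could be computed like this (the predictions.json file name and record schema are our assumptions):

# Minimal accuracy sketch; assumes records like
# {"paper_id": "...", "pred": "Accept", "label": "Reject"}.
import json

with open("predictions.json") as f:
    records = json.load(f)

correct = sum(r["pred"] == r["label"] for r in records)
print(f"Accept/Reject accuracy: {correct / len(records):.2%} over {len(records)} papers")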

✒️ Citation

If you find our work helpful for your research, please consider citing it.

@misc{PaperDecision2026,
  author       = {Zhang, Yi-Fan and Dong, Yuhao and Zhang, Saining and Wu, Kai and Wang, Liang and Shan, Caifeng and Liu, Ziwei and He, Ran and Zhao, Hao and Fu, Chaoyou},
  title        = {ICLR 2026 Acceptance Prediction: Benchmarking Decision Process with a Multi-Agent System},
  howpublished = {\url{https://github.com/PaperDecision/PaperDecision}},
  year         = {2026},
  note         = {Accessed: 2026-01-18}
}
