GitHub - AgentAlphaAGI/Idea2Paper: Idea2Paper Official Demo

📌 Table of Contents

📄 Idea2Paper
💬 User Community
✨ Key Features
📦 Outputs
🚀 Getting Started
🤖 Anchored Multi-Agent Review
📚 Files & Docs
🤝 Contributing & License
🙏 Credits
👥 Contributors
📑 Citation (Idea2Story)

📄 Idea2Paper

Idea2Paper is an end-to-end research agent framework that aims to systematically define and analyze the major stages of the contemporary research process, along with the core challenges inherent to each stage. Rather than treating paper writing as a monolithic generation problem, Idea2Paper explicitly decomposes scientific research into structured phases and identifies critical bottlenecks that hinder the transformation of raw ideas into coherent, submission-ready academic narratives. Through this analysis, Idea2Paper highlights that one of the most fundamental yet underexplored challenges lies in research paradigm generation—the process of converting an underspecified research idea into a logically consistent, academically grounded research story. Existing systems often struggle to produce stable and reusable research paradigms, especially when reasoning is performed entirely at runtime and under limited contextual grounding.

To address these challenges in a principled and engineering-oriented manner, Idea2Paper adopts a modular system design. Instead of immediately building a fully end-to-end writing system, the project prioritizes the construction of targeted engineering submodules that tackle specific bottlenecks in the research pipeline. As the first and core engineering submodule, Idea2Story is introduced to directly address the problem of research paradigm generation. Idea2Story focuses on transforming underspecified research ideas into complete, coherent, and submission-ready scientific narrative skeletons. By providing a structured research story as an intermediate representation, Idea2Story establishes a stable foundation for downstream stages such as method development, experiment design, and paper writing.

Idea2Paper: https://www.researchgate.net/publication/400280248_Idea2Paper_What_Should_an_End-to-End_Research_Agent_Really_Do

Idea2Story (Core Submodule of Idea2Paper)

Idea2Story introduces a pre-computation–driven framework that shifts literature understanding from runtime reasoning to offline knowledge graph construction, enabling more efficient and reliable autonomous scientific discovery.

Idea2Story: https://arxiv.org/abs/2601.20833

🧠 Core Philosophy

Knowledge-Driven: Uses ICLR data to build a comprehensive knowledge graph.
Auditable Review: Implements an anchored multi-agent review system for objective feedback.
Automated Refinement: Includes RAG deduplication and intelligent revision to enhance novelty.

Idea2Story pipeline architecture (a core module within Idea2Paper)

💬 User Community

WeChat Group	Discord Channel
	https://discord.gg/FfXtbREb

✨ Key Features

🕸️ Knowledge Graph: Built from ICLR data with Idea/Pattern/Domain/Paper nodes.
🎣 Advanced Retrieval: Three-path retrieval (Idea/Domain/Paper) with two-stage ranking (Jaccard + Embedding).
📝 Idea2Story Generation: From pattern selection to story generation, anchored review, and smart correction.
🤖 Anchored Multi-Agent Review: Uses real review statistics as anchors for relative comparisons, producing deterministic and auditable 1-10 scores.
📊 Comprehensive Logging: Per-run structured logs for full reproducibility and auditing.
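
The two-stage ranking named under Advanced Retrieval (a Jaccard pass followed by an embedding pass) can be sketched roughly as below. This is only an illustration under stated assumptions, not the project's implementation: the function names (`jaccard`, `cosine`, `two_stage_rank`), the whitespace tokenizer, the cut-off sizes, and the `toy_embed` stand-in for a real embedding model are all hypothetical.

```python
import math
from typing import Callable, List, Sequence


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def two_stage_rank(
    query: str,
    candidates: List[str],
    embed: Callable[[str], Sequence[float]],
    coarse_k: int = 3,
    final_k: int = 2,
) -> List[str]:
    # Stage 1: a cheap lexical filter keeps the top coarse_k by Jaccard.
    coarse = sorted(candidates, key=lambda c: jaccard(query, c), reverse=True)[:coarse_k]
    # Stage 2: re-rank the survivors by embedding cosine similarity.
    q = embed(query)
    return sorted(coarse, key=lambda c: cosine(q, embed(c)), reverse=True)[:final_k]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: a letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec
```

The point of the split is cost: the lexical pass needs no model call, so the more expensive embedding comparison only runs on a short list.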

📦 Outputs

📄 Paper-KG-Pipeline/output/final_story.json: Final structured Story (title/abstract/problem/method/contribs/experiments).
🔍 Paper-KG-Pipeline/output/pipeline_result.json: Full pipeline trace (reviews, corrections, audits).
📂 log/run_.../: Structured logs for every run.

🚀 Getting Started

Prerequisites

Python 3.10+

Installation

pip install -r Paper-KG-Pipeline/requirements.txt

Note: The embedding model is configurable via EMBEDDING_MODEL / EMBEDDING_API_URL (env or i2p_config.json).
Constraint: the embedding dimension must match your index; if you switch models, rebuild the novelty/recall indexes or use model-specific index directories to avoid a mismatch.
Recommended (auto_profile): set I2P_INDEX_DIR_MODE=auto_profile to auto-map each embedding model to its own index dirs: Paper-KG-Pipeline/output/novelty_index__{model} and .../recall_index__{model}.
Explicit I2P_NOVELTY_INDEX_DIR / I2P_RECALL_INDEX_DIR (env or i2p_config.json) override auto_profile.
Tip (speed/stability): set I2P_ANCHOR_DENSIFY_ENABLE=0 to skip Adaptive Densify; otherwise Phase 3 Critic can be much slower and may fail due to strict JSON validation.
Tip (debug): if you repeatedly hit Critic JSON errors, set I2P_CRITIC_STRICT_JSON=0 (or critic.strict_json=false) to disable strict mode and allow fallback.
Tip (LLM temperature): per-stage temperatures are configurable via I2P_LLM_TEMPERATURE_* or llm.temperature.*; defaults preserve current behavior. The Critic usually runs at a low temperature for stability, while story generation can use a moderate one.
Tip (Idea Packaging): optional quality boost via pattern-guided idea packaging + double recall (default off). Enable with I2P_IDEA_PACKAGING_ENABLE=1 or idea.packaging_enable=true.
Tip (Subdomain taxonomy): optional quality boost for Path2 to reduce duplicated/long-tail subdomains. When enabled, the pipeline auto-detects and (if I2P_INDEX_ALLOW_BUILD=1) auto-builds subdomain_taxonomy.json under recall_index_dir (recommended: leave I2P_SUBDOMAIN_TAXONOMY_PATH empty). First build uses batched embeddings; you can also build manually via Paper-KG-Pipeline/scripts/tools/build_subdomain_taxonomy.py.
Supported (no code changes): OpenAI-compatible Embeddings APIs (/v1/embeddings) that accept input as a string or a list.
Not supported yet: DashScope “native” embeddings endpoint (/api/v1/services/embeddings/...) requires an adapter.
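
The precedence described in the notes above (explicit index directories override auto_profile, which maps each embedding model to its own directories) could look roughly like the following. This is a sketch under assumptions, not the pipeline's actual code: `resolve_index_dirs` is a hypothetical helper, the rule used to turn a model name into a folder suffix is a guess, and it returns `None` when neither rule applies because the project's default is not stated here.

```python
import re
from pathlib import Path
from typing import Dict, Optional

OUTPUT_DIR = Path("Paper-KG-Pipeline/output")


def resolve_index_dirs(env: Dict[str, str]) -> Dict[str, Optional[Path]]:
    """Resolve novelty/recall index dirs from an environment-like mapping."""
    model = env.get("EMBEDDING_MODEL", "")
    # Assumption: make the model name filesystem-safe; the real rule may differ.
    safe_model = re.sub(r"[^A-Za-z0-9._-]+", "_", model)
    auto = env.get("I2P_INDEX_DIR_MODE") == "auto_profile" and bool(safe_model)

    resolved: Dict[str, Optional[Path]] = {}
    for kind, var in (
        ("novelty", "I2P_NOVELTY_INDEX_DIR"),
        ("recall", "I2P_RECALL_INDEX_DIR"),
    ):
        explicit = env.get(var)
        if explicit:
            # An explicit setting overrides auto_profile.
            resolved[kind] = Path(explicit)
        elif auto:
            resolved[kind] = OUTPUT_DIR / f"{kind}_index__{safe_model}"
        else:
            # The project's default is not stated here, so leave it unresolved.
            resolved[kind] = None
    return resolved
```

Calling it with `dict(os.environ)` would show which directories a given shell configuration points at before a run.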

Dataset

👉 DATA

To use the prebuilt local index, download the two folders in paper-embedding from Hugging Face and place them in Paper-KG-Pipeline/output:

Paper-KG-Pipeline/
└── output/
    ├── recall_index__{model}/
    └── novelty_index__{model}/

Make sure the embedding model matches the index you downloaded; otherwise errors may occur.

Migration note (auto_profile naming change): if you previously used provider/urlhash-based dirs, you can either (A) rename the old folders to recall_index__{model} / novelty_index__{model}, or (B) keep old folder names and set I2P_RECALL_INDEX_DIR / I2P_NOVELTY_INDEX_DIR explicitly to those paths.

Configuration

Copy .env.example to .env and fill in LLM_API_KEY (and optionally LLM_PROVIDER, LLM_BASE_URL).
(Optional) Copy i2p_config.example.json to i2p_config.json to tweak settings.

Usage

python Paper-KG-Pipeline/scripts/idea2story_pipeline.py "your research idea"

🌐 Frontend (Local Web UI)

Status: The frontend is currently unstable. We recommend running the pipeline from the terminal for now. We will improve the frontend in future updates.

Run a minimal local UI to launch the pipeline and view only the high-level stage and the final results (no raw logs on screen).

Start

python frontend/server/app.py --host 127.0.0.1 --port 8080

Open in your browser:

http://127.0.0.1:8080/

What you can do in the UI

Run the same pipeline entrypoint (idea2story_pipeline.py) from a web page.
Configure LLM_API_KEY, LLM_PROVIDER, LLM_BASE_URL/LLM_API_URL, LLM_MODEL for the current run (not persisted by the server).
Toggle Novelty / Verification.
Download the current run logs as a zip.

For more details, see frontend/README.md.

Output

output/
├── final_story.json # Final generated paper story
├── pipeline_result.json # Full pipeline results
└── log.json # Detailed logs

Check final_story.json for the result and pipeline_result.json for the full process.
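
A small helper can make the result easier to skim from the terminal. The sketch below assumes only that final_story.json holds a JSON object at the top level; it does not assume specific key names, and `summarize_story` is a hypothetical helper rather than part of the project.

```python
import json
from pathlib import Path


def summarize_story(path, max_chars=80):
    """Return one 'key: preview' line per top-level field of a story JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    lines = []
    for key, value in data.items():
        # Strings are shown as-is; other values are rendered as compact JSON.
        text = value if isinstance(value, str) else json.dumps(value, ensure_ascii=False)
        preview = text if len(text) <= max_chars else text[: max_chars - 3] + "..."
        lines.append(f"{key}: {preview}")
    return lines


if __name__ == "__main__":
    story_path = Path("Paper-KG-Pipeline/output/final_story.json")
    if story_path.exists():
        print("\n".join(summarize_story(story_path)))
    else:
        print(f"{story_path} not found; run the pipeline first")
```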

🤖 Anchored Multi‑Agent Review

Instead of arbitrary scores, this project uses anchored comparisons. We select anchor papers with known scores, ask LLMs to compare your target against these anchors (better/tie/worse), and then deterministically fit a final numeric score. This ensures the review process is auditable and grounded in real-world data.
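
To make the idea of deterministically fitting a score from better/tie/worse comparisons concrete, here is one possible sketch. It is an assumption, not the project's method (see MULTIAGENT_REVIEW.md for that): it uses a logistic, Bradley-Terry-style model with a fixed `scale` and a grid search over the 1-10 range, and `fit_score` and `OUTCOME_TARGET` are hypothetical names.

```python
import math

# How the target compared to an anchor, as a target "win" value.
OUTCOME_TARGET = {"better": 1.0, "tie": 0.5, "worse": 0.0}


def fit_score(comparisons, scale=1.0, step=0.1):
    """Fit a 1-10 score from (anchor_score, outcome) pairs by grid search.

    Each outcome says how the target compares to that anchor:
    "better", "tie", or "worse".
    """
    best_score, best_loss = 1.0, float("inf")
    n_steps = int(round(9 / step))
    for i in range(n_steps + 1):
        s = 1.0 + i * step
        loss = 0.0
        for anchor_score, outcome in comparisons:
            # Modeled probability that the target beats this anchor.
            p = 1.0 / (1.0 + math.exp(-(s - anchor_score) / scale))
            p = min(max(p, 1e-9), 1 - 1e-9)
            t = OUTCOME_TARGET[outcome]
            # Cross-entropy between the observed outcome and the model.
            loss -= t * math.log(p) + (1 - t) * math.log(1 - p)
        # Strict improvement only, so ties resolve to the lowest score.
        if loss < best_loss - 1e-12:
            best_score, best_loss = s, loss
    return round(best_score, 2)
```

Because the search is a fixed grid with no randomness, the same comparisons always yield the same score, which is what makes a fit like this auditable: given the anchors and the recorded outcomes, anyone can recompute the number.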

📚 Files & Docs

Core Code: Paper-KG-Pipeline/src/idea2paper/
Documentation:

No.	Document	Content	Target Audience
0	Project Overview	Overall architecture, core modules, parameter configuration, execution workflow	Everyone
1	Knowledge Graph Construction	Data sources, node/edge definitions, LLM enhancement, how to run	Developers
2	Retrieval System	Three-way retrieval strategies, similarity computation, performance optimization	Developers
3	Idea2Story Pipeline	Pattern selection, Idea fusion, story reflection, critic review	Developers

Review Details: MULTIAGENT_REVIEW.md

🤝 Contributing & License

We welcome PRs and Issues! Please follow the contribution guidelines. Licensed under the MIT License.

🙏 Credits

Data Source: ICLR (see KG construction docs)
Inspiration: Auditable, anchor-centered review processes.
Community Support: agentAlpha Community

👥 Contributors

📑 Citation (Idea2Story)

If you find Idea2Story useful, please cite:

@misc{xu2026idea2storyautomatedpipelinetransforming,
  title={Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives},
  author={Tengyue Xu and Zhuoyang Qian and Gaoge Liu and Li Ling and Zhentao Zhang and Biao Wu and Shuo Zhang and Ke Lu and Wei Shi and Ziqi Wang and Zheng Feng and Yan Luo and Shu Xu and Yongjin Chen and Zhibo Feng and Zhuo Chen and Bruce Yuan and Harry Wang and Kris Chen},
  year={2026},
  eprint={2601.20833},
  archivePrefix={arXiv},
  primaryClass={cs.CE},
  url={https://arxiv.org/abs/2601.20833}
}

@article{xu2026idea2paper,
  title={Idea2Paper: What Should an End-to-End Research Agent Really Do?},
  author={Xu, Tengyue and Qian, Zhuoyang and Liu, Gaoge and Zhang, Zhentao and Ling, Li and Wu, Biao and Zhang, Shuo and Lu, Ke and Shi, Wei and Wang, Ziqi and others},
  year={2026}
}
