Project Overview: Deep Dive Duel
Inspiration
Our project was born from the Wikipedia Game, a classic digital challenge where players race to find a path between two completely unrelated topics using only internal links. We were inspired to transform this human pastime into a high-stakes AI Benchmarking Arena. We wanted to see how different "brains" (LLMs) and "strategies" (algorithms) could navigate the vast, interconnected graph of the web when the connection is not obvious. Seeing the logic play out visually became our core mission.
The Algorithms: BFS vs. DFS
At the heart of our arena is the comparison between two fundamental computer science traversals, each given a distinct AI persona:
- Breadth-First Search (BFS), "The Spider": This agent scans all sibling links at the current depth before moving deeper. It is designed to find the shortest path, expanding outward like a spider web from the starting point.
- Depth-First Search (DFS), "The Deep Diver": This agent follows a single thread as deep as possible, only backtracking when it hits a dead end or reaches its maximum depth limit. Visually, this creates long, thin tendrils stretching across the map as it pursues a single line of reasoning.
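The two personas can be sketched side by side on a toy link graph. The pages and links below are purely illustrative stand-ins for scraped Wikipedia data, not part of our actual codebase:

```python
from collections import deque

# Hypothetical link graph: page -> outgoing links (illustrative only).
links = {
    "Banana": ["Fruit", "Potassium"],
    "Fruit": ["Plant", "Food"],
    "Potassium": ["Chemistry"],
    "Plant": ["Biology"],
    "Food": ["Nutrition"],
    "Chemistry": ["Science"],
    "Biology": ["Science"],
    "Nutrition": ["Science"],
}

def bfs(start, target):
    """'The Spider': scan all links at one depth before going deeper."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()          # FIFO: oldest (shallowest) path first
        if path[-1] == target:
            return path                 # first hit is a shortest path
        for nxt in links.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def dfs(start, target, max_depth=5):
    """'The Deep Diver': follow one thread, backtrack at dead ends."""
    stack, seen = [[start]], {start}
    while stack:
        path = stack.pop()              # LIFO: newest (deepest) path first
        if path[-1] == target:
            return path
        if len(path) <= max_depth:      # stop diving past the depth limit
            for nxt in reversed(links.get(path[-1], [])):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(path + [nxt])
    return None

print(bfs("Banana", "Science"))  # ['Banana', 'Potassium', 'Chemistry', 'Science']
print(dfs("Banana", "Science"))  # ['Banana', 'Fruit', 'Plant', 'Biology', 'Science']
```

Note the contrast: BFS finds the shorter Potassium route, while DFS commits to the first thread (Fruit) and dives until it reaches the target.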
Challenges Faced
Building an autonomous navigator presented several technical and conceptual hurdles:
Creating Proper Algorithms: One of our biggest technical hurdles was adapting standard graph traversal algorithms to an LLM context. We found that a naive Depth-First Search (DFS) suffered from severe semantic tunnel vision: the agent would confidently commit to a sub-optimal path and hit our depth limit almost immediately. To address this, we engineered a custom "Soft DFS" algorithm with a backtracking stack. Instead of making a single greedy choice, we prompted the LLM to rank its top three candidates and pushed them onto the stack in reverse order. This let the agent pursue the strongest path first while retaining fallback options if it reached a dead end. We also overhauled our scraping logic to preserve link order, ensuring that important introductory links were not trimmed from the context window before the LLM could evaluate them.
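The Soft DFS idea can be sketched roughly as follows. Here `rank_links` is a placeholder for the real LLM ranking prompt (it fakes relevance by word overlap with the target), and `get_links` stands in for our scraper; both names and the toy graph are illustrative assumptions, not our actual implementation:

```python
def rank_links(links, target):
    """Placeholder for the LLM ranking call: return up to the top three
    candidate links, best first. Here we fake relevance by counting
    words shared with the target page title."""
    target_words = set(target.lower().split())
    scored = sorted(
        links,
        key=lambda link: len(target_words & set(link.lower().split())),
        reverse=True,
    )
    return scored[:3]

def soft_dfs(start, target, get_links, max_depth=6):
    """DFS with a backtracking stack of ranked candidates."""
    stack = [[start]]
    visited = {start}
    while stack:
        path = stack.pop()
        page = path[-1]
        if page == target:
            return path
        if len(path) > max_depth:
            continue  # depth limit hit: backtrack to the next candidate
        candidates = rank_links(get_links(page), target)
        # Push in reverse so the best-ranked link is popped first,
        # keeping the weaker picks on the stack as fallbacks.
        for link in reversed(candidates):
            if link not in visited:
                visited.add(link)
                stack.append(path + [link])
    return None

# Toy usage on a hypothetical link graph.
graph = {
    "Moon": ["Apollo Program", "Tide", "Space Exploration"],
    "Apollo Program": ["NASA"],
    "Space Exploration": ["NASA", "Rocket"],
    "Tide": ["Ocean"],
    "NASA": ["Mars Rover"],
    "Rocket": [],
    "Ocean": [],
    "Mars Rover": [],
}
print(soft_dfs("Moon", "Mars Rover", lambda p: graph.get(p, [])))
```

The key design choice is that a dead end or depth-limit hit costs nothing extra: backtracking is just popping the next-best candidate the LLM already ranked, rather than re-prompting from scratch.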
Selecting the Correct LLM: We found that not all large language models are equally effective at reasoning through complex link structures. Some models relied too heavily on surface-level similarity rather than true contextual relevance, which led to inefficient or misleading navigation. Through experimentation, we identified models with stronger contextual understanding and decision-making abilities that were better suited for this task.
What We Learned
Through this project, we learned that strong algorithms matter just as much as strong models. Even highly capable LLMs struggle without a clear traversal strategy. We also gained experience designing effective visualizations, using force-directed graphs to make search behavior and decision paths easy to understand. Overall, the project helped us connect core computer science concepts with practical, interactive AI systems.
Built With
- fastapi
- forcegraph2d
- javascript
- langgraph
- openrouter
- python
- react
- tailwind