We've all been there: You push your code, mark the ticket as "In Progress," and switch context to a new task. Ten minutes later—Boom. The pipeline fails.
Maybe it's a missed semicolon, a linting error, or a failed unit test. You have to stop what you're doing, context-switch back, read the logs, push a tiny fix, and wait again. It kills momentum.
We thought: "What if GitLab could fix itself?" With the rise of Agentic AI, we realized we could build a "medic" that lives in the pipeline, treating code injuries instantly.
## 🤖 What it does

PR Medic is an autonomous AI Agent that monitors your Merge Requests.
- **Listens:** It watches for `pipeline_failed` webhook events.
- **Diagnoses:** It instantly fetches the raw job logs from the failed CI job.
- **Prescribes:** Using an LLM (OpenAI GPT-4o or local Llama3 via Ollama), it analyzes the error and identifies the root cause.
- **Operates:** It generates a precise `git diff` patch and commits it directly to your feature branch with a descriptive message (e.g., `chore(fix): resolve syntax error in validate.js`).

## ⚙️ How we built it

We built PR Medic using a modern, lightweight stack designed for speed and reliability:
- **Core Engine:** Built with Node.js to handle webhook events and orchestration.
- **Brain:** We integrated the OpenAI API (with support for local Ollama models) to act as the reasoning engine. We engineered prompts to force the LLM to output valid Git patches, not just conversational advice.
- **Integration:** We simulated the GitLab Duo Agent workflow, using mock payloads to rigorously test the "Fetch Logs -> Analyze -> Fix" loop locally.
- **Testing:** We created a simulation harness to feed fake pipeline failures (like syntax errors) and verify that the agent generates the correct fix every time.

## 🚧 Challenges we ran into

- **Hallucination in patches:** Early on, the AI would suggest pseudo-code fixes or comments like "remove this line." We had to refine the system prompt to enforce strictly formatted `git diff` output so the patch could be applied programmatically.
- **Log noise:** CI logs are huge, and sending an entire log to an LLM context window is expensive and slow. We implemented a smart heuristic to extract only the last 50 lines, or specific "Error" blocks, before sending them to the brain.

## 🏆 Accomplishments that we're proud of

- **Autonomous loop:** Seeing the agent successfully detect a syntax error and "push" a valid commit without human intervention was a magic moment.
- **Hybrid AI support:** Making it work with both top-tier cloud models (GPT-4) and local open-source models (Llama3) means it can run air-gapped if needed!

## 🎓 What we learned

Building "agentic" workflows is very different from building chatbots. The AI needs to be reliable, structured, and action-oriented. We learned a lot about prompt engineering for function calling and structured JSON outputs.
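The triage steps described above can be sketched in a few small helpers. This is a minimal illustration, not the actual PR Medic source: `shouldTriage`, `extractRelevantLog`, and `looksLikeUnifiedDiff` are hypothetical names, though the payload shape follows GitLab's standard pipeline webhook (`object_kind: "pipeline"` with an `object_attributes.status` field).

```javascript
// Illustrative triage helpers for a pipeline-fixing agent (hypothetical names).

// Listen: GitLab pipeline webhooks carry object_kind = "pipeline" and a
// status field; only wake the agent for failed pipelines.
function shouldTriage(event) {
  return Boolean(
    event.object_kind === "pipeline" &&
      event.object_attributes &&
      event.object_attributes.status === "failed"
  );
}

// Diagnose: shrink a huge CI log before it reaches the LLM context window.
// Prefer explicit "Error" lines when there are few enough; otherwise fall
// back to the last `maxLines` lines of the log.
function extractRelevantLog(log, maxLines = 50) {
  const lines = log.split("\n");
  const errorLines = lines.filter((l) => /\berror\b/i.test(l));
  if (errorLines.length > 0 && errorLines.length <= maxLines) {
    return errorLines.join("\n");
  }
  return lines.slice(-maxLines).join("\n");
}

// Prescribe: reject conversational answers so the patch can be applied
// programmatically (e.g., with `git apply`) instead of read by a human.
function looksLikeUnifiedDiff(text) {
  const t = text.trim();
  return t.startsWith("diff --git") || t.startsWith("--- ");
}
```

In this shape, an output that fails `looksLikeUnifiedDiff` can simply be sent back to the model with a stricter reminder, which is one practical way to tame the "pseudo-code fix" hallucinations described above.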
## 🔮 What's next for PR Medic

- **GitLab Agent Server (KAS) integration:** Moving from a webhook bot to a fully native GitLab Agent running inside the cluster.
- **Human-in-the-loop:** Adding a mode where the agent comments on the MR with the suggested fix and waits for a "thumbs up" before committing.
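The human-in-the-loop mode is still a roadmap item, but one way it could look is sketched below, assuming GitLab's REST notes endpoint (`POST /api/v4/projects/:id/merge_requests/:iid/notes`); `formatSuggestionComment` and `postSuggestion` are hypothetical helpers, not existing PR Medic code.

```javascript
// Hypothetical sketch: instead of committing directly, the agent posts the
// suggested patch as an MR comment and applies it only after approval.

const FENCE = "`".repeat(3); // triple backtick, built indirectly to keep this sketch readable

// Render the suggested fix as a markdown comment with the diff fenced.
function formatSuggestionComment(summary, diff) {
  return [
    "🩺 **PR Medic suggests a fix:**",
    "",
    summary,
    "",
    FENCE + "diff",
    diff,
    FENCE,
    "",
    "React with 👍 to apply this patch.",
  ].join("\n");
}

// Post the comment via GitLab's merge request notes API (Node 18+ global fetch).
async function postSuggestion(baseUrl, projectId, mrIid, token, body) {
  const url = `${baseUrl}/api/v4/projects/${projectId}/merge_requests/${mrIid}/notes`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "PRIVATE-TOKEN": token, "Content-Type": "application/json" },
    body: JSON.stringify({ body }),
  });
  if (!res.ok) throw new Error(`GitLab API error: ${res.status}`);
  return res.json();
}
```

The agent would then watch for an award-emoji event on that note before running its existing commit step.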