Pull Request Reviewer

An automated pull request review

Inspiration

I drew inspiration for this project from the need to enhance the code review process within Bitbucket. I realized that traditional code reviews can be quite time-consuming, often making it challenging for developers to quickly grasp code changes, identify issues, and provide valuable suggestions. My goal was to streamline and improve the code review experience, making it more efficient and effective.

What it does

My project, the Pull Request Reviewer, is a Bitbucket Forge App designed to simplify and enhance the code review process. It automatically analyzes the pull request's diff and gives developers insights into the proposed changes. These insights include:

Explaining what the changes do
Scanning for bad code, vulnerabilities and secrets
Suggesting how to improve the code

How I built it

Using the Bitbucket Forge platform (EAP), I created an app that hooks into repositories and listens for new pull requests. When a new pull request is created, the app fetches the diff of the code changes through Bitbucket's REST API and splits it per file. For each file, the app asynchronously sends a request to OpenAI's ChatGPT REST API to help explain the changes. Because Forge apps have a maximum run time of 25 seconds, the app posts a comment for each file as soon as it is processed, so finished results are not lost if the time limit is reached. The comments are posted as replies to an initial comment by the app to keep them tidy.
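The per-file split could look roughly like the sketch below. It is not the app's actual code. It assumes the diff is a git-style unified diff in which every file section starts with a `diff --git a/<path> b/<path>` line, and the name `splitDiffByFile` is made up for illustration.

```javascript
// Split a git-style unified diff into one chunk per changed file.
// Assumption: each file section begins with a "diff --git " header line.
function splitDiffByFile(diffText) {
  const files = [];
  let current = null;
  for (const line of diffText.split("\n")) {
    if (line.startsWith("diff --git ")) {
      // Header shape: diff --git a/<old path> b/<new path>
      const match = /^diff --git a\/(.+) b\/(.+)$/.exec(line);
      current = { path: match ? match[2] : "(unknown)", lines: [] };
      files.push(current);
    }
    // Lines before the first header (if any) are ignored.
    if (current) current.lines.push(line);
  }
  return files.map((file) => ({ path: file.path, diff: file.lines.join("\n") }));
}
```

Each returned entry carries the file path and that file's part of the diff, which is the unit that would be sent to the model. Unusual paths (for example ones containing ` b/`) are not handled by this simple header match.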

After all files are processed, or the timeout is reached, the app updates the initial comment with the processing status so the developer knows how far the automated review got.
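One way to process the files concurrently while staying inside a time budget is sketched below. This is an illustration under assumptions, not the app's implementation: `reviewFile` is a hypothetical async function that returns the review text for one file, and the budget value is up to the caller (something below the 25-second limit, to leave room for updating the comment).

```javascript
const TIMED_OUT = Symbol("timed out");

// Review all files concurrently, but stop waiting once budgetMs has elapsed.
// reviewFile is a hypothetical async function: (file) => review text.
async function reviewWithinBudget(files, reviewFile, budgetMs) {
  const done = [];
  const failed = [];
  const tasks = files.map(async (file) => {
    try {
      const text = await reviewFile(file);
      done.push({ path: file.path, text });
    } catch (err) {
      failed.push(file.path);
    }
  });

  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve(TIMED_OUT), budgetMs);
  });
  const outcome = await Promise.race([Promise.all(tasks), deadline]);
  clearTimeout(timer);

  // Take snapshots, because slow tasks may still finish after the deadline.
  const doneNow = done.slice();
  const failedNow = failed.slice();
  const finished = new Set([...doneNow.map((d) => d.path), ...failedNow]);
  const pending = files.map((f) => f.path).filter((p) => !finished.has(p));
  return { done: doneNow, failed: failedNow, pending, timedOut: outcome === TIMED_OUT };
}

// Build the status text for the initial comment.
function statusLine(result, total) {
  const parts = [];
  if (result.failed.length > 0) parts.push(`${result.failed.length} failed`);
  if (result.pending.length > 0) {
    parts.push(`${result.pending.length} not finished before the time limit`);
  }
  const suffix = parts.length > 0 ? ` (${parts.join(", ")})` : "";
  return `Reviewed ${result.done.length} of ${total} files${suffix}.`;
}
```

The race between `Promise.all` and the timer means the status can always be written, even when some requests are still in flight.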

Challenges I ran into

While developing the Pull Request Reviewer, I encountered several challenges:

OpenAI

The max token length of OpenAI's models. Currently, I only have access to a context of 32k tokens in the ChatGPT3 model. This is not enough for a large pull request. I had to decide to process each file separately, which removes context for a thorough analysis.
The same input to ChatGPT can lead to different answers. This makes it harder to trust the quality of the answer. For example, sometimes it proposes changes to enhance the code, while other times it says the code cannot be improved further.
The answer of ChatGPT is not always formatted the same. Trying to have a "corporate identity" for the app is a hard problem. The direct markdown output of ChatGPT differs in quality and does not always look the same.
Large diffs lead to long processing times, which can cause the Forge app to time out.
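Some of these problems can be softened on the request side. The sketch below is an assumption-laden illustration, not the app's code: it builds a request body in the shape of OpenAI's Chat Completions API with `temperature` set to 0 and a system prompt that fixes the answer layout, and it uses a crude four-characters-per-token estimate to check whether a file's diff is likely to fit the context. The prompt text and the function names are made up. A temperature of 0 reduces run-to-run variation but does not guarantee identical answers.

```javascript
// Crude size estimate: roughly 4 characters per token for English text.
// This is a heuristic, not the tokenizer the model actually uses.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical prompt that asks for the same Markdown layout every time.
const SYSTEM_PROMPT = [
  "You review one file of a pull request diff.",
  "Answer in Markdown with exactly these headings, in this order:",
  "### Summary",
  "### Issues",
  "### Suggestions",
  "Write 'None' under a heading when there is nothing to report.",
].join("\n");

// Check whether prompt plus diff are likely to fit into maxTokens.
function fitsContext(file, maxTokens) {
  return estimateTokens(SYSTEM_PROMPT) + estimateTokens(file.diff) <= maxTokens;
}

// Build the JSON body for a chat completion request for one file.
function buildReviewRequest(file, model) {
  return {
    model,
    temperature: 0, // lowers, but does not remove, variation between runs
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: `File: ${file.path}\n\n${file.diff}` },
    ],
  };
}
```

Files for which `fitsContext` returns false could be skipped with a note in the comment instead of failing the whole run.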

Forge and Bitbucket

The time limit for processing makes it hard to process large diffs or diffs as a whole.
The format to decorate pull requests, e.g. with a card of my own, is not well documented, and I had to fall back to comments.
Listening on the avi:bitbucket:created:pullrequest event invokes the Forge app multiple times with the same ID. As an improvement, the app should store its previous runs to avoid running multiple times for the same commit.
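The improvement mentioned in the last point could be sketched as follows. Everything here is hypothetical: the key format, the function names, and the `store` interface (any object with async `get` and `set` methods). The in-memory store only stands in for a persistent key-value store, since memory would not survive between separate invocations.

```javascript
// Identify a run by pull request ID plus commit hash, so a new commit gets
// a fresh review while a repeated event for the same commit is skipped.
function runKey(pullRequestId, commitHash) {
  return `reviewed:${pullRequestId}:${commitHash}`;
}

// Return true only the first time a given pull request/commit pair is seen.
// store is any object with async get(key) and set(key, value) methods.
async function shouldReview(store, pullRequestId, commitHash) {
  const key = runKey(pullRequestId, commitHash);
  if (await store.get(key)) return false;
  await store.set(key, true);
  return true;
}

// In-memory stand-in for a persistent store, for illustration only.
function createMemoryStore() {
  const entries = new Map();
  return {
    get: async (key) => entries.get(key),
    set: async (key, value) => {
      entries.set(key, value);
    },
  };
}
```

The get-then-set sequence is not atomic, so two invocations arriving at the same moment could both pass the check; it reduces duplicate runs rather than ruling them out.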

I'm proud of the following accomplishments:

Successfully creating a tool that significantly enhances the code review process within Bitbucket, improving developer productivity
Achieving a good level of code analysis accuracy, resulting in valuable and actionable suggestions
Seamlessly integrating my tool with Bitbucket, making it easy for teams to adopt and use; the app mostly acts like a normal developer
Receiving positive feedback from early testers at my company and witnessing a measurable improvement in review efficiency

What I learned

I strengthened my skills with REST APIs and JavaScript. This hackathon also showed me the vast possibilities AI can provide in the near future, and it helped me understand the limitations of current large language models.

What's next for Pull Request Reviewer

Besides winning this Hackathon ;-) , I have some plans to further improve this app:

Give the possibility to parse the whole pull request as one and summarize the findings in a single comment or card
Improve the styling of the comment to have a better experience
Have some kind of cache/database in the backend to avoid running multiple times (e.g. on pull request updates without code changes)
Continuously gather user feedback and refine the tool based on real-world usage

Updates

Thomas Willems posted an update — Oct 24, 2023 06:23 PM EDT

I changed the app to update a single comment instead of posting one comment per analyzed file. This makes the implementation a bit cleaner.

I also experimented with different OpenAI models and made them configurable through an environment variable. You should note that the newer and larger models take more time to process a single diff.
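Reading the model from an environment variable could look like the sketch below. The variable name `OPENAI_MODEL`, the default value, and the function name are assumptions for illustration; the sketch also assumes the variables are exposed through `process.env`, as in Node.js.

```javascript
// Assumed default; used when the variable is unset or blank.
const DEFAULT_MODEL = "gpt-3.5-turbo";

// Pick the model name from the environment (OPENAI_MODEL is a made-up name).
function selectModel(env = process.env) {
  const value = (env.OPENAI_MODEL || "").trim();
  return value.length > 0 ? value : DEFAULT_MODEL;
}
```

Passing the environment in as a parameter keeps the function easy to test, and falling back to a small default model keeps processing time low unless a larger model is chosen on purpose.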

Thomas Willems started this project — Oct 24, 2023 06:20 PM EDT
