ValidO | Devpost

Description

Our project is an ML-based system that evaluates the quality and relevancy of Google Local Reviews. It automatically detects low-quality reviews (advertisements, irrelevant content, and rants) while preserving valid, informative ones. This helps keep location-based reviews trustworthy and reliable, since they are often cluttered with spam or misleading content.

The system works by:

Classifying reviews into four categories: Ad, Rant, Irrelevant, and Valid.
Engineering features that capture useful signals, such as review length, all-caps ratio (to detect rants), and a heuristic relevancy score (longer reviews with higher sentiment and a lower caps ratio are scored as more trustworthy).
Labeling data with a combination of Qwen few-shot prompting (1,000 reviews) and manual hand-labeling (200 reviews) to improve data quality.
Extracting emotion features with a RoBERTa model that we trained on the GoEmotions dataset.
Training a logistic regression model to classify reviews into the four categories.
Evaluating performance with precision, recall, and F1-score to measure how well the system identifies different types of low-quality reviews.
This solution helps platforms enforce content policies and provides more reliable information for users making location-based decisions.
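
The hand-crafted features listed above can be sketched in plain Python. This is a minimal illustration, not our exact implementation: the function names, the equal weighting of the three terms, the 50-word length cap, and the assumption that sentiment arrives as a number in [-1, 1] are all hypothetical choices made for the example.

```python
def review_length(text: str) -> int:
    """Number of whitespace-separated tokens in the review."""
    return len(text.split())


def caps_ratio(text: str) -> float:
    """Share of alphabetic characters that are uppercase (0.0 if there are none)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


def relevancy_score(text: str, sentiment: float, max_words: int = 50) -> float:
    """Heuristic score in [0, 1]: longer, more positive, less shouty is higher.

    Assumptions (not from the original project): equal weights, a length cap
    of `max_words`, and `sentiment` supplied by the caller in [-1, 1].
    """
    length_term = min(review_length(text), max_words) / max_words
    sentiment_term = (max(-1.0, min(1.0, sentiment)) + 1.0) / 2.0
    calm_term = 1.0 - caps_ratio(text)
    return (length_term + sentiment_term + calm_term) / 3.0


if __name__ == "__main__":
    calm = "The staff were friendly and the food arrived quickly"
    shout = "WORST PLACE EVER"
    print(relevancy_score(calm, sentiment=0.8))
    print(relevancy_score(shout, sentiment=-0.9))
```

In a full pipeline, values like these would sit alongside the emotion features as inputs to the classifier.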

Tools Used

Development tools: Jupyter Notebook, Google Colab, VSCode
APIs: Qwen LLM via Hugging Face
Libraries and frameworks: scikit-learn, SciPy, spaCy, pandas, Matplotlib, seaborn, Hugging Face Transformers
Datasets: Google Local Reviews dataset, GoEmotions dataset

Relevance

This project demonstrates how machine learning can improve trust in review systems by filtering out misleading or low-quality content. It shows that even with simple features like review length, sentiment, and capitalization ratios, we can capture strong signals of review quality. Our approach highlights both the potential and limitations of lightweight models, and sets the stage for future improvements using larger pre-trained models and richer metadata.
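
The evaluation step from the Description (precision, recall, and F1-score for each category) can be sketched in plain Python, which makes it easier to see where a lightweight model falls short on individual classes. The helper name `per_class_metrics` and the toy labels in the usage lines are hypothetical; in practice a library routine such as scikit-learn's `classification_report` would likely be used instead.

```python
from collections import Counter

LABELS = ("Ad", "Rant", "Irrelevant", "Valid")


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return {label: {"precision", "recall", "f1"}} for a multi-class task."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed an example of the true class

    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # Undefined ratios (no predictions or no examples) are reported as 0.0.
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


if __name__ == "__main__":
    # Toy labels for illustration only, not results from the project.
    y_true = ["Ad", "Ad", "Rant", "Valid", "Valid", "Irrelevant"]
    y_pred = ["Ad", "Valid", "Rant", "Valid", "Valid", "Valid"]
    for label, scores in per_class_metrics(y_true, y_pred).items():
        print(label, scores)
```

Reporting the scores per class rather than as one average shows which low-quality categories the classifier tends to confuse with Valid.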