TikCheck | Devpost

Gallery images: Machine Learning Pipeline, Evaluation of Ensemble Model, Streamlit Demo

Inspiration

Online reviews guide everyday decisions, but they’re often cluttered with spam, ads, irrelevant posts, or rants. Unlike platforms with heavy filtering, TikTok’s review space is especially vulnerable to this noise. We built TikCheck to cut through the clutter and surface only the reviews that truly matter — making recommendations more trustworthy and transparent.

What it does

TikCheck is a review classification pipeline that automatically organizes reviews into five categories:

Relevant → genuine customer feedback
Irrelevant → emoji-only, off-topic, or vague comments
Advertisement → overt promotional content
Spam → repetitive or link-heavy posts
Rant → venting not tied to a real experience

The system blends rule-based detection, a deep neural network, and an ensemble layer to deliver both precision and adaptability. Users instantly see a clean snapshot of reviews: relevant ones are highlighted, while spam, ads, and rants are filtered and clearly labelled.
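As a rough illustration of the five categories and the "snapshot" view described above, here is a minimal sketch. The names `Category` and `snapshot` are hypothetical and are not taken from the TikCheck codebase; the sketch only assumes that each review has already been assigned one of the five labels.

```python
from collections import defaultdict
from enum import Enum


class Category(Enum):
    """The five review categories named in the write-up."""
    RELEVANT = "relevant"
    IRRELEVANT = "irrelevant"
    ADVERTISEMENT = "advertisement"
    SPAM = "spam"
    RANT = "rant"


def snapshot(classified):
    """Split (text, Category) pairs into highlighted and filtered groups.

    Relevant reviews are highlighted; everything else is grouped under
    its category name so it can be shown with a clear label.
    """
    highlighted = []
    filtered = defaultdict(list)
    for text, category in classified:
        if category is Category.RELEVANT:
            highlighted.append(text)
        else:
            filtered[category.value].append(text)
    return {"highlighted": highlighted, "filtered": dict(filtered)}
```

Keeping the category on every filtered review, rather than dropping it, is what lets a UI show why a review was hidden.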

How we built it

Defining policies
- We mapped TikTok’s review noise into five clear categories, ensuring that each type of content could be handled differently.
Data preparation
- Initial experiments with Google Local Reviews showed a strong skew toward “relevant” content.
- We used LLMs to synthesize negative examples (ads, spam, rants, irrelevant) for training balance.
- With TikTok’s own dataset, the pipeline learns directly from noisy, real-world cases instead of relying on synthetic examples alone, which makes classification more efficient and consistent.
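To make the balancing step concrete, the sketch below counts how many synthetic examples each category would need to match the largest class. This is an assumption about how the balance target could be computed, not the team's actual procedure; `synthetic_needed` is a hypothetical helper, and the LLM generation step itself is out of scope here.

```python
from collections import Counter

CATEGORIES = ("relevant", "irrelevant", "advertisement", "spam", "rant")


def synthetic_needed(labels, categories=CATEGORIES):
    """Return how many extra examples each category needs to reach
    the size of the most common category in `labels`."""
    counts = Counter(labels)
    target = max(counts.values(), default=0)
    return {category: target - counts.get(category, 0) for category in categories}
```

With a label list dominated by "relevant", the result shows a deficit of zero for that class and a positive deficit for each under-represented class, which is the number of examples one would ask an LLM to synthesize.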
Models
- Rule-based layer: Regex, keyword lists, and repetition and URL heuristics, well suited to quickly catching obvious ads and spam.
- Deep Neural Network: Lighter, faster to train, and easier to scale than heavier models like DistilBERT. Captures nuance and metadata signals (e.g., whether a review reflects a genuine visit).
- Ensemble fusion: Weighted voting across both approaches, combining interpretability with depth for robust, high-confidence predictions.
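The rule layer and the weighted-voting fusion described in the list above can be sketched as follows. Everything specific here is an assumption for illustration: the keyword list, the repetition threshold, the rule scores (0.9 and 0.8), and the 0.4/0.6 weights are hypothetical values, and `model_probs` stands in for the class probabilities a trained neural network would output.

```python
import re
from collections import Counter

URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
# Hypothetical keyword list; a real one would be curated from data.
AD_KEYWORDS = ("promo code", "use code", "discount", "visit our", "dm us")


def rule_scores(text):
    """Return per-category scores from cheap heuristics.

    An empty dict means no rule fired and the rules abstain.
    """
    lowered = text.lower()
    tokens = re.findall(r"\w+", lowered)
    scores = {}
    if any(keyword in lowered for keyword in AD_KEYWORDS):
        scores["advertisement"] = 0.9
    repetitive = False
    if len(tokens) >= 4:
        # Flag text where a single token makes up half or more of the review.
        top_count = Counter(tokens).most_common(1)[0][1]
        repetitive = top_count / len(tokens) >= 0.5
    if repetitive or URL_RE.search(text):
        scores["spam"] = 0.8
    return scores


def fuse(rules, model_probs, rule_weight=0.4, model_weight=0.6):
    """Weighted vote: combine rule scores and model probabilities per category."""
    combined = {}
    for category in sorted(set(rules) | set(model_probs)):
        combined[category] = (
            rule_weight * rules.get(category, 0.0)
            + model_weight * model_probs.get(category, 0.0)
        )
    label = max(combined, key=combined.get)
    return label, combined
```

When no rule fires, `fuse` simply follows the model's highest-probability class; when a rule does fire, it shifts the vote toward that category without fully overriding the model.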
Evaluation
- The ensemble reached 98.03% accuracy on a balanced dataset.
- It also performed well on tricky cases like subtle self-promotion or long-form rants.
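For reference, accuracy on a labelled test set, together with per-class recall to expose weak categories, can be computed as in this minimal sketch. The function name `evaluate` is hypothetical, and the code does not reproduce the reported figure; it only shows how such a number is derived from true and predicted labels.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return (accuracy, per-class recall) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(truth == pred for truth, pred in pairs) / len(pairs)
    support = Counter(y_true)  # number of true examples per class
    hits = Counter(truth for truth, pred in pairs if truth == pred)
    recall = {category: hits[category] / count for category, count in support.items()}
    return accuracy, recall
```

On a balanced dataset overall accuracy is a fair summary, but per-class recall still helps show whether harder categories such as rants lag behind the rest.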

Challenges we ran into

Data imbalance: The dominance of “relevant” reviews in Google data forced us to generate synthetic examples of the other categories to bootstrap training.
Model trade-offs: Transformer models like DistilBERT were powerful but too heavy for hackathon-scale prototyping and real-time inference. We learned to balance efficiency and accuracy by relying on DNNs and ensembles.
Scalability: Building not just an accurate model but one that could scale to TikTok’s global user base required a modular, lightweight, and adaptable design.
Edge cases: Distinguishing genuine negative feedback from “rants” that are not grounded in an on-site experience required careful rule design and use of metadata.

Accomplishments that we're proud of

Delivering an end-to-end pipeline that runs from raw reviews to categorized outputs.
Achieving ~98% accuracy on a balanced dataset, validated against synthetic + annotated samples.
Designing a system that is modular and scalable, so TikCheck can handle millions of reviews across TikTok.
Building a lightweight demo UI where users can instantly see relevant reviews in green and noisy ones in gray.
Striking the right trade-off between speed, interpretability, and performance.

What we learned

Synthetic data generation can effectively balance skewed datasets, but only real-world noisy data (like TikTok’s) keeps a model effective over time.
Lighter neural models (DNNs) can outperform heavier transformers in terms of speed and scalability in high-volume, production-style pipelines.
Rule-based filters can be essential for quick wins and for reducing load on deeper models.
Building for a global platform means designing with scalability in mind from day one — modularity, ensemble extensibility, and efficient pipelines are just as important as accuracy.
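One way rule-based filters can reduce load on a deeper model is a cascade: run the cheap rules first and call the model only when they abstain. The sketch below illustrates that idea under stated assumptions. It is not the weighted-voting ensemble described earlier, and `url_rule` and `stub_model` are hypothetical stand-ins for a real rule set and a trained network.

```python
def classify_with_cascade(reviews, rule_fn, model_fn):
    """Label each review, calling the model only when the rule abstains.

    Returns the labels and the number of model calls that were needed.
    """
    labels = []
    model_calls = 0
    for text in reviews:
        label = rule_fn(text)
        if label is None:  # rule abstained, fall back to the model
            label = model_fn(text)
            model_calls += 1
        labels.append(label)
    return labels, model_calls


def url_rule(text):
    """Toy rule: flag any review containing a link as spam, else abstain."""
    return "spam" if "http://" in text or "https://" in text else None


def stub_model(text):
    """Placeholder for a trained classifier; always predicts 'relevant'."""
    return "relevant"
```

Tracking `model_calls` makes the saving measurable: the fewer reviews that reach the model, the more the rule layer is offloading.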

What's next for TikCheck

Google Maps API integration → for real-time ingestion of fresh reviews at scale.
Lynx integration → embedding TikCheck directly into TikTok’s open-source UI framework to make moderation seamless for end users.
Local end-to-end deployment → running ingestion, training, and classification on TikTok’s infrastructure for efficiency, security, and scalability.
Beyond locations → expanding TikCheck to moderate other forms of user-generated content, making TikTok’s ecosystem cleaner, fairer, and more engaging.