omshukla24/Phantom-Mod

🔎 Phantom

Catches reworded reposts and karma-farm reuploads on Reddit, even when titles are rewritten, using Locality-Sensitive Hashing that runs entirely inside Devvit's native Redis. No external service, no vector DB.


📌 Overview

Phantom is a native Reddit moderation tool built on the Devvit Developer Platform. Reposts and duplicate uploads are a major moderation time-sink on Reddit, and karma-farming accounts frequently rephrase or rewrite post titles to evade exact-match text detectors.

Phantom addresses this with SimHash fingerprints and Locality-Sensitive Hashing (LSH). Instead of relying on fragile exact-string matching or a costly external vector database, it runs entirely inside Devvit's serverless environment and looks up candidate duplicates in milliseconds through Devvit's native Redis sorted sets.
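To make the fingerprinting step concrete, here is a minimal sketch of a weighted character-shingle SimHash. It is an illustration under stated assumptions, not the code in `src/lib/`: the function names are hypothetical, the exact normalization rules are guessed from the diagram, and to stay in plain 32-bit arithmetic it derives the 64 bits from two seeded 32-bit FNV-1a passes, whereas the repository may use a true 64-bit FNV-1a.

```typescript
// Hypothetical helpers for illustration only.

/** Lowercase, drop URLs and punctuation, collapse whitespace. */
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/https?:\/\/\S+/g, " ")
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

/** Count character n-grams; the count of each gram is used as its weight. */
function shingles(text: string, sizes: number[] = [3, 4]): Map<string, number> {
  const counts = new Map<string, number>();
  for (let s = 0; s < sizes.length; s++) {
    const n = sizes[s];
    for (let i = 0; i + n <= text.length; i++) {
      const gram = text.slice(i, i + n);
      counts.set(gram, (counts.get(gram) || 0) + 1);
    }
  }
  return counts;
}

/** 32-bit FNV-1a over the low byte of each char, with a caller-chosen basis. */
function fnv1a32(s: string, seed: number): number {
  let h = seed >>> 0;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i) & 0xff;
    h = Math.imul(h, 0x01000193); // FNV prime 16777619
  }
  return h >>> 0;
}

/** 64-bit SimHash returned as 8 bytes (big-endian bit order). */
function simhash64(text: string): Uint8Array {
  const votes: number[] = [];
  for (let i = 0; i < 64; i++) votes.push(0);

  shingles(normalize(text)).forEach((weight, gram) => {
    const hi = fnv1a32(gram, 0x811c9dc5); // standard FNV offset basis
    const lo = fnv1a32(gram, 0x01234567); // arbitrary second seed (assumption)
    for (let bit = 0; bit < 64; bit++) {
      const word = bit < 32 ? hi : lo;
      const isSet = (word >>> (31 - (bit % 32))) & 1;
      votes[bit] += isSet ? weight : -weight; // each shingle votes per bit
    }
  });

  const out = new Uint8Array(8);
  for (let bit = 0; bit < 64; bit++) {
    if (votes[bit] > 0) out[bit >> 3] |= 0x80 >> (bit & 7);
  }
  return out;
}
```

Because each bit is decided by a weighted majority vote over all shingles, changing a few characters flips only a few votes, so near-identical titles tend to produce fingerprints that differ in only a few bits.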


🏗️ Architecture & Workflow Diagram

Below is the workflow and execution lifecycle when a post is submitted to the subreddit:

sequenceDiagram
    autonumber
    actor User as Submitter
    participant Reddit as Reddit Core API
    participant Phantom as Phantom Trigger
    participant Redis as Devvit Redis (Native)
    participant Mod as Mod Queue / Modmail

    User->>Reddit: Submits new Post
    Reddit->>Phantom: Fires PostSubmit Event
    Phantom->>Redis: Check alreadySeen (NX Set)
    Redis-->>Phantom: Returns false
    
    Phantom->>Phantom: Normalize text (remove URLs, strip punctuation)
    Phantom->>Phantom: Generate weighted character 3-/4-gram shingles
    Phantom->>Phantom: Compute 64-bit SimHash
    Phantom->>Redis: Increment Telemetry Scans
    
    Phantom->>Phantom: Extract 8 Band Keys (8 bits each)
    Phantom->>Redis: Query candidates matching bands in lookback window
    Redis-->>Phantom: Returns candidate list
    
    loop For each Candidate
        Phantom->>Redis: Get Candidate Metadata & Hash
        Redis-->>Phantom: Returns Candidate Record
        Phantom->>Phantom: Calculate Hamming Distance (bit popcount)
    end
    
    alt Hamming Distance <= Auto-Remove Threshold (default <= 5)
        Phantom->>Reddit: Auto-remove post
        Phantom->>Reddit: Post stickied bot comment (Removed)
        Phantom->>Redis: Increment Auto-Removal Telemetry
        Phantom->>Redis: Increment Author Dupe Score
    else Hamming Distance <= Report Threshold (default <= 15)
        Phantom->>Reddit: Report post to mod queue
        Phantom->>Reddit: Post stickied bot comment (Reported)
        Phantom->>Redis: Increment Mod-Report Telemetry
        Phantom->>Redis: Increment Author Dupe Score
    end
    
    opt Author Dupe Score >= Escalation Count (default 3)
        Phantom->>Reddit: Send Modmail Alert (Repeat Offender)
        Phantom->>Redis: Increment Modmail Telemetry
    end

    Phantom->>Redis: Index post (Hash & Band Sorted Sets)
    Phantom->>Reddit: Done
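The candidate loop in the diagram comes down to a bit-level comparison. The sketch below shows one way to compute the Hamming distance between two 64-bit fingerprints and pick the closest candidate. It assumes hashes are stored as 8-byte arrays; the names `hammingDistance`, `similarity`, and `closest` are hypothetical and need not match `src/lib/hamming.ts`.

```typescript
/** Number of set bits in one byte (Kernighan's loop). */
function popcount8(b: number): number {
  let n = 0;
  while (b) {
    b &= b - 1; // clears the lowest set bit
    n++;
  }
  return n;
}

/** Count of differing bits between two equal-length hashes. */
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length) throw new Error("hash length mismatch");
  let d = 0;
  for (let i = 0; i < a.length; i++) d += popcount8(a[i] ^ b[i]);
  return d;
}

/** Fraction of matching bits, in the range 0..1. */
function similarity(a: Uint8Array, b: Uint8Array): number {
  return 1 - hammingDistance(a, b) / (a.length * 8);
}

interface Candidate {
  postId: string;
  hash: Uint8Array;
}

/** Return the candidate with the smallest distance, or null if none exist. */
function closest(
  target: Uint8Array,
  candidates: Candidate[]
): { postId: string; distance: number } | null {
  let best: { postId: string; distance: number } | null = null;
  for (let i = 0; i < candidates.length; i++) {
    const distance = hammingDistance(target, candidates[i].hash);
    if (best === null || distance < best.distance) {
      best = { postId: candidates[i].postId, distance };
    }
  }
  return best;
}
```

The resulting distance is what the thresholds in the `alt` branches are compared against.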

⭐ Features

| Feature | Description |
| --- | --- |
| Real-time detection | Runs on every new submission through serverless Devvit triggers. |
| SimHash Matching | Character-level 3-gram and 4-gram shingles keep matching robust to typos and rewordings. |
| Locality-Sensitive Hashing | Uses the LSH banding trick (8 bands of 8 bits) to query candidates in near-$O(1)$ time, avoiding slow full-database scans. |
| Tiered Auto-Actions | Auto-removes high-confidence duplicates and reports medium-confidence matches to the mod queue. |
| Sticky Moderation Comments | Posts a stickied, distinguished bot comment on duplicates that shows the similarity score and links directly to the original post. |
| Modmail Escalation | Tracks repeat duplicate offenders and sends an automated modmail alert once an author reaches the configured count. |
| Interactive Dashboard | Renders a live custom post showing Scans, Caught Duplicates, dynamic SVG histograms of Hamming distance distributions, and top repeat offenders. |
| Administration Menus | Mod-only options to manually find duplicates for any post, check subreddit stats, or deploy the interactive dashboard. |
| Hourly Retention Sweeper | A cron job sweeps expired posts from Redis hashes and LSH indices, keeping memory usage bounded. |
| Installation Backfill | Indexes the last 100 posts on initial app install, so the app goes live with recent history already in place. |
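The banding and retention rows above can be sketched with a small in-memory model. This is not the Devvit Redis API: `BandIndex` is a hypothetical stand-in for the per-band sorted sets, scored by submission timestamp, and the `band:<index>:<hex>` key format is an assumption made for illustration.

```typescript
/** One lookup key per band: 8 bands of 8 bits each for a 64-bit hash. */
function bandKeys(hash: Uint8Array): string[] {
  const keys: string[] = [];
  for (let band = 0; band < hash.length; band++) {
    const hex = ("0" + hash[band].toString(16)).slice(-2);
    keys.push("band:" + band + ":" + hex);
  }
  return keys;
}

/** In-memory stand-in for per-band sorted sets scored by timestamp. */
class BandIndex {
  private buckets = new Map<string, { postId: string; ts: number }[]>();

  add(postId: string, hash: Uint8Array, ts: number): void {
    const keys = bandKeys(hash);
    for (let k = 0; k < keys.length; k++) {
      const bucket = this.buckets.get(keys[k]) || [];
      bucket.push({ postId, ts });
      this.buckets.set(keys[k], bucket);
    }
  }

  /** Posts sharing at least one band with `hash`, submitted at or after sinceTs. */
  candidates(hash: Uint8Array, sinceTs: number): string[] {
    const seen = new Set<string>();
    const keys = bandKeys(hash);
    for (let k = 0; k < keys.length; k++) {
      const bucket = this.buckets.get(keys[k]) || [];
      for (let i = 0; i < bucket.length; i++) {
        if (bucket[i].ts >= sinceTs) seen.add(bucket[i].postId);
      }
    }
    const out: string[] = [];
    seen.forEach((id) => out.push(id));
    return out;
  }

  /** Retention sweep: drop every entry older than cutoffTs. */
  sweep(cutoffTs: number): void {
    const keys: string[] = [];
    this.buckets.forEach((_bucket, key) => keys.push(key));
    for (let k = 0; k < keys.length; k++) {
      const kept = (this.buckets.get(keys[k]) || []).filter((e) => e.ts >= cutoffTs);
      if (kept.length > 0) this.buckets.set(keys[k], kept);
      else this.buckets.delete(keys[k]);
    }
  }
}
```

One property of this layout is worth knowing when tuning thresholds. By the pigeonhole principle, two hashes that differ in at most 7 bits must share at least one identical band, so every match within the default auto-remove threshold of 5 surfaces as a candidate. At distances of 8 to 15, a shared band is not guaranteed and depends on how the differing bits fall across the bands.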

⚙️ App Settings & Configuration

Moderators can easily configure the sensitivity of Phantom from the Reddit App Settings panel:

| Setting Key | Type | Default | Description |
| --- | --- | --- | --- |
| textThreshold | Number | 15 | Report threshold (Hamming bit distance; lower is stricter). Matches at or below this distance are reported. |
| autoRemoveThreshold | Number | 5 | Auto-remove threshold. Matches at or below this distance are auto-removed. Set to 0 to disable auto-removals. |
| modmailEscalationCount | Number | 3 | Offender limit. Number of caught duplicates by the same author before a modmail alert is sent. |
| lookbackDays | Number | 30 | Lookback window (in days) to compare submissions against. |
| ignoreCrossposts | Boolean | true | Skip analyzing cross-posts from other subreddits. |
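As a rough illustration of how these settings interact, the sketch below maps a Hamming distance to an action using the defaults from the table. The type and function names are hypothetical. Treating `autoRemoveThreshold: 0` as "never auto-remove" is one reading of the description above, not confirmed behavior of the app.

```typescript
interface PhantomSettings {
  textThreshold: number;
  autoRemoveThreshold: number;
  modmailEscalationCount: number;
  lookbackDays: number;
  ignoreCrossposts: boolean;
}

// Defaults copied from the settings table.
const DEFAULT_SETTINGS: PhantomSettings = {
  textThreshold: 15,
  autoRemoveThreshold: 5,
  modmailEscalationCount: 3,
  lookbackDays: 30,
  ignoreCrossposts: true,
};

type Action = "remove" | "report" | "none";

/** Pick the enforcement tier for the closest match found. */
function decideAction(distance: number, s: PhantomSettings): Action {
  // Assumption: a threshold of 0 disables the auto-remove tier entirely.
  if (s.autoRemoveThreshold > 0 && distance <= s.autoRemoveThreshold) {
    return "remove";
  }
  if (distance <= s.textThreshold) return "report";
  return "none";
}

/** True once an author's caught-duplicate count reaches the escalation limit. */
function shouldEscalate(dupeScore: number, s: PhantomSettings): boolean {
  return dupeScore >= s.modmailEscalationCount;
}
```

With the defaults, a distance of 5 is removed, 6 through 15 is reported, and 16 or more is left alone. Raising `textThreshold` catches looser rewordings at the cost of more false reports.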

🚀 Getting Started

Prerequisites

  • Node.js (v18+)
  • npm
  • Devvit CLI installed and configured (npm install -g @devvit/cli)

Installation

  1. Clone the repository:
    git clone https://github.com/omshukla24/Phantom-Mod.git
    cd Phantom-Mod
  2. Install dependencies:
    npm install
  3. Log in to your Devvit account:
    devvit login
  4. Build the application:
    npm run build
  5. Playtest/deploy the app to your moderated subreddit:
    devvit playtest <subreddit-name>

📊 Project Structure

├── src/
│   ├── handlers/
│   │   ├── dashboard.tsx       # Live interactive custom post & statistics UI
│   │   ├── findDuplicates.ts   # Context menu action for manual duplicate checking
│   │   ├── onPostSubmit.ts     # Main submit trigger pipeline & enforcement actions
│   │   ├── retention.ts        # Hourly cron job retention sweeper & install backfill
│   │   └── statsMenu.ts        # Context menu action displaying raw telemetry
│   ├── lib/
│   │   ├── banding.ts          # LSH 8-bit split banding logic
│   │   ├── hamming.ts          # Hamming distance popcount & similarity calculations
│   │   ├── normalize.ts        # Regex text normalization & shingle generator
│   │   ├── simhash.ts          # FNV-1a weight-accumulating SimHash
│   │   └── store.ts            # Redis database key schemas, getters, and setters
│   ├── main.ts                 # Devvit configuration and handlers import
│   └── settings.ts             # App configuration schema definitions
├── tests/                      # Algorithmic test suite
└── package.json

🛡️ License

This project is licensed under the MIT License - see the LICENSE file for details.
