Irreduce

Irreduce is a token-aware prompt compressor for long-context inputs. It aggressively compresses long text inputs while preserving downstream performance and information density.

Audience

Irreduce has a massive potential audience: it can help tens or hundreds of millions of users across all industries that rely on search, support, agents, RAG, and other intelligent systems built on LLMs.

Model

The Irreduce model compresses tokens by:

  • Splitting the document into spans and computing BM25 (classic IR scoring) to measure how well each span matches the context.
  • Converting text tokens into image tokens that encode semantic meaning more compactly.
  • Computing a TF‑IDF similarity between each span and the full document (coverage/centrality).
  • Rewarding rare terms (IDF density) and lexical diversity (information density).
  • Selecting spans under a token budget and grouping them into windows to ensure coverage across the document.
  • Applying a redundancy penalty (Jaccard overlap) to avoid near‑duplicates.
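The span-scoring and selection steps above can be sketched as follows. This is a minimal illustration, not the actual Irreduce implementation: the sentence splitter, the score weights, and the use of BM25 over the full document as a centrality proxy (standing in for the TF‑IDF coverage score) are all assumptions made for the sketch, and the image-token step is omitted.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def idf(term, term_sets):
    # BM25-style inverse document frequency over the spans.
    df = sum(1 for s in term_sets if term in s)
    return math.log((len(term_sets) - df + 0.5) / (df + 0.5) + 1.0)

def bm25(query_terms, span_terms, term_sets, avg_len, k1=1.5, b=0.75):
    tf = Counter(span_terms)
    score = 0.0
    for t in set(query_terms):
        f = tf[t]
        if f == 0:
            continue
        denom = f + k1 * (1 - b + b * len(span_terms) / avg_len)
        score += idf(t, term_sets) * f * (k1 + 1) / denom
    return score

def jaccard(a, b):
    # Lexical overlap used as the redundancy penalty.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def compress(document, context, budget, redundancy=0.6):
    # Split into sentence-level spans (simplistic splitter, an assumption).
    raw = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    spans = [tokenize(s) for s in raw]
    term_sets = [set(s) for s in spans]
    avg_len = sum(len(s) for s in spans) / len(spans)
    ctx_terms = tokenize(context)
    doc_terms = tokenize(document)

    scored = []
    for i, s in enumerate(spans):
        relevance = bm25(ctx_terms, s, term_sets, avg_len)   # match to context
        coverage = bm25(doc_terms, s, term_sets, avg_len)    # centrality proxy
        density = len(set(s)) / len(s)                        # lexical diversity
        scored.append((relevance + 0.5 * coverage + density, i))

    # Greedy selection under the token budget, skipping near-duplicates.
    selected, used = [], 0
    for _, i in sorted(scored, reverse=True):
        if used + len(spans[i]) > budget:
            continue
        if any(jaccard(spans[i], spans[j]) > redundancy for j in selected):
            continue
        selected.append(i)
        used += len(spans[i])
    # Emit surviving spans in document order to preserve readability.
    return " ".join(raw[i] for i in sorted(selected))
```

Selecting greedily by score while charging each span against a shared budget keeps the output compact; emitting spans in document order afterwards preserves the original narrative flow.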

Challenges

The primary challenges we faced included balancing compression against performance and generalizing to different domains, including math and web content.
