Every re-ranking call costs you 150ms and $0.002. Your search handles 1M queries/day. Do the math. 💸

The Re-Ranking Tax:
▪️ Latency: 150ms per query (cross-encoder inference)
▪️ Cost: $0.002 per query (GPU compute for scoring 100 doc pairs)
▪️ Scale: 1M queries/day

Daily cost: $2,000
Monthly cost: $60,000
Annual cost: $730,000 📈

And that's just compute. Add:
▪️ Infrastructure maintenance (re-ranker deployment)
▪️ Engineering time (blending metadata post-search)
▪️ User drop-off (every 100ms latency = 1% conversion loss)

Why You're Paying This Tax:
Initial retrieval is weak. Text embeddings only, price/rating signals missing. Re-ranking tries to fix it post-hoc. But if relevant docs aren't in top 100, re-ranker never sees them.

✅ The Alternative:
Encode signals at index time:
▪️ Text similarity + price optimization + rating maximization
▪️ Hard filters before search (eliminate irrelevant items)
▪️ Dynamic query-time weights (no re-embedding needed)

Results:
▪️ Cost: $0 re-ranking (eliminated)
▪️ Latency: 50ms (4x faster)
▪️ Accuracy: Higher (relevant results in initial retrieval)

TCO: $730K/year → $0

Re-ranking isn't a feature. It's a tax on bad retrieval architecture.

Full cost breakdown + architecture comparison 👇
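
What "encode signals at index time" with "dynamic query-time weights" can look like, as a minimal sketch and not Superlinked's actual API. Everything here is a hypothetical illustration: the function names, the toy 2-d embeddings, the price and rating scaling, and the `in_stock` filter are all assumptions. The idea: each item is stored as one vector (unit text embedding + price score + rating score), the weights live only in the query vector, and a single dot product yields the blended score.

```python
import math

def normalize(v):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def index_vector(text_emb, price, rating, max_price, max_rating=5.0):
    """Index time: one vector per item = unit text embedding + price score + rating score."""
    price_score = 1.0 - min(price, max_price) / max_price  # cheaper scores higher (assumed scaling)
    rating_score = rating / max_rating                      # better rated scores higher
    return normalize(text_emb) + [price_score, rating_score]

def query_vector(query_emb, w_text, w_price, w_rating):
    """Query time: the weights go into the query vector; stored item vectors never change."""
    return [w_text * x for x in normalize(query_emb)] + [w_price, w_rating]

def search(index, qvec, k=10, keep=lambda meta: True):
    """Apply the hard filter first, then rank the survivors by one dot product each."""
    scored = [
        (item_id, sum(q * d for q, d in zip(qvec, vec)))
        for item_id, (vec, meta) in index.items()
        if keep(meta)
    ]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy catalog: (text embedding, price, rating, metadata). All values are made up.
catalog = {
    "a": ([1.0, 0.0], 80.0, 3.0, {"in_stock": True}),
    "b": ([0.8, 0.6], 20.0, 4.5, {"in_stock": True}),
    "c": ([0.0, 1.0], 10.0, 5.0, {"in_stock": False}),
}
index = {
    item_id: (index_vector(emb, price, rating, max_price=100.0), meta)
    for item_id, (emb, price, rating, meta) in catalog.items()
}

def in_stock(meta):
    return meta["in_stock"]

query_emb = [1.0, 0.0]
text_only = search(index, query_vector(query_emb, 1.0, 0.0, 0.0), keep=in_stock)
blended = search(index, query_vector(query_emb, 1.0, 0.5, 0.5), keep=in_stock)
```

Same index, two different weightings, no re-embedding: with text-only weights item "a" ranks first, and once price and rating get weight the cheaper, better-rated item "b" overtakes it. In a real deployment the linear scan would be replaced by a vector index.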