Stories by Harshil Jani on Medium

Haversine vs OSRM: How Far Apart Are Two Places? (A Bug I Shipped)

Harshil Jani — Wed, 03 Jun 2026 01:01:12 GMT

“Her pickup point is 6.2 km away from her address, and she wants to cancel.”

Harshil Jani — Thu, 21 May 2026 03:39:21 GMT

Previously: Shipping LLMs (Part 5/6): Where Your LLM Tokens Actually Go. I named the five silent token leaks. This piece is about the…

Harshil Jani — Wed, 20 May 2026 22:28:39 GMT

Previously: Shipping LLMs (Part 4/6): How to Evaluate a RAG Pipeline. I argued for the 90/10 RAGAS-plus-human eval rhythm. This piece is…

Harshil Jani — Sun, 17 May 2026 22:52:46 GMT

Previously: Shipping LLMs (Part 3/6): Speculative Decoding vs Quantization. I argued you should run both. This piece is about whether the…

Harshil Jani — Sun, 17 May 2026 22:35:40 GMT

Quantization fixes memory bandwidth. Speculative decoding fixes autoregression. Stack them for 3–4x cheaper LLM inference, in this order.

Harshil Jani — Sat, 16 May 2026 08:14:10 GMT

Previously: Shipping LLMs (Part 1/6): Prompt Caching vs Semantic Caching. I argued you should always prompt-cache your stable prefixes…

Harshil Jani — Sat, 16 May 2026 07:55:29 GMT

A user types a question into your AI-powered support chatbot:

Harshil Jani — Fri, 10 Apr 2026 21:06:40 GMT

I’ve been following Thariq Shihipar (@trq212) for a while now. He’s on the Claude Code team at Anthropic. He’s the guy who built the…

Harshil Jani — Fri, 13 Mar 2026 22:02:18 GMT

Every LLM I use (Claude, ChatGPT, Grok, etc.) has the same infuriating problem when it comes to text in images. I ask for a clean…

Harshil Jani — Fri, 06 Mar 2026 21:00:33 GMT

I’ve killed more side projects than I’ve shipped. Beautiful READMEs, clean architectures, clever abstractions, and zero users.