What is this?
haveibeenfiltered is an npm package for checking passwords against breach datasets locally. It uses ribbon filters to compress roughly 2 billion SHA-1 hashes from Have I Been Pwned into a 1.79 GiB file with a ~0.78% false positive rate and zero false negatives.
Install & Usage
Install
npm install haveibeenfiltered
Download the filter data
npx haveibeenfiltered download
Use in your app
const hbf = require('haveibeenfiltered')

// await is not allowed at the top level of a CommonJS module, so wrap the calls
async function main() {
  const filter = await hbf.load()
  filter.check('password123') // true: breached
  filter.check('Tr0ub4dor&3') // false: not found
  filter.checkHash('5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8') // true
  filter.close() // release memory when done
}

main()
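If you already handle passwords as hashes, checkHash takes the SHA-1 as a hex string. The helper below is a minimal sketch of producing that string with Node's built-in crypto module. The name sha1HexUpper is hypothetical, and both the UTF-8 encoding and the uppercase output are assumptions based on the example hash shown above, not documented requirements.

```javascript
const crypto = require('crypto')

// Hypothetical helper: SHA-1 of a password as an uppercase hex string.
// ASSUMPTION: the password is hashed as UTF-8 and the filter expects uppercase hex.
function sha1HexUpper(password) {
  return crypto
    .createHash('sha1')
    .update(password, 'utf8')
    .digest('hex')
    .toUpperCase()
}
```

The result can then be passed to filter.checkHash(...) in place of the literal hash string.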
How Ribbon Filters Work
A ribbon filter is a probabilistic data structure for set membership testing, similar to a Bloom filter but roughly 20% more space-efficient. Given a set of keys, it stores compact fingerprints in a system of linear equations that can be solved via Gaussian elimination. Querying evaluates the system for a given key and compares the result to the expected fingerprint.
Query flow
The filter file is divided into 256 shards based on the first byte of the SHA-1 hash. Within each shard, the MurmurHash3 (x64, 128-bit) of the hex hash is computed using a seed from the file header. This produces three values:
- start — index into the solution array
- coefficient — a 64-bit bitmask selecting which solution rows to XOR (ribbon width = 64)
- fingerprint — a 7-bit value to compare against
The query XORs the solution entries at positions where the coefficient bits are set, starting from the start index. If the result matches the fingerprint, the key is considered present. A small number of keys that couldn't be inserted during construction ("bumped" keys) are stored in a per-shard overflow table and checked separately.
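The flow above can be illustrated with a toy ribbon filter. This is a sketch under stated assumptions, not the package's implementation: SHA-256 stands in for MurmurHash3, the way the digest is split into start, coefficient, and fingerprint is invented for the example, the lowest coefficient bit is forced on, and a failed build returns null (so the caller can retry with another seed) instead of bumping keys into an overflow table. All function names are hypothetical.

```javascript
const crypto = require('crypto')

const WIDTH = 64 // ribbon width

// The first byte of the SHA-1 hex string selects one of 256 shards.
function shardOf(sha1Hex) {
  return parseInt(sha1Hex.slice(0, 2), 16)
}

// ASSUMPTION: SHA-256 replaces MurmurHash3 (x64, 128-bit), and this split of
// the digest into three values is made up for the sketch.
function derive(key, seed, slots) {
  const d = crypto.createHash('sha256').update(`${seed}:${key}`).digest()
  return {
    start: d.readUInt32BE(0) % (slots - WIDTH + 1),
    coeff: d.readBigUInt64BE(8) | 1n, // lowest bit forced on: the row begins at start
    fingerprint: d[16] & 0x7f, // 7-bit fingerprint
  }
}

// Build the solution array with incremental Gaussian elimination over GF(2).
// Requires slots >= 64. Returns null if the equations contradict each other.
function build(keys, seed, slots) {
  const rowCoeff = new Array(slots).fill(0n)
  const rowFp = new Uint8Array(slots)
  for (const key of keys) {
    let { start: i, coeff: c, fingerprint: f } = derive(key, seed, slots)
    while (c !== 0n) {
      if (rowCoeff[i] === 0n) { // pivot slot is free: store the reduced equation
        rowCoeff[i] = c
        rowFp[i] = f
        break
      }
      c ^= rowCoeff[i] // cancel the leading coefficient
      f ^= rowFp[i]
      while (c !== 0n && (c & 1n) === 0n) { // realign to the next set bit
        c >>= 1n
        i++
      }
    }
    if (c === 0n && f !== 0) return null // contradictory equation for this seed
  }
  // Back-substitution from the last slot to the first; free slots stay 0.
  const solution = new Uint8Array(slots)
  for (let i = slots - 1; i >= 0; i--) {
    if (rowCoeff[i] === 0n) continue
    let value = rowFp[i]
    for (let j = i + 1, c = rowCoeff[i] >> 1n; c !== 0n; j++, c >>= 1n) {
      if (c & 1n) value ^= solution[j]
    }
    solution[i] = value
  }
  return solution
}

// XOR the solution entries selected by the coefficient bits and compare the
// result to the key's fingerprint.
function query(solution, key, seed) {
  const { start, coeff, fingerprint } = derive(key, seed, solution.length)
  let acc = 0
  for (let j = 0, c = coeff; c !== 0n; j++, c >>= 1n) {
    if (c & 1n) acc ^= solution[start + j]
  }
  return acc === fingerprint
}
```

Every key passed to build satisfies its equation in the solved system, so query returns true for all of them. A key that was never inserted matches only when its 7-bit fingerprint happens to agree, which is where the 1/128 rate comes from.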
Benchmarks
| Metric | Value |
|---|---|
| HIBP passwords | 2,048,908,128 |
| Filter size | 1.79 GiB |
| Fingerprint bits | 7 |
| False positive rate | 1/128 (~0.78%) |
| False negatives | 0 |
| check(password) | ~14 µs |
| checkHash(sha1hex) | ~8 µs |
| Throughput (single core) | ~72k–121k/sec |
| npm dependencies | 0 |
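To reproduce per-call timings like these on your own hardware, a small harness along the following lines may help. It is a sketch only: the measure function and its parameters are hypothetical, it is not the benchmark used to produce the table, and it reports the best of several rounds to reduce noise from JIT warm-up.

```javascript
// Hypothetical micro-benchmark helper: times fn over all inputs, several
// rounds, and reports the fastest round.
function measure(fn, inputs, rounds = 5) {
  let bestNsPerCall = Infinity
  for (let r = 0; r < rounds; r++) {
    const t0 = process.hrtime.bigint()
    for (const input of inputs) fn(input)
    const elapsedNs = Number(process.hrtime.bigint() - t0)
    bestNsPerCall = Math.min(bestNsPerCall, elapsedNs / inputs.length)
  }
  return {
    microsPerCall: bestNsPerCall / 1000,
    callsPerSec: 1e9 / bestNsPerCall,
  }
}
```

With a loaded filter, something like measure(p => filter.check(p), passwords) would give figures comparable to the check(password) row, where passwords is an array of test strings you supply.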
Comparison with HIBP API
| | haveibeenfiltered | HIBP API |
|---|---|---|
| Privacy | Full — nothing leaves machine | k-anonymity (5-char SHA-1 prefix sent) |
| Speed | ~14 µs/check | ~200 ms/request |
| Offline | Yes | No |
| Setup | 1.79 GiB download | None |
| RAM | ~1.8 GB | 0 |
| False positives | ~0.78% | 0% (exact) |
| Rate limits | None | Yes |
| Data freshness | Static snapshot | Continuously updated |
Downloads
Filter binaries are hosted on Cloudflare R2 at https://files.haveibeenfiltered.com/v0.1/.
| File | Size | SHA-256 |
|---|---|---|
| ribbon-hibp-v1.bin | 1.79 GiB | 4eeb8608fa8541a51a952ecda91ad2f86e6f7457b0dbe34b88ba8a7ed33750ce |
| ribbon-hibp-v1-min5.bin | 726 MB | 4422f5659cb5fe39cf284b844328bfd3f7ab37fac0fe649b4cff216ffd2ac5da |
| ribbon-hibp-v1-min10.bin | 435 MB | 8c71d6a3696d27bcf21a30ddcd67f7e290a71210800db86810ffb84a426fe93e |
| ribbon-hibp-v1-min20.bin | 259 MB | 31a2c7942698fce74d95ce54dfb61f383ef1a33dce496b88c672e1ac07c71c43 |
| ribbon-rockyou-v1.bin | 12.8 MB | 777d3c1640e7067bc7fb222488199c3371de5360639561f1f082db6b7c16a447 |
| ribbon-top1m-v1.bin | 0.9 MB | 44f03ee81d777b42ba96deabde394f8aca8b8ef99134e15121c4e0c3fb3547c1 |
| ribbon-top10m-v1.bin | 9.0 MB | bdc40e88abf427354d408d67e79a31f7e2987dac0f1130c4d30f396062a9cd96 |
Integrity is verified via SHA-256 after download. The CLI handles downloading automatically:
npx haveibeenfiltered download
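If you fetch a filter file by some other route (a mirror, an internal artifact store), you can check it against the SHA-256 values in the table yourself. The sketch below uses only Node builtins and reads the file in 1 MiB chunks so that a multi-gigabyte file is never held in memory at once. The function names are hypothetical and this is not the CLI's own verification code.

```javascript
const crypto = require('crypto')
const fs = require('fs')

// Hypothetical helper: SHA-256 of a file as lowercase hex, read in chunks.
function sha256File(filePath) {
  const hash = crypto.createHash('sha256')
  const fd = fs.openSync(filePath, 'r')
  const chunk = Buffer.alloc(1 << 20) // 1 MiB read buffer
  try {
    let bytesRead
    // position = null means "continue from the current file position"
    while ((bytesRead = fs.readSync(fd, chunk, 0, chunk.length, null)) > 0) {
      hash.update(chunk.subarray(0, bytesRead))
    }
  } finally {
    fs.closeSync(fd)
  }
  return hash.digest('hex')
}

// Compare against an expected digest, ignoring hex letter case.
function verifyFile(filePath, expectedHex) {
  return sha256File(filePath) === expectedHex.toLowerCase()
}
```

For example, verifyFile(pathToFile, '44f03ee81d777b42ba96deabde394f8aca8b8ef99134e15121c4e0c3fb3547c1') should return true for an intact copy of ribbon-top1m-v1.bin, where pathToFile is wherever you saved it.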
False Positives & False Negatives
A ribbon filter is a probabilistic data structure. It has two possible error types:
- False positive (FP): A safe password is incorrectly reported as breached. This can happen because the filter stores compressed fingerprints, not exact hashes. The filter uses 7-bit fingerprints, giving a rate of 1/128 (~0.78%).
- False negative (FN): A breached password is missed and reported as safe. This cannot happen. If a password is in the dataset, the filter will always detect it.
In practice this means: if check() returns false, the password is definitely not in the dataset. If it returns true, the password is either in the dataset or one of the ~0.78% of non-breached passwords that the filter flags by mistake. For security applications this is the right tradeoff: you never miss a breached password.
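The arithmetic behind these figures is short enough to write out. The sketch below uses hypothetical function names. The first two functions follow directly from the fingerprint width. The third applies Bayes' rule to estimate how likely a flagged password really is in the dataset, and it depends on a prior (the fraction of submitted passwords that are breached) which you have to estimate yourself.

```javascript
// An r-bit fingerprint matches a random non-member with probability 2^-r.
function falsePositiveRate(fingerprintBits) {
  return 2 ** -fingerprintBits
}

// Expected number of safe passwords flagged by mistake.
function expectedFalseAlarms(safePasswordCount, fingerprintBits) {
  return safePasswordCount * falsePositiveRate(fingerprintBits)
}

// Bayes' rule with zero false negatives: P(in dataset | check() returned true).
// ASSUMPTION: priorBreached is your own estimate; it is not provided by the filter.
function probBreachedGivenHit(priorBreached, fingerprintBits) {
  const fp = falsePositiveRate(fingerprintBits)
  return priorBreached / (priorBreached + (1 - priorBreached) * fp)
}

falsePositiveRate(7) // 0.0078125, i.e. 1/128
expectedFalseAlarms(10000, 7) // 78.125
```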
FAQ
Can I use this in production?
Yes. The package has zero npm dependencies and uses only Node.js builtins (crypto, fs, path, https). The filter is loaded into memory once and all lookups are in-memory array operations. There is no network I/O during checks.
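One way to wire this into a signup or password-change flow is sketched below. The function validateNewPassword and its return shape are hypothetical. The only library call it relies on is filter.check(), as shown in the usage section, and it assumes the filter was loaded once at startup and is shared across requests.

```javascript
// Hypothetical validation step for a signup or password-change handler.
// `filter` is any object exposing check(password) -> boolean.
function validateNewPassword(filter, password) {
  if (filter.check(password)) {
    // May rarely be a false positive, so ask for a different password
    // rather than telling the user the password was definitely leaked.
    return { ok: false, reason: 'found_in_breach_filter' }
  }
  return { ok: true }
}
```

Because checks are synchronous and in-memory, this can run inline in a request handler without a network round trip.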
How often is the data updated?
The filter files are static snapshots. The current HIBP filter was built from the Have I Been Pwned password list containing 2,048,908,128 passwords. Updated filter files will be released as new versions when the source data is updated.
What about false positives?
The filter uses 7-bit fingerprints, giving a theoretical false positive rate of 1/128 (~0.78%). This means roughly 1 in 128 passwords not in the breach dataset will be incorrectly reported as breached. There are zero false negatives: every password in the dataset is always detected.
How much RAM does it need?
The filter is loaded entirely into memory. The HIBP dataset requires ~1.8 GB of RAM. Smaller datasets need less: top10m ~9 MB, top1m ~1 MB, RockYou ~13 MB. Memory is released when you call filter.close().
Does it phone home?
No. All password checking is fully local. The only network request the package ever makes is downloading the filter file, and only when you explicitly run npx haveibeenfiltered download or opt in with autoDownload: true. Downloads use HTTPS only and refuse redirects.