haveibeenfiltered

Offline password breach checking using ribbon filters

npm install haveibeenfiltered

What is this?

haveibeenfiltered is an npm package for checking passwords against breach datasets locally. It uses ribbon filters to compress 2 billion SHA-1 hashes from Have I Been Pwned into a 1.79 GB file with a 0.78% false positive rate and zero false negatives.

Install & Usage

Install

npm install haveibeenfiltered

Download the filter data

npx haveibeenfiltered download

Use in your app

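const hbf = require('haveibeenfiltered')

async function main() {
  // Load the filter into memory once (await needs an async function in CommonJS)
  const filter = await hbf.load()

  filter.check('password123')   // true  (breached)
  filter.check('Tr0ub4dor&3')   // false (not found)

  // Check a precomputed SHA-1 hex digest directly
  filter.checkHash('5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8')  // true

  filter.close()  // release memory when done
}

main()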
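checkHash takes the SHA-1 digest of a password as a hex string. The sketch below shows one way to produce such a digest with Node's built-in crypto module. The helper name sha1Hex is hypothetical, and uppercasing the digest is an assumption based on the example above; the casing checkHash accepts is not stated here.

```javascript
const crypto = require('crypto')

// Hypothetical helper: SHA-1 of a UTF-8 password as uppercase hex,
// matching the form of the digest passed to checkHash in the usage example.
function sha1Hex(password) {
  return crypto
    .createHash('sha1')
    .update(password, 'utf8')
    .digest('hex')
    .toUpperCase()
}

const digest = sha1Hex('password')
console.log(digest) // 5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8
```

A digest produced this way could be handed to filter.checkHash(digest), which may be convenient when the plaintext is hashed in a different layer of the application.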

How Ribbon Filters Work

A ribbon filter is a probabilistic data structure for set membership testing, similar to a Bloom filter but roughly 20% more space-efficient. Given a set of keys, it stores compact fingerprints in a system of linear equations that can be solved via Gaussian elimination. Querying evaluates the system for a given key and compares the result to the expected fingerprint.

Query flow

[Diagram] password → SHA-1 → first byte selects shard (256 total) → MurmurHash3 (x64, 128-bit, seeded) → start index, 64-bit coefficient, 7-bit fingerprint → XOR solution[i] where coefficient bit is set → match? yes: found (or check overflow); no: not in set

The filter file is divided into 256 shards based on the first byte of the SHA-1 hash. Within each shard, the MurmurHash3 (x64, 128-bit) of the hex hash is computed using a seed from the file header. This produces three values: a start index into the shard's solution array, a 64-bit coefficient, and a 7-bit fingerprint.

The query XORs the solution entries at positions where the coefficient bits are set, starting from the start index. If the result matches the fingerprint, the key is considered present. A small number of keys that couldn't be inserted during construction ("bumped" keys) are stored in a per-shard overflow table and checked separately.
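The query described above can be sketched in a few lines. This is a toy illustration, not the package's code: the function names, the one-value-per-row solution array, and the hand-built example data are all assumptions. Deriving the start index, coefficient, and fingerprint from MurmurHash3 is omitted, as is the overflow table lookup.

```javascript
// Shard selection: the first byte of the SHA-1 digest (its first two hex
// characters) picks one of 256 shards.
function shardIndex(sha1Hex) {
  return parseInt(sha1Hex.slice(0, 2), 16)
}

// Toy ribbon query. `solution` holds one 7-bit value per row, `start` is the
// first row to consider, `coeff` is a 64-bit BigInt bitmask, and
// `fingerprint` is the expected 7-bit result. All names are illustrative.
function ribbonQuery(solution, start, coeff, fingerprint) {
  let acc = 0
  for (let bit = 0; bit < 64 && start + bit < solution.length; bit++) {
    // XOR in the row only where the corresponding coefficient bit is set.
    if ((coeff >> BigInt(bit)) & 1n) {
      acc ^= solution[start + bit]
    }
  }
  return acc === fingerprint
}

// Hand-built data: rows 2 and 4 XOR to 0b0110101.
const solution = Uint8Array.from([0, 0, 0b0000101, 0, 0b0110000, 0])

console.log(shardIndex('5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8')) // 91 (0x5B)
console.log(ribbonQuery(solution, 2, 0b101n, 0b0110101)) // true
console.log(ribbonQuery(solution, 2, 0b101n, 0b0000001)) // false
```

Because a query touches at most 64 consecutive rows of one shard, a lookup is a short run of memory reads and XORs.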

Benchmarks

Metric                      Value
HIBP passwords              2,048,908,128
Filter size                 1.79 GiB
Fingerprint bits            7
False positive rate         1/128 (~0.78%)
False negatives             0
check(password)             ~14 µs
checkHash(sha1hex)          ~8 µs
Throughput (single core)    ~72k–121k/sec
npm dependencies            0

Comparison with HIBP API

                   haveibeenfiltered                HIBP API
Privacy            Full (nothing leaves machine)    k-anonymity (5-char SHA-1 prefix sent)
Speed              ~14 µs/check                     ~200 ms/request
Offline            Yes                              No
Setup              1.79 GB download                 None
RAM                ~1.8 GB                          0
False positives    ~0.78%                           0% (exact)
Rate limits        None                             Yes
Data freshness     Static snapshot                  Continuously updated

Downloads

Filter binaries are hosted on Cloudflare R2 at https://files.haveibeenfiltered.com/v0.1/.

File Size SHA-256
ribbon-hibp-v1.bin 1.79 GiB 4eeb8608fa8541a51a952ecda91ad2f86e6f7457b0dbe34b88ba8a7ed33750ce
ribbon-hibp-v1-min5.bin 726 MB 4422f5659cb5fe39cf284b844328bfd3f7ab37fac0fe649b4cff216ffd2ac5da
ribbon-hibp-v1-min10.bin 435 MB 8c71d6a3696d27bcf21a30ddcd67f7e290a71210800db86810ffb84a426fe93e
ribbon-hibp-v1-min20.bin 259 MB 31a2c7942698fce74d95ce54dfb61f383ef1a33dce496b88c672e1ac07c71c43
ribbon-rockyou-v1.bin 12.8 MB 777d3c1640e7067bc7fb222488199c3371de5360639561f1f082db6b7c16a447
ribbon-top1m-v1.bin 0.9 MB 44f03ee81d777b42ba96deabde394f8aca8b8ef99134e15121c4e0c3fb3547c1
ribbon-top10m-v1.bin 9.0 MB bdc40e88abf427354d408d67e79a31f7e2987dac0f1130c4d30f396062a9cd96

Integrity is verified via SHA-256 after download. The CLI handles downloading automatically:

npx haveibeenfiltered download
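For a file fetched by other means, the published SHA-256 digest can also be checked by hand. The following is a minimal sketch using only Node builtins. The function name sha256File and the local path in the comment are assumptions, not part of the package. The file is read in 1 MiB chunks so a large filter is never held in memory all at once.

```javascript
const crypto = require('crypto')
const fs = require('fs')

// Hypothetical helper: SHA-256 of a file as lowercase hex, read in chunks.
function sha256File(filePath) {
  const hash = crypto.createHash('sha256')
  const buf = Buffer.allocUnsafe(1024 * 1024)
  const fd = fs.openSync(filePath, 'r')
  try {
    let bytesRead
    // position = null reads from the current file position and advances it.
    while ((bytesRead = fs.readSync(fd, buf, 0, buf.length, null)) > 0) {
      hash.update(buf.subarray(0, bytesRead))
    }
  } finally {
    fs.closeSync(fd)
  }
  return hash.digest('hex')
}

// Example (the local path is hypothetical; the digest is from the table above):
// const expected = '44f03ee81d777b42ba96deabde394f8aca8b8ef99134e15121c4e0c3fb3547c1'
// console.log(sha256File('./ribbon-top1m-v1.bin') === expected)
```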

False Positives & False Negatives

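A ribbon filter is a probabilistic data structure. It has two possible error types: false positives, where a password that is not in the dataset is reported as breached (about 0.78% of such lookups), and false negatives, where a breached password is reported as clean (this never happens).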

In practice this means: if check() returns false, the password is definitely not in the dataset. If it returns true, the password is either in the dataset or is one of the ~0.78% of absent passwords that the filter flags by mistake. For security applications this is the right tradeoff: you never miss a breached password.
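How often a true result turns out to be a false alarm depends on what share of the passwords being checked really are in the dataset. The sketch below applies Bayes' rule with a true positive rate of 1 (no false negatives) and a false positive rate of 1/128. The function name and the two prior values are illustrative assumptions, not measurements.

```javascript
// P(in dataset | check() returned true), given `prior`, the assumed share of
// checked passwords that really are in the dataset.
function probBreachedGivenHit(prior, fpr = 1 / 128) {
  // True positive rate is 1 because the filter has no false negatives.
  return prior / (prior + (1 - prior) * fpr)
}

console.log(probBreachedGivenHit(0.5).toFixed(4))  // 0.9922
console.log(probBreachedGivenHit(0.01).toFixed(4)) // 0.5639
```

When breached passwords are rare among the inputs, a larger share of hits are false alarms, although every such hit still only costs the user a request to pick a different password.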

FAQ

Can I use this in production?

Yes. The package has zero npm dependencies and uses only Node.js builtins (crypto, fs, path, https). The filter is loaded into memory once and all lookups are in-memory array operations. There is no network I/O during checks.
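One way to wire this into an application is to load the filter once at startup and pass it to whatever validates new passwords. The helper below is a hypothetical sketch: validatePassword and its return shape are assumptions, and a stand-in object replaces the real filter so the example runs without the data file.

```javascript
// Hypothetical policy helper. `filter` is any object exposing check(password),
// such as the one returned by hbf.load() in the usage example.
function validatePassword(filter, password) {
  if (filter.check(password)) {
    // May be a false positive (~0.78% of passwords not in the dataset).
    return { ok: false, reason: 'breached' }
  }
  return { ok: true, reason: null }
}

// Stand-in filter used only for this sketch.
const stubFilter = { check: (pw) => pw === 'password123' }

console.log(validatePassword(stubFilter, 'password123')) // { ok: false, reason: 'breached' }
console.log(validatePassword(stubFilter, 'Tr0ub4dor&3')) // { ok: true, reason: null }
```

Passing the filter in as an argument keeps the load step in one place and makes the check easy to test with a stub.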

How often is the data updated?

The filter files are static snapshots. The current HIBP filter was built from the Have I Been Pwned password list containing 2,048,908,128 passwords. Updated filter files will be released as new versions when the source data is updated.

What about false positives?

The filter uses 7-bit fingerprints, giving a theoretical false positive rate of 1/128 (~0.78%). This means roughly 1 in 128 passwords not in the breach dataset will incorrectly report as breached. There are zero false negatives — every breached password is always detected.

How much RAM does it need?

The filter is loaded entirely into memory. The HIBP dataset requires ~1.8 GB of RAM. Smaller datasets need less: top10m ~9 MB, top1m ~1 MB, RockYou ~13 MB. Memory is released when you call filter.close().

Does it phone home?

No. All password checking is fully local. The only network request the package ever makes is downloading the filter file, and only when you explicitly run npx haveibeenfiltered download or opt in with autoDownload: true. Downloads use HTTPS only and refuse redirects.