What is this?

haveibeenfiltered is an npm package for checking passwords against breach datasets locally. It uses ribbon filters to compress about 2.05 billion SHA-1 hashes from Have I Been Pwned into a 1.79 GB file with a 0.78% false positive rate and zero false negatives.

Install & Usage

Install

npm install haveibeenfiltered

Download the filter data

npx haveibeenfiltered download

Use in your app

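const hbf = require('haveibeenfiltered')

async function main() {
  // CommonJS has no top-level await, so the awaited load() call goes inside an async function
  const filter = await hbf.load()

  filter.check('password123')   // true: found in the breach data
  filter.check('Tr0ub4dor&3')   // false: not found

  // Uppercase hex SHA-1 of "password"
  filter.checkHash('5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8')  // true

  filter.close()  // release memory when done
}

main().catch(console.error)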
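
The hash passed to checkHash() above is the SHA-1 of the string "password". The sketch below shows how such a hash can be produced with Node.js builtins. It assumes checkHash() expects the uppercase hex SHA-1 of the UTF-8 password, as the example suggests; sha1Hex is a hypothetical helper, not part of the package.

```javascript
const crypto = require('crypto');

// Hypothetical helper: uppercase hex SHA-1 of a UTF-8 string.
function sha1Hex(password) {
  return crypto
    .createHash('sha1')
    .update(password, 'utf8')
    .digest('hex')
    .toUpperCase();
}

console.log(sha1Hex('password'));
// 5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8
```

Hashing on the caller's side can be useful when the plaintext password should not be passed around, for example when only the SHA-1 is available from another system.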

How Ribbon Filters Work

A ribbon filter is a probabilistic data structure for set membership testing, similar to a Bloom filter but roughly 20% more space-efficient. Construction sets up one linear equation per key over GF(2) (XOR arithmetic) and solves the system via Gaussian elimination, producing a compact solution array. A query XORs the solution rows selected by the key's hash and compares the result to the key's expected fingerprint.

Query flow

The filter file is divided into 256 shards based on the first byte of the SHA-1 hash. Within each shard, the MurmurHash3 (x64, 128-bit) of the hex hash is computed using a seed from the file header. This produces three values:

start — index into the solution array
coefficient — a 64-bit bitmask selecting which solution rows to XOR (ribbon width = 64)
fingerprint — a 7-bit value to compare against

The query XORs the solution entries at positions where the coefficient bits are set, starting from the start index. If the result matches the fingerprint, the key is considered present. A small number of keys that couldn't be inserted during construction ("bumped" keys) are stored in a per-shard overflow table and checked separately.
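
The XOR step described above can be sketched as follows. This is a toy illustration under stated assumptions, not the package's code: the names shardIndex and ribbonQuery, the one-byte-per-row layout, and the bit order (bit i of the coefficient selects row start + i) are all hypothetical. Deriving start, coefficient, and fingerprint from MurmurHash3, and the overflow table for bumped keys, are left out.

```javascript
// Toy sketch of a ribbon filter query (hypothetical layout, not the real file format).

// Shard = first byte of the SHA-1, i.e. the first two hex characters (0..255).
function shardIndex(sha1Hex) {
  return parseInt(sha1Hex.slice(0, 2), 16);
}

// solution:    Uint8Array with one 7-bit row per entry (assumed layout)
// start:       index of the first row in the 64-row ribbon window
// coefficient: 64-bit BigInt mask; bit i selects row start + i (assumed bit order)
// fingerprint: expected 7-bit value for the key
function ribbonQuery(solution, start, coefficient, fingerprint) {
  let acc = 0;
  for (let i = 0; i < 64; i++) {
    if ((coefficient >> BigInt(i)) & 1n) {
      acc ^= solution[start + i];
    }
  }
  return (acc & 0x7f) === fingerprint;
}

// Hand-built example: mask 0b101 selects rows 10 and 12.
const solution = new Uint8Array(128);
solution[10] = 0x55;
solution[12] = 0x0f;

console.log(shardIndex('5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8')); // 91
console.log(ribbonQuery(solution, 10, 0b101n, 0x55 ^ 0x0f));          // true
console.log(ribbonQuery(solution, 10, 0b101n, 0x00));                 // false
```

A key that was never inserted produces an effectively random 7-bit result, which is why it matches the expected fingerprint about once in 128 queries.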

Benchmarks

Metric	Value
HIBP passwords	2,048,908,128
Filter size	1.79 GiB
Fingerprint bits	7
False positive rate	1/128 (~0.78%)
False negatives	0
`check(password)`	~14 µs
`checkHash(sha1hex)`	~8 µs
Throughput (single core)	~72k–121k/sec
npm dependencies	0

Comparison with HIBP API

	haveibeenfiltered	HIBP API
Privacy	Full — nothing leaves machine	k-anonymity (5-char SHA-1 prefix sent)
Speed	~14 µs/check	~200 ms/request
Offline	Yes	No
Setup	1.79 GB download	None
RAM	~1.8 GB	0
False positives	~0.78%	0% (exact)
Rate limits	None	Yes
Data freshness	Static snapshot	Continuously updated

Downloads

Filter binaries are hosted on Cloudflare R2 at https://files.haveibeenfiltered.com/v0.1/.

File	Size	SHA-256
`ribbon-hibp-v1.bin`	1.79 GiB	4eeb8608fa8541a51a952ecda91ad2f86e6f7457b0dbe34b88ba8a7ed33750ce
`ribbon-hibp-v1-min5.bin`	726 MB	4422f5659cb5fe39cf284b844328bfd3f7ab37fac0fe649b4cff216ffd2ac5da
`ribbon-hibp-v1-min10.bin`	435 MB	8c71d6a3696d27bcf21a30ddcd67f7e290a71210800db86810ffb84a426fe93e
`ribbon-hibp-v1-min20.bin`	259 MB	31a2c7942698fce74d95ce54dfb61f383ef1a33dce496b88c672e1ac07c71c43
`ribbon-rockyou-v1.bin`	12.8 MB	777d3c1640e7067bc7fb222488199c3371de5360639561f1f082db6b7c16a447
`ribbon-top1m-v1.bin`	0.9 MB	44f03ee81d777b42ba96deabde394f8aca8b8ef99134e15121c4e0c3fb3547c1
`ribbon-top10m-v1.bin`	9.0 MB	bdc40e88abf427354d408d67e79a31f7e2987dac0f1130c4d30f396062a9cd96

Integrity is verified via SHA-256 after download. The CLI handles downloading automatically:

npx haveibeenfiltered download
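
For a file fetched manually rather than through the CLI, the digest can be recomputed locally and compared with the table above. The following is a minimal sketch using only Node.js builtins; sha256File is a hypothetical helper, not part of the package, and the file path in the usage comment is a placeholder.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical helper: hex SHA-256 of a file, read in 1 MiB chunks so that
// a multi-gigabyte filter file is never held in memory all at once.
function sha256File(filePath) {
  const hash = crypto.createHash('sha256');
  const chunk = Buffer.alloc(1 << 20);
  const fd = fs.openSync(filePath, 'r');
  try {
    let bytesRead;
    // position = null reads from the current file position and advances it
    while ((bytesRead = fs.readSync(fd, chunk, 0, chunk.length, null)) > 0) {
      hash.update(chunk.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest('hex');
}

// Usage (placeholder path):
// sha256File('./ribbon-top1m-v1.bin') ===
//   '44f03ee81d777b42ba96deabde394f8aca8b8ef99134e15121c4e0c3fb3547c1'
```

The synchronous read loop keeps the sketch short; a stream-based version would avoid blocking the event loop while hashing the largest files.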

False Positives & False Negatives

A ribbon filter is a probabilistic data structure. It has two possible error types:

False positive (FP): A safe password is incorrectly reported as breached. This can happen because the filter stores compressed fingerprints, not exact hashes. The filter uses 7-bit fingerprints, giving a rate of 1/128 (~0.78%).
False negative (FN): A breached password is missed and reported as safe. This cannot happen. If a password is in the dataset, the filter will always detect it.

In practice this means: if check() returns false, the password is definitely not in the dataset. If it returns true, the password is either in the dataset or one of the roughly 1 in 128 non-breached passwords that match by chance. For security applications this is the right tradeoff: you never miss a breached password, at the cost of occasionally rejecting a safe one.
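
To make the numbers concrete, here is a small arithmetic sketch. The 1/128 rate applies to passwords that are not in the dataset, so the share of true results that are false alarms also depends on how many of the checked passwords really are breached. The function names and the baseRate input are illustrative assumptions, not part of the package.

```javascript
// 7-bit fingerprints: a non-breached password matches by chance with probability 2^-7.
const FP_RATE = 1 / 2 ** 7; // 0.0078125

// Expected number of non-breached passwords flagged by mistake.
function expectedFalsePositives(nonBreachedCount) {
  return nonBreachedCount * FP_RATE;
}

// Share of `true` results that are false alarms, given the fraction of checked
// passwords that really are in the dataset (baseRate, a hypothetical input).
function falseAlarmShare(baseRate) {
  const truePositives = baseRate;                  // no false negatives
  const falsePositives = (1 - baseRate) * FP_RATE; // chance matches
  return falsePositives / (truePositives + falsePositives);
}

console.log(expectedFalsePositives(12800)); // 100
console.log(falseAlarmShare(0.5));          // about 0.00775 (1/129)
console.log(falseAlarmShare(0.01));         // about 0.436
```

When few of the checked passwords are actually breached, a noticeable share of rejections will be false alarms, so a rejection message is best worded as "choose a different password" rather than as a definite statement that the password was breached.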

FAQ

Can I use this in production?

Yes. The package has zero npm dependencies and uses only Node.js builtins (crypto, fs, path, https). The filter is loaded into memory once and all lookups are in-memory array operations. There is no network I/O during checks.

How often is the data updated?

The filter files are static snapshots. The current HIBP filter was built from the Have I Been Pwned password list containing 2,048,908,128 passwords. Updated filter files will be released as new versions when the source data is updated.

What about false positives?

The filter uses 7-bit fingerprints, giving a theoretical false positive rate of 1/128 (~0.78%). This means roughly 1 in 128 passwords that are not in the breach dataset will be incorrectly reported as breached. There are zero false negatives: every password in the dataset is always detected.

How much RAM does it need?

The filter is loaded entirely into memory. The HIBP dataset requires ~1.8 GB of RAM. Smaller datasets need less: top10m ~9 MB, top1m ~1 MB, RockYou ~13 MB. Memory is released when you call filter.close().

Does it phone home?

No. All password checking is fully local. The only network request the package ever makes is downloading the filter file, and only when you explicitly run npx haveibeenfiltered download or opt in with autoDownload: true. Downloads use HTTPS only and refuse redirects.