A fast, compact, immutable map from strings to uint64 values in Go. It uses the binary fuse filter construction to store key-value pairs in a compact array where lookup requires only one hash computation, three array accesses, and two XOR operations.
The data structure is ideal when you have a known set of string keys at construction time and need fast, memory-efficient lookups afterward.
This implementation is based on the binary fuse filter algorithm described in:
Thomas Mueller Graf and Daniel Lemire, Binary Fuse Filters: Fast and Smaller Than Xor Filters, ACM Journal of Experimental Algorithmics, Volume 27, 2022. DOI: 10.1145/3510449
See also the earlier xor filter paper:
Thomas Mueller Graf and Daniel Lemire, Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters, ACM Journal of Experimental Algorithmics, Volume 25, 2020. DOI: 10.1145/3376122
```
go get github.com/lemire/constmap
```
```go
package main

import (
	"fmt"
	"log"

	"github.com/lemire/constmap"
)

func main() {
	keys := []string{"apple", "banana", "cherry"}
	values := []uint64{100, 200, 300}
	cm, err := constmap.New(keys, values)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(cm.Map("banana")) // 200
}
```

The keys and values slices must have equal length, and keys must be unique. After construction, the ConstMap is immutable. Looking up a key that was not in the original set returns an arbitrary, undefined value.
If you need to detect missing keys, use VerifiedConstMap. It stores an additional fingerprint per key and returns the sentinel NotFound for keys not in the original set:
```go
vm, err := constmap.NewVerified(keys, values)
if err != nil {
	log.Fatal(err)
}
fmt.Println(vm.Map("banana")) // 200
fmt.Println(vm.Map("grape"))  // constmap.NotFound (0xFFFFFFFFFFFFFFFF)
```

This doubles memory usage (~18 bytes/key instead of ~9), but lookup remains fast.
A ConstMap can be serialized to disk and loaded back later, avoiding the cost of reconstruction. The binary format includes a FNV-1a checksum to detect corruption.
```go
// Save to file.
err := cm.SaveToFile("mymap.cmap")

// Load from file.
cm, err := constmap.LoadFromFile("mymap.cmap")
```

For streaming use, WriteTo and ReadFrom work with any io.Writer / io.Reader:
```go
// Write to any io.Writer.
n, err := cm.WriteTo(w)

// Read from any io.Reader.
var cm constmap.ConstMap
n, err := cm.ReadFrom(r)
```

To run the tests:

```
go test -v
```
Construction takes longer than for Go's built-in map (as expected for any compact static data structure), but lookups are optimized for speed. I ran benchmarks on my Apple M4 Max processor to compare constmap lookups against Go's built-in map[string]uint64, using 1 million keys.
| Data Structure | Lookup Time | Memory Usage |
|---|---|---|
| ConstMap | 7.6 ns/op | 9 bytes/key |
| VerifiedConstMap | 13 ns/op | 18 bytes/key |
| Go Map | 23 ns/op | 56 bytes/key |
The benchmark suite compares ConstMap against Go's built-in map[string]uint64 using 1,000,000 keys. To run:
```
go test -bench=. -benchmem
```
There are three benchmarks:
- BenchmarkConstMap -- lookup throughput for ConstMap.Map()
- BenchmarkVerifiedConstMap -- lookup throughput for VerifiedConstMap.Map()
- BenchmarkGoMap -- lookup throughput for Go's built-in map
For stable, reproducible results:
```
go test -bench=. -benchmem -count=5 -benchtime=3s
```
The -count=5 flag runs each benchmark five times so you can assess variance; the -benchtime=3s flag runs each benchmark for at least three seconds, giving the timings more time to stabilize.
Run the memory comparison test to see the retained memory of each data structure with 1,000,000 keys:
```
go test -run TestMemoryUsage -v
```
The ConstMap stores approximately 1.125 x n x 8 bytes (roughly 9 bytes per key), the VerifiedConstMap uses twice that (~18 bytes/key), while Go's map[string]uint64 typically uses around 50-60 bytes per key for keys of this size.
Given n key-value pairs, the algorithm:
- Hashes each key (using xxhash) and maps it to three positions in an array of size ~1.125n using overlapping segments.
- Uses a peeling process to find an ordering where each key can be assigned to one of its three positions uniquely.
- Walks the ordering in reverse, setting each array cell so that array[h0] XOR array[h1] XOR array[h2] == value for every key.
Lookup computes the same three positions and XORs the three array cells to recover the value. This gives O(1) lookup with minimal memory overhead.
Compared to xor filters, which divide the array into three equal blocks (~1.23n overhead), binary fuse filters use overlapping segments, giving better locality, lower space overhead (~1.125n), and faster construction.
Apache License 2.0. See LICENSE for details.