You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
compress.deflate and the new HTTP/2 HPACK code (#27353) both implement
canonical Huffman coding. Per discussion on #27353 (with @JalonSolov), the
shared piece should be factored into a common builder — tentatively hash.huffman — that both modules use, rather than duplicating the logic.
This issue tracks the design and the follow-up PR. PR #27353 lands as-is; this
is deliberately separate so the cross-cutting refactor of compress.deflate
(which is stable and used by zlib/gzip) gets its own review.
What is actually shared
The canonical code-assignment step: given an array of per-symbol bit lengths,
produce the per-symbol codes (and a decode structure). Verified that HPACK's
RFC 7541 Appendix B table is canonical — all 257 codes rebuild exactly from the
lengths alone using the same bl_count/next_code algorithm compress.deflate
already uses. So HPACK could ship only the lengths and derive its codes via the
shared builder.
What differs (and must be parameters, not assumptions)
Max code length: DEFLATE caps at 15 bits and decodes via a single flat 2^max_bits table (~32K entries); HPACK codes go up to 30 bits, where a flat 2^30 table is infeasible (needs a tree / multi-level table / bit-at-a-time
decode). So the builder must choose its decode representation based on the
bit-width.
Bit order: DEFLATE is LSB-first (it bit_reverses every code); HPACK is
MSB-first.
Not shared: the actual bit I/O loops, and codec-specific semantics
(HPACK EOS/all-ones padding; DEFLATE end-of-block, extra-bits, distance
alphabet, length-code RLE).
Open design question raised on the PR: whether those params should have
defaults. Current lean is no defaults, with the config marked required / @[noinit] so callers must specify them explicitly. Decide as part of the PR.
Acceptance criteria
A shared canonical-Huffman builder in hash.huffman (or agreed location).
compress.deflate migrated to it with no regression to its flat-table
decode performance (benchmark before/after).
net.http HPACK migrated to build its codes from the lengths array via the
shared builder.
Tests for the shared module plus the existing deflate/HPACK suites stay green.
Summary
compress.deflateand the new HTTP/2 HPACK code (#27353) both implementcanonical Huffman coding. Per discussion on #27353 (with @JalonSolov), the
shared piece should be factored into a common builder — tentatively
hash.huffman— that both modules use, rather than duplicating the logic.This issue tracks the design and the follow-up PR. PR #27353 lands as-is; this
is deliberately separate so the cross-cutting refactor of
compress.deflate(which is stable and used by zlib/gzip) gets its own review.
What is actually shared
The canonical code-assignment step: given an array of per-symbol bit lengths,
produce the per-symbol codes (and a decode structure). Verified that HPACK's
RFC 7541 Appendix B table is canonical — all 257 codes rebuild exactly from the
lengths alone using the same
bl_count/next_codealgorithmcompress.deflatealready uses. So HPACK could ship only the lengths and derive its codes via the
shared builder.
What differs (and must be parameters, not assumptions)
2^max_bitstable (~32K entries); HPACK codes go up to 30 bits, where a flat2^30table is infeasible (needs a tree / multi-level table / bit-at-a-timedecode). So the builder must choose its decode representation based on the
bit-width.
bit_reverses every code); HPACK isMSB-first.
(HPACK EOS/all-ones padding; DEFLATE end-of-block, extra-bits, distance
alphabet, length-code RLE).
Proposed shape (per @JalonSolov)
A
hash.huffmanbuilder parameterized by:Open design question raised on the PR: whether those params should have
defaults. Current lean is no defaults, with the config marked
required/@[noinit]so callers must specify them explicitly. Decide as part of the PR.Acceptance criteria
hash.huffman(or agreed location).compress.deflatemigrated to it with no regression to its flat-tabledecode performance (benchmark before/after).
net.httpHPACK migrated to build its codes from the lengths array via theshared builder.
References
Note
You can use the 👍 reaction to increase the issue's priority for developers.
Please note that only the 👍 reaction to the issue itself counts as a vote.
Other reactions and those to comments will not be taken into account.