Skip to content

fetch/fetchpost: review-driven cache, panic and safety fixes#3747

Merged
jqnatividad merged 4 commits into
masterfrom
fetch-fetchpost-review-202604
Apr 26, 2026
Merged

fetch/fetchpost: review-driven cache, panic and safety fixes#3747
jqnatividad merged 4 commits into
masterfrom
fetch-fetchpost-review-202604

Conversation

@jqnatividad

Copy link
Copy Markdown
Collaborator

Summary

Review-driven correctness, panic-safety and clarity fixes for fetch and fetchpost. Refined across four commits in response to two roborev passes.

Critical bugs fixed

  • Disk-cache cache_remove key mismatch. cache_remove(&url) looked up bare URLs against composite keys built by the io_cached convert macro, so non-200 responses were never evicted from disk when --cache-error was unset. Fixed in fetch.rs (Disk path) and fetchpost.rs (InMemory + Disk + Redis paths — fetchpost had the bug on all three cache types, with the in-memory key actually being format!(\"{:?}\", form_body_jsonmap)).
  • Selection panic on empty selector. Reordered the sel.len() != 1 check before sel.iter().next().unwrap() so an empty URL-column selection returns a friendly error instead of panicking.
  • HeaderValue::to_str().unwrap() on adversarial headers. Replaced with .unwrap_or(\"\") in retry-after and ratelimit-* parsing so non-ASCII header bytes can't crash the process.

Other fixes

  • Bumped the post-error 10ns progress-bar drain sleep to 100 ms (covers one 5 Hz redraw cycle).
  • Replaced two to_lowercase() allocations on --report with a match on the first byte (ASCII-only by design — documented values are detailed / short / none).
  • Converted expand_tilde(dir).unwrap() to a graceful fail_clierror! for systems with no resolvable home directory.
  • Added // safety: comments on the remaining infallible unwraps (OnceLock::set, NonZeroU32::new on constants, infallible String write!, infallible simd_json::to_string of FetchResponse) per qsv conventions.

Refactor (in response to roborev 1680)

  • Extracted cross_session_cache_key() helper in both files. The convert macros and the cache_remove call sites now both go through the helper, eliminating the format-string duplication that introduced the original eviction bug. The fetchpost in-memory cache retains its body-only key but now has an accurate NOTE flagging the URL-collision edge case (widening was deferred as a behavior change).

Documentation (in response to roborev 1682)

  • Added a JSON RESPONSE HANDLING section to both USAGE blocks. Without --jaq, every successful response is parsed by serde_json and re-serialized: object key order is preserved (qsv enables preserve_order), but all insignificant whitespace is removed, number representations are canonicalized (e.g. 1e2 -> 100), duplicate keys are collapsed (last-value-wins), and non-JSON responses become empty cells (or the parse error if --store-error is set). Help-md regenerated.

Test updated

  • tests/test_fetch.rs::fetch_simple_diskcache was asserting cache_hit=1 for the 404 and "Invalid URL" rows on the second pass — i.e. it was encoding the disk-cache bug. Updated to expect cache_hit=0, with a comment explaining that non-200 responses are correctly evicted when --cache-error is unset.

Test plan

  • cargo build --locked --bin qsv -F all_features — clean
  • cargo +nightly fmt -- --check — clean
  • cargo +nightly clippy --bin qsv -F all_features -- -W clippy::perf — no fetch-related warnings
  • cargo test --locked -F all_features fetch — 41 passed, 1 ignored, 0 failed
  • Two roborev passes on the diff, both findings addressed and closed (1680, 1682)

Deferred (not in this PR)

  • #![allow(unused_assignments)] suppression — legitimate pre-allocation pattern; removing it forces Option<...> wrapping that fights the match-arm structure
  • Unconditional OnceLock::set at startup — cosmetic; only DEFAULT_REDIS_CONN_STRING is genuinely conditional, ~3 lines of churn
  • Loop-hoisted intermediate_value / intermediate_redis_value — the "amortization" comment is wrong (Strings are dropped on each reassignment) but moving the declarations is pure shuffling

🤖 Generated with Claude Code

jqnatividad and others added 4 commits April 26, 2026 08:32
- fix disk-cache cache_remove using bare URL instead of the composite key
  built by the io_cached convert macro, so non-200 responses were never
  evicted when --cache-error was unset (fetch.rs Disk path; fetchpost.rs
  InMemory/Disk/Redis paths — all three were broken)
- reorder the sel.len() != 1 check before sel.iter().next().unwrap() so an
  empty URL-column selection returns a friendly error instead of panicking
- replace HeaderValue::to_str().unwrap() with .unwrap_or("") in retry-after
  and ratelimit-* parsing so non-ASCII header bytes can't crash the process
- bump the post-error 10ns progress-bar drain sleep to 100ms (one 5Hz cycle)
- use match on first byte of --report instead of two to_lowercase() allocs
- convert expand_tilde(dir).unwrap() to a graceful fail_clierror! and add
  // safety: comments to the remaining infallible unwraps per qsv style
- update test_fetch::fetch_simple_diskcache: it was asserting cache_hit=1
  for error rows on the second pass, encoding the disk-cache bug — now
  asserts cache_hit=0, matching the corrected eviction behavior

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- extract cross_session_cache_key() helper in both files so the disk/redis
  cache-key format string lives in one place; the convert macros and the
  cache_remove call sites now both go through it, preventing the kind of
  drift that introduced the original eviction bug
- replace the misleading "we only need url in the cache key" comment above
  fetchpost::get_cached_response with an accurate NOTE that the in-memory
  cache is keyed on form_body_jsonmap only and flags the URL-collision
  edge case (pre-existing; widening deferred as a behavior change)
- add ASCII-only-by-design comment on the --report first-byte match in
  both files so future readers don't assume Unicode case folding

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address review finding #4 by adding a JSON RESPONSE HANDLING section to the
USAGE block of both commands. Without --jaq, every successful response is
parsed by serde_json and re-serialized, which silently normalizes object key
order, integer/float representation (e.g. 1e2 -> 100), and whitespace. Users
who need byte-exact server output now have a documented escape hatch
(post-process or use --jaq).

Help-md regenerated via `qsv --generate-help-md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address roborev 1682 findings:

- Medium: replace the incorrect "JSON object key order is sorted" claim
  with "object key order is preserved" — qsv enables the preserve_order
  feature on serde_json (Cargo.toml:272), so round-tripping through
  serde_json::Value preserves insertion order
- Low: expand "trailing whitespace is stripped" to "all insignificant
  whitespace is removed (compact) or re-indented (--pretty)" — the prior
  wording understated CompactFormatter's behavior
- Low: add the two normalizations users actually hit on this path —
  duplicate JSON object keys are collapsed (last value wins), and
  responses that are not valid JSON are written as an empty cell (or
  the parse error if --store-error is set)

Help-md regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codacy-production

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 0 complexity

Metric Results
Complexity 0

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves correctness and panic-safety in the fetch and fetchpost commands, primarily around cache eviction behavior and robustness when handling user input and HTTP headers.

Changes:

  • Fixes persistent cache eviction by aligning cache_remove keys with the io_cached convert key format via shared cross_session_cache_key() helpers.
  • Prevents panics from empty URL-column selections and from non-ASCII HTTP header values during rate-limit/retry parsing.
  • Updates CLI help/docs to clarify how JSON responses are normalized when --jaq is not used, and updates the disk-cache test expectation accordingly.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/cmd/fetch.rs Aligns disk/redis cache eviction keys, removes panic paths (selection + header parsing), and adds documentation and safety comments.
src/cmd/fetchpost.rs Same cache-key alignment across in-memory/disk/redis eviction paths, panic-safety fixes, and additional documentation/safety comments.
tests/test_fetch.rs Updates disk-cache report assertions to reflect correct eviction behavior for non-200 responses when --cache-error is unset.
docs/help/fetch.md Regenerated help content documenting JSON response normalization behavior.
docs/help/fetchpost.md Regenerated help content documenting JSON response normalization behavior.

@jqnatividad jqnatividad merged commit 282ff4b into master Apr 26, 2026
19 of 21 checks passed
@jqnatividad jqnatividad deleted the fetch-fetchpost-review-202604 branch April 26, 2026 13:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants