Skip to content

slice: review-driven panic and underflow fixes#3748

Merged
jqnatividad merged 6 commits into
masterfrom
slice-codereview-202604
Apr 26, 2026
Merged

slice: review-driven panic and underflow fixes#3748
jqnatividad merged 6 commits into
masterfrom
slice-codereview-202604

Conversation

@jqnatividad

Copy link
Copy Markdown
Collaborator

Summary

  • Fix r.unwrap() panics on CSV parse errors in the JSON / indexed code paths — propagate via ? instead.
  • Fix total_rows - end subtract overflow in the with_index invert path by clamping start/end to total_rows; never call Indexed::seek past EOF.
  • Use saturating_sub (instead of abs_diff) for negative --start / --index so an offset larger than the row count clamps to row 0 rather than positioning past the end of the file (which silently produced empty output).
  • Replace into_string().unwrap() on the canonicalized input path with a typed CliError for non-UTF-8 paths.
  • Restore [] JSON output for empty non-invert slices on both indexed and non-indexed paths so downstream parsers always see a well-formed document.
  • Deduplicate util::count_rows in range(); use unwrap_or(0) so the panic-free invariant survives any future drift in the predicate.
  • Fix off-by-one in two USAGE examples (--end is exclusive).

Adds 11 regression tests covering negative offsets larger than row count, --start at/past EOF with --invert, --end past EOF on both invert and non-invert indexed paths, negative --index clamping, and the empty-slice JSON contract (with and without headers, with and without an index).

Driven by three roborev review rounds (jobs 1680, 1684, 1685) — all closed.

Test plan

  • cargo build --locked --bin qsv -F all_features
  • cargo t slice -F all_features — 141 passed, 0 failed
  • cargo clippy --bin qsv -F all_features — clean

🤖 Generated with Claude Code

jqnatividad and others added 3 commits April 26, 2026 09:53
Fixes panics and incorrect output for several edge cases in `qsv slice`:

- Replace `r.unwrap()` in JSON / indexed code paths with `?` so CSV
  parse errors propagate instead of crashing the process.
- Clamp `start` and `end` to `total_rows` in `with_index` so the invert
  path no longer subtracts `total_rows - end` when end > total_rows
  (panic in debug, capacity wrap in release) and never seeks past EOF.
- Use `saturating_sub` for negative `--start` / `--index` so an offset
  larger than the row count clamps to row 0 rather than positioning
  past the end of the file (which silently produced empty output).
- Replace `into_string().unwrap()` on the canonicalized input path
  with a typed `CliError` for non-UTF-8 paths.
- Add early-return parity for `end == start && !--invert` in `no_index`.
- Fix off-by-one in two USAGE examples (--end is exclusive).

Adds 8 regression tests covering negative offsets larger than row count,
--start at/past EOF with --invert, --end past EOF on both invert and
non-invert indexed paths, and negative --index clamping.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Restore `[]` output for empty non-invert JSON slices: remove the
  no_index early return (both branches handle empty correctly) and
  add an explicit empty-headers/`[]` short-circuit in with_index so
  downstream JSON parsers always see a well-formed document on both
  paths (the prior commit silently dropped JSON output for the
  empty-slice case in no_index).
- Deduplicate util::count_rows in range(): compute the row count
  once when either --start or --index is negative, instead of up to
  twice.
- Add slice_empty_json_no_index and slice_empty_json_with_index
  regression tests to lock in the empty-JSON-slice contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Replace `total.unwrap()` in range() with `unwrap_or(0)` so the
  panic-removal invariant from the prior review survives any future
  drift in the `needs_count` predicate. Non-negative offsets never
  read the count, so 0 is a safe default.
- Add slice_empty_json_no_headers_with_index regression test, locking
  in the empty-array contract for `--start N --end N --json
  --no-headers` on the indexed path. The empty-slice and non-empty
  JSON branches now both pass `byte_headers()` + `flag_no_headers` to
  `util::write_json` with identical shapes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codacy-production

codacy-production Bot commented Apr 26, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 20 complexity

Metric Results
Complexity 20

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the slice command against panic/overflow edge-cases (especially in indexed + JSON paths) and expands regression coverage to lock in the corrected semantics for negative offsets and EOF-clamping.

Changes:

  • Replace panic-prone unwrap()s with error propagation (?) and clamp indexed seek ranges to avoid underflow/EOF seeks.
  • Adjust negative --start/--index handling to clamp to row 0 via saturating_sub, and improve behavior for empty JSON slices.
  • Add regression tests for EOF/negative-offset edge cases and fix USAGE example off-by-one wording.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/cmd/slice.rs Fixes panic/underflow paths, clamps indexed ranges, improves negative-offset handling, and updates CLI examples.
tests/test_slice.rs Adds regression tests covering negative offsets, EOF clamping, invert behavior, and empty-slice JSON output.

Comment thread src/cmd/slice.rs Outdated
Comment thread src/cmd/slice.rs Outdated
jqnatividad and others added 2 commits April 26, 2026 10:20
- Restore streaming JSON output: change util::write_json to accept
  `Iterator<Item = csv::Result<ByteRecord>>` and propagate parse
  errors via `?` inside the loop. Slice's no_index JSON path and
  with_index non-invert JSON path now stream straight through with
  no intermediate Vec, avoiding memory spikes/OOM on large slices.
  The with_index --invert JSON path still buffers because the two
  non-contiguous reads (before start / after end) require seeking
  between them; this matches pre-PR behavior.
- Avoid duplicate util::count_rows on the indexed path: range() now
  takes an Option<usize> precomputed total. with_index() computes
  total_rows once and threads it through, while no_index() keeps
  passing None (lazy fetch only when negative offsets are used).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Document streaming-failure semantics in util::write_json: a CSV parse
  error mid-stream may leave the output truncated (open `[`, prior
  records, trailing comma) since the closing `]` is only written on
  success. Callers needing all-or-nothing must collect/validate first.
- Restore all-or-nothing semantics on with_index --invert JSON: revert
  from `records.extend(...)` to a `push(r?)` loop so a parse error in
  segment 1 aborts before segment 2 is read. write_json gets
  `Vec<ByteRecord>.into_iter().map(Ok)` so it sees only success items
  — partial-output mode is now isolated to the streaming paths.
- Cosmetic cleanup in range(): hoist the `Some(...)` wrapper outside
  the precomputed_total match to make it obvious that the computed
  value is always Some when needs_count is true.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens qsv slice against panics/underflow in several edge cases (especially on indexed + JSON paths), while preserving expected slice semantics (clamping, EOF handling) and strengthening regression coverage.

Changes:

  • Refactors JSON writing to accept Iterator<Item = csv::Result<ByteRecord>>, propagating CSV parse errors instead of unwrap() panics and enabling streaming in key paths.
  • Fixes indexed slicing edge cases by clamping start/end to total_rows, avoiding underflow and preventing Indexed::seek past EOF; restores [] output for empty JSON slices.
  • Adds focused regression tests for negative offsets, EOF boundary conditions, invert behavior, and empty-slice JSON contract (indexed and non-indexed).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
tests/test_slice.rs Adds regression tests covering negative offsets, EOF boundaries, invert behavior, and empty JSON slice output guarantees.
src/util.rs Updates write_json to stream csv::Result<ByteRecord> records and propagate parse errors; documents streaming failure semantics.
src/cmd/slice.rs Removes unwrap() on CSV parse results, clamps indexed ranges to avoid underflow/EOF seeks, improves non-UTF8 path handling, and fixes USAGE examples.

Comment thread src/cmd/slice.rs Outdated
Comment thread src/util.rs Outdated
- Use `indexed_file.count()` instead of `util::count_rows()` in
  with_index(): the indexed_file already has the index loaded, so
  reading the row count from it directly avoids reopening the
  CSV/index.
- Fix doc comment in util::write_json: drop the incorrect "trailing
  comma" mention. write_json writes commas *before* each record
  except the first, so on error there's no trailing comma to leave
  behind — only the opening `[` and any prior records.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jqnatividad jqnatividad merged commit c807f7e into master Apr 26, 2026
17 checks passed
@jqnatividad jqnatividad deleted the slice-codereview-202604 branch April 26, 2026 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants