slice: review-driven panic and underflow fixes#3748
Conversation
Fixes panics and incorrect output for several edge cases in `qsv slice`: - Replace `r.unwrap()` in JSON / indexed code paths with `?` so CSV parse errors propagate instead of crashing the process. - Clamp `start` and `end` to `total_rows` in `with_index` so the invert path no longer subtracts `total_rows - end` when end > total_rows (panic in debug, capacity wrap in release) and never seeks past EOF. - Use `saturating_sub` for negative `--start` / `--index` so an offset larger than the row count clamps to row 0 rather than positioning past the end of the file (which silently produced empty output). - Replace `into_string().unwrap()` on the canonicalized input path with a typed `CliError` for non-UTF-8 paths. - Add early-return parity for `end == start && !--invert` in `no_index`. - Fix off-by-one in two USAGE examples (--end is exclusive). Adds 8 regression tests covering negative offsets larger than row count, --start at/past EOF with --invert, --end past EOF on both invert and non-invert indexed paths, and negative --index clamping. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Restore `[]` output for empty non-invert JSON slices: remove the no_index early return (both branches handle empty correctly) and add an explicit empty-headers/`[]` short-circuit in with_index so downstream JSON parsers always see a well-formed document on both paths (the prior commit silently dropped JSON output for the empty-slice case in no_index). - Deduplicate util::count_rows in range(): compute the row count once when either --start or --index is negative, instead of up to twice. - Add slice_empty_json_no_index and slice_empty_json_with_index regression tests to lock in the empty-JSON-slice contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Replace `total.unwrap()` in range() with `unwrap_or(0)` so the panic-removal invariant from the prior review survives any future drift in the `needs_count` predicate. Non-negative offsets never read the count, so 0 is a safe default. - Add slice_empty_json_no_headers_with_index regression test, locking in the empty-array contract for `--start N --end N --json --no-headers` on the indexed path. The empty-slice and non-empty JSON branches now both pass `byte_headers()` + `flag_no_headers` to `util::write_json` with identical shapes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 20 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Pull request overview
This PR hardens the slice command against panic/overflow edge-cases (especially in indexed + JSON paths) and expands regression coverage to lock in the corrected semantics for negative offsets and EOF-clamping.
Changes:
- Replace panic-prone
unwrap()s with error propagation (?) and clamp indexed seek ranges to avoid underflow/EOF seeks. - Adjust negative
--start/--indexhandling to clamp to row 0 viasaturating_sub, and improve behavior for empty JSON slices. - Add regression tests for EOF/negative-offset edge cases and fix
USAGEexample off-by-one wording.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/cmd/slice.rs |
Fixes panic/underflow paths, clamps indexed ranges, improves negative-offset handling, and updates CLI examples. |
tests/test_slice.rs |
Adds regression tests covering negative offsets, EOF clamping, invert behavior, and empty-slice JSON output. |
- Restore streaming JSON output: change util::write_json to accept `Iterator<Item = csv::Result<ByteRecord>>` and propagate parse errors via `?` inside the loop. Slice's no_index JSON path and with_index non-invert JSON path now stream straight through with no intermediate Vec, avoiding memory spikes/OOM on large slices. The with_index --invert JSON path still buffers because the two non-contiguous reads (before start / after end) require seeking between them; this matches pre-PR behavior. - Avoid duplicate util::count_rows on the indexed path: range() now takes an Option<usize> precomputed total. with_index() computes total_rows once and threads it through, while no_index() keeps passing None (lazy fetch only when negative offsets are used). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Document streaming-failure semantics in util::write_json: a CSV parse error mid-stream may leave the output truncated (open `[`, prior records, trailing comma) since the closing `]` is only written on success. Callers needing all-or-nothing must collect/validate first. - Restore all-or-nothing semantics on with_index --invert JSON: revert from `records.extend(...)` to a `push(r?)` loop so a parse error in segment 1 aborts before segment 2 is read. write_json gets `Vec<ByteRecord>.into_iter().map(Ok)` so it sees only success items — partial-output mode is now isolated to the streaming paths. - Cosmetic cleanup in range(): hoist the `Some(...)` wrapper outside the precomputed_total match to make it obvious that the computed value is always Some when needs_count is true. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens qsv slice against panics/underflow in several edge cases (especially on indexed + JSON paths), while preserving expected slice semantics (clamping, EOF handling) and strengthening regression coverage.
Changes:
- Refactors JSON writing to accept
Iterator<Item = csv::Result<ByteRecord>>, propagating CSV parse errors instead ofunwrap()panics and enabling streaming in key paths. - Fixes indexed slicing edge cases by clamping
start/endtototal_rows, avoiding underflow and preventingIndexed::seekpast EOF; restores[]output for empty JSON slices. - Adds focused regression tests for negative offsets, EOF boundary conditions, invert behavior, and empty-slice JSON contract (indexed and non-indexed).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
tests/test_slice.rs |
Adds regression tests covering negative offsets, EOF boundaries, invert behavior, and empty JSON slice output guarantees. |
src/util.rs |
Updates write_json to stream csv::Result<ByteRecord> records and propagate parse errors; documents streaming failure semantics. |
src/cmd/slice.rs |
Removes unwrap() on CSV parse results, clamps indexed ranges to avoid underflow/EOF seeks, improves non-UTF8 path handling, and fixes USAGE examples. |
- Use `indexed_file.count()` instead of `util::count_rows()` in with_index(): the indexed_file already has the index loaded, so reading the row count from it directly avoids reopening the CSV/index. - Fix doc comment in util::write_json: drop the incorrect "trailing comma" mention. write_json writes commas *before* each record except the first, so on error there's no trailing comma to leave behind — only the opening `[` and any prior records. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
r.unwrap()panics on CSV parse errors in the JSON / indexed code paths — propagate via?instead.total_rows - endsubtract overflow in thewith_indexinvert path by clampingstart/endtototal_rows; never callIndexed::seekpast EOF.saturating_sub(instead ofabs_diff) for negative--start/--indexso an offset larger than the row count clamps to row 0 rather than positioning past the end of the file (which silently produced empty output).into_string().unwrap()on the canonicalized input path with a typedCliErrorfor non-UTF-8 paths.[]JSON output for empty non-invert slices on both indexed and non-indexed paths so downstream parsers always see a well-formed document.util::count_rowsinrange(); useunwrap_or(0)so the panic-free invariant survives any future drift in the predicate.--endis exclusive).Adds 11 regression tests covering negative offsets larger than row count,
--startat/past EOF with--invert,--endpast EOF on both invert and non-invert indexed paths, negative--indexclamping, and the empty-slice JSON contract (with and without headers, with and without an index).Driven by three roborev review rounds (jobs 1680, 1684, 1685) — all closed.
Test plan
cargo build --locked --bin qsv -F all_featurescargo t slice -F all_features— 141 passed, 0 failedcargo clippy --bin qsv -F all_features— clean🤖 Generated with Claude Code