Skip to content

validate: review-driven correctness and consistency fixes#3743

Merged
jqnatividad merged 2 commits into
masterfrom
validate-review-202604
Apr 25, 2026
Merged

validate: review-driven correctness and consistency fixes#3743
jqnatividad merged 2 commits into
masterfrom
validate-review-202604

Conversation

@jqnatividad

Copy link
Copy Markdown
Collaborator

Summary

Fixes from a code review of src/cmd/validate.rs:

  • DynEnumValidator::validate — replaced instance.as_str().unwrap() with the Value::String pattern that is_valid already uses. The old code panics if validate is ever entered with a non-string instance.
  • split_invalid_records off-by-one — changed > to >= in the bounds check. The old guard fires one row late, so when the input file has more rows than valid_flags (file grew between passes, or --fail-fast truncation), the next iteration indexes valid_flags[valid_flags_len] and panics.
  • Lite-mode dyn_enum_validator_factory — was silently defaulting to column 0 when the named column wasn't found or when rdr.headers() errored, which built dynamicEnum sets from the wrong column with no warning. Now mirrors the non-lite version and returns a proper validation error.
  • Unnecessary unsafe { unwrap_unchecked } in the Rayon validation loop — both call sites are already gated by std::hint::cold_path(), so the perf savings are zero. Replaced with safe unwrap() plus a one-line comment explaining why it cannot fire (row number is appended via itoa, always valid ASCII).
  • to_json_instance NaN/inf handlingfast_float2::parse accepts "NaN"/"inf", then Number::from_f64 rejects non-finite floats and the old fallback silently coerced to 0. That would let invalid numbers slip past schema constraints like minimum/maximum. Now surfaces a Non-finite Number error.

Three other issues flagged during review were investigated and dismissed — see commit message and PR conversation for details (no real race in UniqueCombinedWithValidator, no Windows file-handle conflict in split_invalid_records, obj.values() ordering is correct because serde_json is built with preserve_order).

Test plan

  • cargo test test_validate -F all_features — 62/62 pass
  • cargo test test_validate -F lite — 62/62 pass (after updating validate_dynenum_with_invalid_column lite assertion to match the new correct error path; the old assertion was asserting the silent-fallback bug)
  • cargo clippy --bin qsv -F all_features -- -D warnings — clean
  • cargo build --bin qsv -F all_features — clean

🤖 Generated with Claude Code

- DynEnumValidator::validate: avoid panic on non-string instance by
  mirroring is_valid's Value::String pattern match.
- split_invalid_records: fix off-by-one bounds check (> → >=) that
  could index valid_flags out of range when the input file has more
  rows than the recorded valid_flags (e.g., grew between passes, or
  fail-fast truncation).
- lite-mode dyn_enum_validator_factory: surface "column not found"
  and header-read errors instead of silently defaulting to column 0
  (mirrors the non-lite version's behavior).
- Drop unnecessary unsafe { unwrap_unchecked } on cold paths in the
  Rayon validation loop; the surrounding cold_path() hint already
  excludes these from any meaningful hot path.
- to_json_instance: surface NaN/inf as a parse error instead of
  silently coercing to 0, which would let invalid numbers slip past
  schema constraints like minimum/maximum.

Updated the validate_dynenum_with_invalid_column lite-mode assertion
to match the new (correct) error path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codacy-production

codacy-production Bot commented Apr 25, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 0 complexity

Metric Results
Complexity 0

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR applies review-driven correctness fixes to the validate command’s JSON Schema validation path, primarily preventing panics and eliminating silent mis-validation.

Changes:

  • Prevent panics and silent fallbacks in dynamicEnum validation (string-only validate, proper “column not found” errors in lite mode).
  • Fix an off-by-one bounds check when splitting valid/invalid outputs to avoid indexing past valid_flags.
  • Improve numeric casting correctness by surfacing non-finite float values (NaN/±inf) as errors instead of coercing to 0.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/cmd/validate.rs Correctness and consistency fixes across dynamicEnum validation, splitting logic, error-path UTF-8 handling, and number parsing.
tests/test_validate.rs Updates lite-mode assertion to match the new (correct) dynamicEnum error behavior instead of the prior silent fallback.

Comment thread src/cmd/validate.rs Outdated
Comment thread src/cmd/validate.rs
Per repo convention; mirrors how other unwrap()/unsafe rationales are
labeled. Also adds the safety comment to the second site that lost it
in the previous commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jqnatividad jqnatividad merged commit a092ef5 into master Apr 25, 2026
16 checks passed
@jqnatividad jqnatividad deleted the validate-review-202604 branch April 25, 2026 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants