validate: review-driven correctness and consistency fixes#3743
Merged
Conversation
- DynEnumValidator::validate: avoid panic on non-string instance by
mirroring is_valid's Value::String pattern match.
- split_invalid_records: fix off-by-one bounds check (> → >=) that
could index valid_flags out of range when the input file has more
rows than the recorded valid_flags (e.g., grew between passes, or
fail-fast truncation).
- lite-mode dyn_enum_validator_factory: surface "column not found"
and header-read errors instead of silently defaulting to column 0
(mirrors the non-lite version's behavior).
- Drop unnecessary unsafe { unwrap_unchecked } on cold paths in the
Rayon validation loop; the surrounding cold_path() hint already
excludes these from any meaningful hot path.
- to_json_instance: surface NaN/inf as a parse error instead of
silently coercing to 0, which would let invalid numbers slip past
schema constraints like minimum/maximum.
Updated the validate_dynenum_with_invalid_column lite-mode assertion
to match the new (correct) error path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR applies review-driven correctness fixes to the validate command’s JSON Schema validation path, primarily preventing panics and eliminating silent mis-validation.
Changes:
- Prevent panics and silent fallbacks in
dynamicEnumvalidation (string-only validate, proper “column not found” errors in lite mode). - Fix an off-by-one bounds check when splitting valid/invalid outputs to avoid indexing past
valid_flags. - Improve numeric casting correctness by surfacing non-finite float values (
NaN/±inf) as errors instead of coercing to0.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/cmd/validate.rs |
Correctness and consistency fixes across dynamicEnum validation, splitting logic, error-path UTF-8 handling, and number parsing. |
tests/test_validate.rs |
Updates lite-mode assertion to match the new (correct) dynamicEnum error behavior instead of the prior silent fallback. |
Per repo convention; mirrors how other unwrap()/unsafe rationales are labeled. Also adds the safety comment to the second site that lost it in the previous commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes from a code review of
src/cmd/validate.rs:DynEnumValidator::validate— replacedinstance.as_str().unwrap()with theValue::Stringpattern thatis_validalready uses. The old code panics ifvalidateis ever entered with a non-string instance.split_invalid_recordsoff-by-one — changed>to>=in the bounds check. The old guard fires one row late, so when the input file has more rows thanvalid_flags(file grew between passes, or--fail-fasttruncation), the next iteration indexesvalid_flags[valid_flags_len]and panics.dyn_enum_validator_factory— was silently defaulting to column 0 when the named column wasn't found or whenrdr.headers()errored, which builtdynamicEnumsets from the wrong column with no warning. Now mirrors the non-lite version and returns a proper validation error.unsafe { unwrap_unchecked }in the Rayon validation loop — both call sites are already gated bystd::hint::cold_path(), so the perf savings are zero. Replaced with safeunwrap()plus a one-line comment explaining why it cannot fire (row number is appended viaitoa, always valid ASCII).to_json_instanceNaN/inf handling —fast_float2::parseaccepts"NaN"/"inf", thenNumber::from_f64rejects non-finite floats and the old fallback silently coerced to0. That would let invalid numbers slip past schema constraints likeminimum/maximum. Now surfaces aNon-finite Numbererror.Three other issues flagged during review were investigated and dismissed — see commit message and PR conversation for details (no real race in
UniqueCombinedWithValidator, no Windows file-handle conflict insplit_invalid_records,obj.values()ordering is correct becauseserde_jsonis built withpreserve_order).Test plan
cargo test test_validate -F all_features— 62/62 passcargo test test_validate -F lite— 62/62 pass (after updatingvalidate_dynenum_with_invalid_columnlite assertion to match the new correct error path; the old assertion was asserting the silent-fallback bug)cargo clippy --bin qsv -F all_features -- -D warnings— cleancargo build --bin qsv -F all_features— clean🤖 Generated with Claude Code