Skip to content

build(deps): bump jsonschema from 0.18.2 to 0.18.3#2134

Merged
jqnatividad merged 1 commit into
masterfrom
dependabot/cargo/jsonschema-0.18.3
Sep 12, 2024
Merged

build(deps): bump jsonschema from 0.18.2 to 0.18.3#2134
jqnatividad merged 1 commit into
masterfrom
dependabot/cargo/jsonschema-0.18.3

Conversation

@dependabot

@dependabot dependabot Bot commented on behalf of github Sep 12, 2024

Copy link
Copy Markdown
Contributor

Bumps jsonschema from 0.18.2 to 0.18.3.

Release notes

Sourced from jsonschema's releases.

[Python] Release 0.18.3

Fixed

  • Changing base URI when $ref is present in drafts 7 and earlier.

[Rust] Release 0.18.3

Fixed

  • Changing base URI when $ref is present in drafts 7 and earlier.
  • Removed dbg! macro.
Changelog

Sourced from jsonschema's changelog.

[0.18.3] - 2024-09-12

Fixed

  • Changing base URI when $ref is present in drafts 7 and earlier.
  • Removed dbg! macro.
Commits
  • 11ebbaa chore(rust): Release 0.18.3
  • 0d359cc fix: Remove dbg! macro
  • 8cb04dd fix: Changing base URI when $ref is present in drafts 7 and earlier
  • ce678a1 build(deps): bump actions/setup-python from 4 to 5
  • 432e434 build(deps): update boon requirement from 0.5 to 0.6 in /jsonschema
  • cb66c44 build(deps): bump actions/checkout from 3 to 4
  • 3acee86 build: Create dependabot.yml
  • b843018 build: update codspeed
  • cbb86d0 chore(python): Release 0.18.2
  • 6b0a211 fix(build): Python release workflow
  • See full diff in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [jsonschema](https://github.com/Stranger6667/jsonschema-rs) from 0.18.2 to 0.18.3.
- [Release notes](https://github.com/Stranger6667/jsonschema-rs/releases)
- [Changelog](https://github.com/Stranger6667/jsonschema-rs/blob/master/CHANGELOG.md)
- [Commits](Stranger6667/jsonschema@rust-v0.18.2...rust-v0.18.3)

---
updated-dependencies:
- dependency-name: jsonschema
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot added dependencies Pull requests that update a dependency file rust Pull requests that update Rust code labels Sep 12, 2024
@jqnatividad jqnatividad merged commit 68e1a15 into master Sep 12, 2024
@dependabot dependabot Bot deleted the dependabot/cargo/jsonschema-0.18.3 branch September 12, 2024 13:14
jqnatividad added a commit that referenced this pull request May 15, 2026
…tions

5 findings on the initial commit (4d706e1) — 1 HIGH already addressed in
899de97 (index_files() Ok(None) check), 4 still open:

* MEDIUM (generator.rs `parse_f64`): `parse_f64("NaN")` returned
  `Some(f64::NAN)`, which slipped past both the `hi < lo` short-circuit and
  the quartile sanity check, and would then panic in `rng.random_range`.
  Added `.filter(|v| v.is_finite())` so non-finite endpoints fail at parse
  time. Same path now rejects `±Infinity`.

* LOW (USAGE / docs): "logical value maps consistently" was unqualified, but
  for cardinality > CARDINALITY_POOL_CAP (100k) we generate a fresh fake per
  row. Spelled this out in the USAGE.

* LOW (mod.rs USAGE Examples block): the second example had two `$ qsv`
  lines under one comment, so the MCP-skill generator gave both the same
  description. Split into two comments — "First, generate the Data Dictionary
  with describegpt" / "Then layer in semantic fakers from the dictionary" —
  and regenerated qsv-synthesize.json + synthesize.md.

* LOW (qsv-synthesize.json missing `--jobs`): not addressed by design.
  `--jobs` is intentionally skipped from MCP skills as "infrastructure
  setting" by mcp_skills_gen.rs:367 — verified the same flag is omitted from
  qsv-stats.json (29 options, no jobs), qsv-frequency.json (35 options, no
  jobs). This is a project-wide convention, not a synthesize-specific bug.

Tests: 12 unit + 9 integration, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jqnatividad added a commit that referenced this pull request May 15, 2026
* feat(synthesize): add `synthesize` command to generate statistically-faithful synthetic CSVs

`qsv synthesize <input.csv>` runs `stats` + `frequency` + `count` on the source
and emits N rows of synthetic data that reproduce the source's per-column
attributes:

* Categorical / low-cardinality columns are reproduced by frequency-weighted
  sampling of the *real* value set — cardinality, weights and repetition
  structure preserved exactly.
* Numeric and date/datetime columns use quartile-bucketed generation so the
  *shape* of the distribution is preserved, not just `[min, max]`.
* Null ratios are reproduced per column.
* `--seed` makes output fully reproducible (single master `StdRng` threads
  through both selection logic and every faker call).
* `--dictionary <file>` layers in semantic Content Types from
  `describegpt --dictionary --infer-content-type` — each token maps to a
  `fake-rs` faker (40-token vocabulary covers names, emails, addresses,
  UUIDs, etc.). Bounded-cardinality faker columns sample from a fixed
  pre-generated pool of distinct fake values (the consistency mechanism, so a
  given logical value maps consistently).
* `--infer-content-type` runs `describegpt` internally to build the dictionary
  on the fly (needs `QSV_LLM_APIKEY`).

Cross-column correlation is explicitly out of scope for v1; columns are
generated independently.

Implementation notes:

* New module `src/cmd/synthesize/{mod,dictionary,faker_map,generator}.rs`.
  Reuses `util::run_qsv_cmd` to shell out to `stats` / `frequency` / `count`
  (the same primitive `describegpt::perform_analysis` uses) — no refactor of
  describegpt itself was needed.
* Bumped a handful of `pub(super)` items in `src/cmd/describegpt/dictionary.rs`
  to `pub(crate)` (`StatsRecord`, `FrequencyRecord`, `parse_stats_csv`,
  `parse_frequency_csv`, `CONTENT_TYPE_VOCAB`) and made the `dictionary`
  submodule `pub(crate)` so `synthesize` can consume them.
* `fake = "5"` (features: `derive`, `chrono`, `uuid`, `random_color`) added
  as an optional dep. The new `synthesize` feature gates the dep and is wired
  into `distrib_features` and `qsvmcp` (not `lite`, not `datapusher_plus`).
  `fake` v5.1.0 uses `rand 0.10` (same as qsv), so its `fake_with_rng` accepts
  qsv's seeded `StdRng` directly — no RNG version-bridging needed.
* Worked around a determinism bug in fake-rs's `Dummy<IP<L>> for String`
  (ignores the passed RNG); `ip_address` uses `IPv4` directly instead.
* Made the help-markdown and MCP-skills generators support module-dir
  commands (`src/cmd/<name>/mod.rs`), not just flat `src/cmd/<name>.rs`.
* Regen also picked up genuine doc drift in `qsv-stats.json`,
  `qsv-frequency.json`, and `docs/help/frequency.md` (Apache DataSketches
  big-endian notes) — included.

v1 is English-only; `--locale` is reserved for future multi-locale support
(each fake-rs locale is a distinct type, so multi-locale dispatch needs
macro expansion).

Tests: 12 unit tests + 9 integration tests, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* address review: suppress DevSkim false positives in synthesize

15 DevSkim code-scanning alerts on the synthesize PR, all false positives:

* DS148264 ("weak/non-cryptographic RNG") — 12 hits on `StdRng::seed_from_u64`
  in *test code*. The whole point of these tests is determinism with a fixed
  seed; synthesize generates *fake data*, not security tokens. Added
  `// DevSkim: ignore DS148264` inline (matches the existing pattern in
  `select.rs:142`, `sample.rs:406`, `sort.rs:202-214`, `pragmastat.rs:668`).

* DS126858 ("weak/broken hash algorithm") — 3 hits on the variable name `cmd2`
  in `tests/test_synthesize.rs`. The substring `md2` matched DevSkim's
  legacy-hash regex (md2/md4/md5/sha0/sha1). Renamed `cmd1`/`cmd2` →
  `first_cmd`/`second_cmd` — clearer anyway, no behavior change.

Tests: 12 unit + 9 integration, all still passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* address review: Copilot fixes — count/--jobs, index fallback, date sampling

Three valid issues caught by Copilot's PR review:

* `--jobs` was passed to `qsv count`, which doesn't accept it. Split the
  analysis args into `delim_only_args` (for `count`, which accepts `-d` but
  not `--jobs`) and `analysis_common_args` (for `stats` / `frequency`, which
  accept both). `index` continues to get no extra args.

* Best-effort indexing was effectively dead code: `Config::index_files()`
  returns `Ok(None)` (not `Err`) when no index exists, so `.is_err()` was
  false for the common unindexed case and the `qsv index` subprocess never
  fired. Switched to `!matches!(..., Ok(Some(_)))` so we now actually create
  the index when one is missing. Confirmed end-to-end — the "Indexed input"
  status line now appears for unindexed inputs.

* `Date` columns severely under-sampled the max date: `stats` min/max/q*
  values for `Date` are at midnight, so sampling uniformly over epoch seconds
  put only a single tick of the max-day in the last bucket (the rest of that
  day fell outside any bucket). Now `parse_epoch` returns *whole days since
  the UNIX epoch* for `Date` columns (and still seconds for `DateTime`); the
  `DateQuantile::next` arm samples a whole day uniformly within the bucket
  and multiplies back to seconds for formatting. `DateTime` sampling is
  unchanged.

Tests: 12 unit + 9 integration, all passing. Verified `qsv synthesize ...
--jobs 4` works end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* address roborev #2134: NaN guard, cardinality caveat, example descriptions

5 findings on the initial commit (4d706e1) — 1 HIGH already addressed in
899de97 (index_files() Ok(None) check), 4 still open:

* MEDIUM (generator.rs `parse_f64`): `parse_f64("NaN")` returned
  `Some(f64::NAN)`, which slipped past both the `hi < lo` short-circuit and
  the quartile sanity check, and would then panic in `rng.random_range`.
  Added `.filter(|v| v.is_finite())` so non-finite endpoints fail at parse
  time. Same path now rejects `±Infinity`.

* LOW (USAGE / docs): "logical value maps consistently" was unqualified, but
  for cardinality > CARDINALITY_POOL_CAP (100k) we generate a fresh fake per
  row. Spelled this out in the USAGE.

* LOW (mod.rs USAGE Examples block): the second example had two `$ qsv`
  lines under one comment, so the MCP-skill generator gave both the same
  description. Split into two comments — "First, generate the Data Dictionary
  with describegpt" / "Then layer in semantic fakers from the dictionary" —
  and regenerated qsv-synthesize.json + synthesize.md.

* LOW (qsv-synthesize.json missing `--jobs`): not addressed by design.
  `--jobs` is intentionally skipped from MCP skills as "infrastructure
  setting" by mcp_skills_gen.rs:367 — verified the same flag is omitted from
  qsv-stats.json (29 options, no jobs), qsv-frequency.json (35 options, no
  jobs). This is a project-wide convention, not a synthesize-specific bug.

Tests: 12 unit + 9 integration, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(synthesize): Reuse String buffer and simplify Option handling

Declare a mutable String outside the loop and reuse it to avoid repeated allocations, adding #[allow(unused_assignments)] to suppress warnings. Replace the match on faker_map::content_type_to_value(...) with a direct call using the ? operator to return None early on failure, reducing nesting and simplifying control flow in build_faker_pool.

* address roborev #2138: drop fake "buffer reuse" in build_faker_pool

The prior commit hoisted `let mut value = String::new()` outside the loop
under the banner of "reuse String buffer", but `content_type_to_value`
returns a freshly-allocated `String` each call (the underlying fake-rs
fakers always materialize an owned `String`). Reassigning the hoisted
variable just drops the previous allocation and moves in the new one —
no allocations are amortized. The `#[allow(unused_assignments)]` and the
extra block were also pure noise.

Dropped the hoist, the attribute, and the block. Behavior unchanged:

    let value = faker_map::content_type_to_value(content_type, rng)?;
    if seen.insert(value.clone()) {
        pool.push(value);
    }

Actual buffer reuse would require changing `content_type_to_value` to
write into a `&mut String`, but that wouldn't help either — fake-rs's
`fake_with_rng::<String, _>` still allocates internally on every call.

Tests: 12 unit + 9 integration, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dependencies Pull requests that update a dependency file rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant