build(deps): bump jsonschema from 0.28.1 to 0.28.2 #2469

dependabot · 2025-01-22T12:51:01Z

Bumps jsonschema from 0.28.1 to 0.28.2.

Release notes

[Python] Release 0.28.2

Fixed

Resolving external references that are nested inside local references. #671

Resolving relative references with fragments against base URIs that also contain fragments. #666

Performance

Faster JSON pointer resolution.

[Rust] Release 0.28.2

Fixed

Resolving external references that are nested inside local references. #671

Resolving relative references with fragments against base URIs that also contain fragments. #666

Performance

Faster JSON pointer resolution.

Changelog

Sourced from jsonschema's changelog.

[0.28.2] - 2025-01-22

Fixed

Resolving external references that are nested inside local references. #671

Resolving relative references with fragments against base URIs that also contain fragments. #666

Performance

Faster JSON pointer resolution.

Commits

7c59034 chore(rust): Release 0.28.2
615fa1e fix: Resolving relative references with fragments
1b75969 build(deps): bump crates/jsonschema-referencing/tests/suite
73a2e6f perf: Faster JSON pointer resolution
210eebb chore: Clippy lints
7a2ac3e fix: Resolving external references nested inside local references
29b37f0 chore(python): Release 0.28.1
See full diff in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [jsonschema](https://github.com/Stranger6667/jsonschema) from 0.28.1 to 0.28.2.
- [Release notes](https://github.com/Stranger6667/jsonschema/releases)
- [Changelog](https://github.com/Stranger6667/jsonschema/blob/master/CHANGELOG.md)
- [Commits](Stranger6667/jsonschema@rust-v0.28.1...rust-v0.28.2)

---
updated-dependencies:
- dependency-name: jsonschema
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>


…ven validation, force:true, UUID URL-title walk-up, GSA-bundle deferral) (#3904)

* feat(profile): walk past UUID-like basenames in url_title_default (§5.9)

For CKAN-style `/datastore/dump/<uuid>` URLs the leaf basename is an opaque UUID: better than the random tempfile suffix but still not a usable title. Walk one level up (capped at 3) and return the first non-UUID-like segment we find. The classic CKAN dump URL now yields "dump" instead of a 36-char hex.

New helper `is_uuid_like()` matches:
- canonical 8-4-4-4-12 hex with dashes
- compact 32 contiguous hex characters

Both case-insensitive. Other ID-like patterns (MongoDB ObjectId at 24 hex, ULIDs, slugified IDs) are intentionally NOT matched, because over-eager matching would walk past legitimate titles like "2024-Q3".

Behavior:
    /datastore/dump/<uuid>                -> "dump" (was: uuid)
    /path/snapshots/<32-hex>              -> "snapshots" (was: hex)
    /datastore/dump/2024-Q3-payments.csv  -> "2024-Q3-payments" (unchanged)
    /<uuid>/<uuid>/<uuid>                 -> leaf uuid (fallback after cap)
    36-char non-hex string                -> unchanged (length-collision check)

If every candidate up the 3-level cap is UUID-like, falls back to the leaf UUID, which is still reproducible and still beats the tempfile suffix. Users wanting a prettier title supply `--initial-context.package.title`; a CKAN `/api/3/action/resource_show?id=<uuid>` lookup is a deferred follow-up.

The previous `url_title_preserves_uuid_basename_unchanged` test documented the old behavior. It is replaced with four new tests covering the walk, the all-UUID fallback, the normal-basename regression check (including a 36-char non-hex length-collision case), and an `is_uuid_like` unit-level matrix of positives + negatives.

Verified: 99 profile unit tests pass; 15 integration tests pass under both -F all_features and -F datapusher_plus. cargo +nightly fmt clean.
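The UUID walk-up described above can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: `strip_extension` and `title_from_path` are hypothetical names, the function takes a bare URL path instead of a full URL, and the cap is assumed to count the leaf as one of the three segments examined.

```rust
/// True for a canonical 8-4-4-4-12 hex UUID or 32 contiguous hex characters.
/// Hex digits are accepted in either case.
fn is_uuid_like(s: &str) -> bool {
    let bytes = s.as_bytes();
    match bytes.len() {
        32 => bytes.iter().all(u8::is_ascii_hexdigit),
        36 => bytes.iter().enumerate().all(|(i, b)| match i {
            8 | 13 | 18 | 23 => *b == b'-',
            _ => b.is_ascii_hexdigit(),
        }),
        _ => false,
    }
}

/// Hypothetical helper: drops a trailing ".ext" if the stem is non-empty.
fn strip_extension(segment: &str) -> &str {
    match segment.rsplit_once('.') {
        Some((stem, _)) if !stem.is_empty() => stem,
        _ => segment,
    }
}

/// Hypothetical walk-up: returns the first non-UUID-like segment, starting at
/// the leaf and looking at no more than MAX_SEGMENTS segments (assumed cap).
/// Falls back to the leaf when every examined segment is UUID-like.
fn title_from_path(path: &str) -> Option<String> {
    const MAX_SEGMENTS: usize = 3;
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let leaf = strip_extension(segments.last()?);
    let title = segments
        .iter()
        .rev()
        .take(MAX_SEGMENTS)
        .map(|seg| strip_extension(seg))
        .find(|candidate| !is_uuid_like(candidate))
        .unwrap_or(leaf);
    Some(title.to_string())
}
```

Under these assumptions, a path such as `/datastore/dump/<uuid>` yields `dump`, while an ordinary basename like `2024-Q3-payments.csv` is returned with only its extension removed.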
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(profile): add sibling-URL + JSON-LD DCAT discovery (§5.2)

Wires two more mechanisms into dcat_discover::discover, chained in priority order after the existing Link: rel=describedBy probe:

2. Sibling URLs by convention (qsv profile follow-ups §5.2). Four candidates tried in order:
   - <url>.metadata.json (qsv profile's own output naming)
   - <url>.dcat.json (common DCAT-JSON convention)
   - <dirname>/datapackage.json (Frictionless Data Package spec)
   - <host>/.well-known/data.json (DCAT-US site catalog)
3. HTML JSON-LD <script type="application/ld+json"> blocks in the URL's parent (landing-page) HTML. Open-data portals typically host the dataset page one level above the raw CSV download.

Implementation:
- New `discover_via_sibling_urls` + `sibling_candidates` helper. Hand-rolled .metadata.json/.dcat.json suffixing preserves query strings (textual append); url::Url-based construction for the datapackage.json and /.well-known/data.json variants drops query & fragment since they're host-relative, not input-relative.
- New `discover_via_html_jsonld` + `extract_jsonld_blocks` helper. Pure-string HTML scan (no parser dep): locate <script ...> tags, case-insensitive type-attribute check for application/ld+json, parse the body as JSON, run through extract_dcat_dataset (which already handles @graph envelopes + bare-object shape fallback). Skips response if neither Content-Type nor body sniff suggests HTML, which avoids wasted scans on PDFs or binary blobs served with no Content-Type.
- New `fetch_json_and_extract` shared GET helper, mirroring discover_via_link_header's 4 MiB body cap.

Module doc comment updated: the §5.2 "follow-up" markers are replaced with the new active descriptions.

Nine unit tests added (sibling_candidates × 3, extract_jsonld_blocks × 6), all pure-string, no network. Covers typical CSV URL, query+fragment stripping, host-only URLs, basic <script> match, mixed-case type attribute, walking past non-dataset blocks, no-match negative, unrelated <script> tags, and the @graph envelope variant.

Verified: 108 profile unit tests pass (was 99, +9 new); 15 integration tests pass under both -F all_features and -F datapusher_plus. cargo +nightly fmt + clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(profile): run qsv validate when spec declares validators (§5.8)

When the scheming spec declares one or more `validators` on any field (dataset_fields or resource_fields), invoke `qsv validate` against the input and merge any RFC4180 failures into dcat_warnings. The presence of validators is the trigger; their string content isn't interpreted yet. Auto-generating a JSON Schema from declared types + CKAN validators is a future enhancement, but the architectural hook is in place.

Implementation:
- `Spec::has_validators()` walks both dataset_fields and resource_fields, returns true if any field's extras carry a non-empty, non-whitespace `validators` string. Whitespace-only entries are intentionally treated as "not declared" so empty but present entries don't accidentally trigger.
- `run_profile_validation(input_path) -> Vec<DcatWarning>` spawns `qsv validate <input>` directly (not via util::run_qsv_cmd, which errors on non-zero exit; the validate path needs to succeed when the subprocess fails). Best-effort: spawn errors, missing binary, or non-UTF-8 stderr all silently degrade to "no warnings". Emits a `qsv profile: ran `validate`` status line on stderr, mirroring the existing `ran `frequency`` / `ran `count`` markers so the helper's invocation is observable.
- Wired into the existing dcat_warnings merge block in profile.rs::run, alongside the build-time warning filter and --validate-dcat schema-violation path. Independent of --validate-dcat (which validates the emitted dcat block, not the input CSV).
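The `has_validators` trigger in the implementation list above could look something like the sketch below. The `Field` and `Spec` structs here are hypothetical stand-ins (the real spec types are not shown in this message and presumably keep `validators` inside a more general extras map), so treat this only as an illustration of the "non-empty, non-whitespace" rule.

```rust
/// Hypothetical stand-in for a scheming field; only the part relevant to the
/// trigger is modelled. `None` means the field declares no validators.
struct Field {
    validators: Option<String>,
}

/// Hypothetical stand-in for the scheming spec.
struct Spec {
    dataset_fields: Vec<Field>,
    resource_fields: Vec<Field>,
}

impl Spec {
    /// True if any dataset or resource field carries a `validators` string
    /// that is non-empty after trimming whitespace.
    fn has_validators(&self) -> bool {
        self.dataset_fields
            .iter()
            .chain(self.resource_fields.iter())
            .any(|f| {
                f.validators
                    .as_deref()
                    .map_or(false, |v| !v.trim().is_empty())
            })
    }
}
```

Trimming before the emptiness check is what makes a present but whitespace-only entry count as "not declared".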
Failures land as DcatWarning entries with:
    field = "qsv:validation"
    severity = Required
    message = "input failed `qsv validate` (RFC4180): <detail>"

Tests:
- Four unit tests on Spec::has_validators: dataset-side trigger, resource-side trigger, none-declared negative, whitespace-only negative.
- Two integration tests on the trigger plumbing: profile_runs_validation_when_spec_declares_validators (clean CSV + druf spec → validate spawns, no qsv:validation warning), profile_skips_validation_when_spec_has_no_validators (spec-less → validate must NOT spawn).

Verified: 112 profile unit tests pass (was 108, +4); 17 integration tests pass under both -F all_features and -F datapusher_plus. cargo +nightly fmt + clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(profile): explain why GSA bundle vendoring is deferred (§5.3)

The handoff suggested vendoring the full GSA dcat-us JSON Schema suite as a drop-in replacement for embedded_minimal_schema. While investigating, hit a fundamental shape mismatch the original plan didn't account for: the GSA bundle is written against the **unprefixed** JSON-LD-expanded form (`otherIdentifier`, `@type: "Dataset"`) while `dcat::build` emits the **prefixed JSON-LD-compact** form (`dct:identifier`, `@type: "dcat:Dataset"`). Naïvely vendoring the bundle and pointing the validator at it would flag every key as missing.

Updated the dcat_validate module-level doc comment to spell out the three real paths forward (JSON-LD expansion, key translation layer, refactor dcat::build to emit expanded form) and why each is bigger scope than a vendor-and-swap. Embedded minimal schema stays in place; it catches the mandatory-field class of mistake cheaply.

No code changes; doc-only commit so the next maintainer doesn't re-do the same investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(profile): honor force:true in dataset_info at merge time (§5.4)

Wraps the existing `{value, force: true}` plumbing with real merge-time effect for `dataset_info` JSON-Pointer entries. Discovered DCAT (Link header / sibling URL / JSON-LD <script>) will no longer overlay paths the user marked forced, even when the inferred projection left them absent.

Use case: declare a field "intentionally absent" and prevent publisher DCAT discovery from silently filling it in. Example:

    {"dataset_info": {"/dcat/dct:rights": {"value": null, "force": true}}}

yields literal `null` at `/dcat/dct:rights` AND blocks any discovered `dct:rights` from being merged in.

Implementation:
- `collect_forced_dataset_info_paths(raw)` walks the `dataset_info` subtree BEFORE `normalize_value_force` strips the wrappers and collects pointer paths whose value matched the exact two-key `{"value": ..., "force": true}` shape. `force: false` and plain values aren't collected.
- `load_initial_context` signature extended: returns `(package, resource, dataset_info, forced_dcat_paths)`. The previous wrapper-stripping behavior is unchanged.
- `AnalysisContext` gains `forced_dcat_paths: Vec<String>` so the orchestrator can hand it to `merge_discovered`.
- `merge_discovered(inferred, discovered, &forced_dcat_paths)` now skips each discovered top-level key whose translated path (`/dcat/<key>`) equals or prefixes any forced path. Nested forces (e.g. `/dcat/dcat:contactPoint/vcard:fn`) block the whole-object overlay since `merge_discovered` operates at the top level; nested-leaf force is satisfied by the later pointer-override pass.

Scope-limit: force on `package` / `resource` initial-context entries is still accepted and stripped but NOT honored at merge time; that needs a CKAN→DCAT JSON-Pointer mapping table (documented in `load_initial_context`'s comment as a deferred follow-up).
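As a rough sketch of the merge-time skip described in the implementation list above: the version below models the DCAT blocks as flat string maps instead of JSON values, assumes discovered keys only fill entries the inferred block lacks, and assumes keys contain no `/` or `~` (so no pointer escaping is needed). The `is_forced` helper and the segment-boundary prefix check are assumptions for illustration, not the commit's code.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: a discovered key is blocked when its pointer path
/// equals a forced path or is a parent of one (prefix at a `/` boundary).
fn is_forced(candidate: &str, forced_paths: &[String]) -> bool {
    let parent_prefix = format!("{candidate}/");
    forced_paths
        .iter()
        .any(|p| p == candidate || p.starts_with(&parent_prefix))
}

/// Simplified merge: copies discovered top-level keys into `inferred` unless
/// the key is already present or its `/dcat/<key>` path is forced.
fn merge_discovered(
    inferred: &mut BTreeMap<String, String>,
    discovered: &BTreeMap<String, String>,
    forced_paths: &[String],
) {
    for (key, value) in discovered {
        let candidate = format!("/dcat/{key}");
        if is_forced(&candidate, forced_paths) {
            continue; // the user marked this path (or a child of it) as forced
        }
        inferred
            .entry(key.clone())
            .or_insert_with(|| value.clone());
    }
}
```

With a forced path of `/dcat/dcat:contactPoint/vcard:fn`, the whole discovered `dcat:contactPoint` key is skipped, which mirrors the top-level granularity the message describes; forced paths outside `/dcat` never match a candidate and are effectively ignored.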
USAGE is updated to spell out the new dataset_info behavior and the package/resource gap.

Tests:
- 3 unit tests on `collect_forced_dataset_info_paths`: dataset_info collection with mixed wrapper / plain / force:false / null-value-force shapes, no-dataset_info, pathological non-object dataset_info.
- 4 unit tests on `merge_discovered`: forced top-level key blocks overlay; forced nested path blocks the whole-object overlay; unrelated discovered keys still fill when one is forced; forced paths outside the /dcat subtree are ignored.
- 1 integration test exercising the full flow against the qsv binary: initial-context with `{value: "MIT IRI", force: true}` for dct:license (lands via pointer override) and `{value: null, force: true}` for dct:rights (null round-trips, force blocks hypothetical discovery overlay).

Verified: 119 profile unit tests pass (was 112, +7); 18 integration tests pass under both -F all_features and -F datapusher_plus (was 17, +1). cargo +nightly fmt + clippy clean, docs/help regenerated, docs-drift-check reports no drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(profile): RFC 6901 escape discovered keys in merge_discovered (roborev #2469)

One Medium finding on the §5.4 commit: the candidate JSON-Pointer path built from each discovered DCAT key was interpolated directly without RFC 6901 token escaping. A user wanting to force a JSON-LD property whose key contains `/` or `~` (full IRIs like `http://purl.org/dc/terms/title`, the rare CURIE-with-tilde) would write the path in its escaped form (`/dcat/http:~1~1purl.org~1dc~1terms~1title`), but our candidate construction produced the un-escaped raw form (`/dcat/http://purl.org/dc/terms/title`): too many pointer segments, never matches, force is silently ignored.

Fix:
- New `escape_json_pointer_token` helper that applies RFC 6901 section 4 escaping (`~` → `~0`, `/` → `~1`) in the correct order (`~` first, otherwise the `~1` from a `/` would get double-escaped to `~01`).
- `merge_discovered` builds `candidate = format!("/dcat/{}", escape_json_pointer_token(k))` so the comparison stays in the canonical escaped JSON-Pointer space.

Tests (3 new in src/cmd/profile.rs::tests):
- merge_force_match_handles_full_iri_keys_via_rfc6901_escaping: forced path `/dcat/http:~1~1purl.org~1dc~1terms~1title` correctly blocks the discovered `http://purl.org/dc/terms/title` overlay.
- merge_force_does_not_match_unrelated_keys_after_escaping: regression check that the same escaping doesn't over-eagerly match an unrelated `dct:identifier` key.
- escape_json_pointer_token_matches_rfc6901: unit-level matrix covering plain, /-only, ~-only, the tricky `~/` ordering trap (must yield `~0~1`, not `~01`), and the full-IRI case.

Verified: 122 profile unit tests pass (was 119, +3); 17 integration tests pass under both -F all_features and -F datapusher_plus. cargo +nightly fmt + clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* address copilot review: URL-safe sibling candidates + validate flag forwarding

* dcat_discover::sibling_candidates: build all four candidates via `url::Url` parsing so query strings and fragments on the input URL don't get baked into the appended suffix. An input like `snapshot.csv?token=abc#frag` was producing `snapshot.csv?token=abc.metadata.json`, which servers interpreted as a GET on the CSV with a polluted query value rather than a fetch of the sibling JSON. Falls back to textual append only when the URL fails to parse. Updated the corresponding test to assert the new behavior for all four candidate slots.

* profile::run_profile_validation: forward `--no-headers` and `--delimiter` to `qsv validate` so it parses the input the same way the rest of the profile pipeline (stats/frequency/count) does. Without this, non-default CSV options would yield spurious or missed RFC4180 failures.
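To illustrate the URL-safe sibling candidates from the copilot-review item above, here is a std-only sketch. The commit itself says it uses `url::Url`; this version swaps in plain string handling so the example stays dependency-free, which means it is an approximation: it does no real URL validation, and host-only inputs are not treated specially.

```rust
/// Std-only approximation of the four sibling candidates. The query string
/// and fragment are dropped before any suffix is appended.
fn sibling_candidates(input: &str) -> Vec<String> {
    // Drop the fragment first, then the query string.
    let without_fragment = input.split('#').next().unwrap_or(input);
    let base = without_fragment
        .split('?')
        .next()
        .unwrap_or(without_fragment);

    // The path starts at the first '/' after "scheme://".
    let authority_start = base.find("://").map_or(0, |i| i + 3);
    let path_start = base[authority_start..]
        .find('/')
        .map(|i| authority_start + i);

    // origin is "scheme://host"; dirname is everything before the last '/'.
    let origin = path_start.map_or(base, |i| &base[..i]);
    let dirname = match base.rfind('/') {
        Some(i) if i >= authority_start => &base[..i],
        _ => origin,
    };

    vec![
        format!("{base}.metadata.json"),
        format!("{base}.dcat.json"),
        format!("{dirname}/datapackage.json"),
        format!("{origin}/.well-known/data.json"),
    ]
}
```

For an input such as `https://example.org/data/snapshot.csv?token=abc#frag`, none of the four candidates carries `token=abc` or `frag`, which is the property the review item asks for.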
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(profile): regression test for `validate` --delimiter forwarding (roborev #2471)

Roborev flagged the new `--no-headers` / `--delimiter` forwarding path in `run_profile_validation` as uncovered: the existing validation test only exercised default comma-delimited input with headers, so it would still pass if the forwarded args were dropped or misordered.

The new test uses a `;`-delimited CSV whose rows contain unquoted commas. When parsed as the default `,`-delimited, field counts mismatch the 1-field header and `qsv validate` emits an RFC4180 record-length failure. When parsed with `;`, the six fields per row line up and validation passes. Asserting the absence of a `qsv:validation` warning on this input proves the `--delimiter ;` flag was forwarded to the spawned `qsv validate`.

Verified by running `qsv validate` directly on the same content with and without `--delimiter ;`: exit 1 vs exit 0 respectively, confirming the test would fail if the forwarding were ever removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
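For the RFC 6901 fix earlier in this message, the escaping step is small enough to sketch in full. The helper name matches the one the message mentions, but the body below is an assumed implementation based on RFC 6901 section 4, and `forced_candidate` is a hypothetical wrapper added only to show how the candidate path would be built.

```rust
/// Escapes one JSON Pointer reference token per RFC 6901:
/// `~` becomes `~0` and `/` becomes `~1`. The tilde must be handled first,
/// otherwise the `~` introduced by escaping `/` would be escaped again.
fn escape_json_pointer_token(token: &str) -> String {
    token.replace('~', "~0").replace('/', "~1")
}

/// Hypothetical wrapper: builds the `/dcat/<key>` candidate path in escaped
/// form so it can be compared against user-supplied forced pointer paths.
fn forced_candidate(key: &str) -> String {
    format!("/dcat/{}", escape_json_pointer_token(key))
}
```

Reversing the two `replace` calls would turn `/` into `~1` and then into `~01`, which is the ordering trap the message calls out.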

dependabot Bot added the dependencies (pull requests that update a dependency file) and rust (pull requests that update Rust code) labels Jan 22, 2025

jqnatividad merged commit 070b743 into master Jan 22, 2025

jqnatividad deleted the dependabot/cargo/jsonschema-0.28.2 branch January 22, 2025 13:05

BrewTestBot mentioned this pull request Jan 26, 2025

qsv 2.2.0 Homebrew/homebrew-core#205567

Merged

jqnatividad mentioned this pull request May 26, 2026

feat(profile): five §5 follow-ups (sibling-URL discovery, profile-driven validation, force:true, UUID URL-title walk-up, GSA-bundle deferral) #3904

Merged

7 tasks
