07 Jun 15:08

25696b0

21.0.0 Latest

[21.0.0] - 2026-06-08 🌐 The "F-AI-Rification" Release 📇

FAIR Data is AI-Ready Data. It is the perfect context for AI applications - it's compact and token-efficient, and it vastly improves an Agent's understanding of your Data. A few hundred kilobytes of FAIR metadata is often all it takes to comprehensively describe giga/terabyte level data.

This major release builds even further on qsv's existing FAIRification capabilities, with two new commands and expanded geocoding:

profile - extracts standards-compliant dataset FAIR metadata (DCAT-US v3, DCAT-AP v3, Croissant 1.1 and Geoconnex);
get - fetches tabular data from HTTP(S), cloud object stores (AWS S3/Google Cloud/Microsoft Azure) and CKAN portals into a content-addressed local cache - making it even easier to FAIRify remote data and/or to use these remote data to enrich your data corpus;
geocode goes online with OpenCage. Geospatially contextualize and normalize your location data with OpenCage. Unlike other geocoders, OpenCage is built from open data, has the most permissive licensing (results can be displayed on ANY map and cached indefinitely), and is much cheaper to boot!

qsv 21.0.0 raises the minimum supported Rust version to 1.96 and upgrades to Polars 0.54, which is why this is a major version bump - existing pipelines are otherwise source-compatible.

Highlights

profile - generate standards-compliant dataset metadata.
A new command that profiles a dataset and projects it into open metadata standards (DCAT-US v3, DCAT-AP v3, Croissant 1.1, and Geoconnex) via a configurable, YAML-driven, MiniJinja-powered projection engine, with optional SHACL/mlcroissant/pyshacl validation and embedded descriptive statistics & frequency tables so you can further customize the metadata schema mappings (#3898, #3901, #3908, #3912, #3916, #3918).
get - fetch tabular data from anywhere into a local cache.
A new command (issue #2263) that retrieves CSV/TSV and other tabular data from HTTP(S) URLs, cloud object stores (s3://, gs://, az://), and CKAN portals (ckan://), then stores it in a content-addressed disk cache. Cached entries are addressable via a dc:<name> input prefix usable by any other qsv command, carry BLAKE3 + ETag provenance, support TTL/policy controls, and revalidate conditionally (HTTP If-None-Match / 304 Not Modified). Cloud sources are gated behind the opt-in get_cloud sub-feature; streaming, ranged/parallel downloads and a dc: stats cache landed in Phase 3 (#3953, #3958).
geocode with OpenCage support.
New geocode subcommands call the OpenCage geocoding API for forward and reverse geocoding, with a persistent on-disk result cache and %dyncols: support (issue #1295, #3876, #3878).
describegpt describes meaning, not just types.
A richer Semantic Markdown Data Dictionary format optimized for agents & data catalogs; a JSON Schema (draft 2020-12) output format; and LLM-inferred date/datetime content types round out describegpt's semantic-description capabilities (#3933, #3935, #3871, #3884).
Mergeable / variance-bounded sampling in sample
Two new sampling modes plus a sketch-IO surface that lets users sample sharded inputs and combine the results without re-reading the whole corpus. Both modes are native Rust implementations written from the original algorithm papers. The Apache DataSketches project's Sampling family implements the same family of algorithms in C++/Java/Python, but qsv does not bind to or depend on that code (the datasketches Rust crate doesn't expose Sampling-family sketches), so the on-disk format is qsv-specific and not interoperable with DataSketches serialized sketches.
- --varopt <col>
  variance-bounded weighted reservoir sampling using the A-ExpJ keying scheme of Efraimidis & Spirakis (2006). Each record gets a key u^(1/w) and the top-k keys are retained. Unlike --weighted (which is single-pass acceptance-rejection requiring a max_weight from the stats cache), --varopt is a true reservoir sampler — no stats cache required, single pass, bounded memory, and mergeable across partitions.
- --mergeable-reservoir
  uniform reservoir using Vitter's Algorithm R. Same statistical distribution as the default RESERVOIR method, but the resulting sampler state is mergeable.
- --sketch-out <file> / --sketch-in <file1,file2,...>
  serialize the sampler state to a binary blob and merge across runs. Sketches embed the source CSV header so --sketch-in re-emits a schema-bearing CSV without consulting the source files. Sampler-kind mismatch (mixing a reservoir blob with a varopt blob) is rejected. Works with both new sampling modes.
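To make the mergeability claim concrete, here is a minimal Python sketch of keyed weighted reservoir sampling. It is not qsv's Rust implementation and says nothing about its on-disk sketch format. It uses the plain key form u^(1/w) with a top-k heap and omits the exponential-jump skipping that A-ExpJ adds on top of that keying. The function names `keyed_reservoir` and `merge_sketches` are hypothetical, and skipping non-positive weights is an assumption.

```python
import heapq
import random


def keyed_reservoir(rows, k, rng):
    """Keep the k rows with the largest keys u**(1/w).

    rows is an iterable of (item, weight) pairs. The result is a list of
    (key, item) tuples, which doubles as the mergeable sampler state.
    """
    heap = []  # min-heap: the smallest retained key sits at heap[0]
    for item, weight in rows:
        if weight <= 0:
            continue  # assumption: non-positive weights are never sampled
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))
    return heap


def merge_sketches(sketches, k):
    """Merge per-shard states by keeping the k largest keys overall."""
    all_entries = (entry for sketch in sketches for entry in sketch)
    return heapq.nlargest(k, all_entries)


if __name__ == "__main__":
    rows = [(i, 1.0 + (i % 5)) for i in range(200)]
    rng = random.Random(7)
    shard_a = keyed_reservoir(rows[:120], 10, rng)
    shard_b = keyed_reservoir(rows[120:], 10, rng)
    print([item for _, item in merge_sketches([shard_a, shard_b], 10)])
```

Merging works because any row that belongs in the overall top-k must also be in the top-k of its own shard, so keeping the k largest keys from the union of shard states gives the same result as one pass over the whole input with the same keys.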

Detailed MCP Server and Cowork Plugin changes are documented in the MCP Server/Cowork Plugin CHANGELOG.

Added

sample: --varopt <col> flag for variance-bounded weighted reservoir sampling (A-ExpJ keying, Efraimidis & Spirakis 2006). See Highlights above.
sample: --mergeable-reservoir flag for a uniform reservoir sampler whose state is mergeable across runs (same distribution as the default RESERVOIR method). See Highlights above.
sample: --sketch-out <file> / --sketch-in <files> for serializing and merging sampler state across runs. Sketches carry their source CSV header so merged output is schema-bearing.
geocode: new cache-clear, cache-prune & cache-info subcommands to manage the persistent on-disk OpenCage result cache. cache-clear wipes the cache, cache-prune --older-than <val> deletes entries older than an absolute date or a relative age (e.g. 30d, 2w), and cache-info reports the cache directory, entry count, on-disk size and oldest/newest entry timestamps.
profile: new bundled geoconnex projection profile + pyshacl validator wired to the Internet of Water's Geoconnex SHACL shapes (vendored under resources/geoconnex/shacl/, embedded in the qsv binary). Phase 1 is dataset-level only — DatasetShape / ProviderShape / PublisherShape / DistributionShape coverage; the row-per-feature LocationOrientedShape (with mandatory gsp:asWKT geometry synthesis from lat/lon columns) is deferred to a follow-up. Gated behind a new geoconnex cargo feature — present in qsv (via distrib_features) and as an opt-in for qsvdp (-F datapusher_plus,geoconnex); not available in qsvlite / qsvmcp.
🆕 get: new command for fetching tabular data from HTTP(S) URLs, cloud object stores (s3://, gs://, az://) and CKAN portals (ckan://) into a content-addressed local disk cache. Cached entries are reusable by any other qsv command via the dc:<name> input prefix, carry BLAKE3/ETag provenance plus record-count and TTL metadata, and revalidate conditionally over HTTP (If-None-Match → 304 Not Modified). Subcommands include cache-set-ttl, cache-set-policy and cache-list --verify. Cloud sources are gated behind the opt-in get_cloud sub-feature (via object_store, no new transitive crates). Available in qsv/qsvmcp/qsvdp (not qsvlite). Issue #2263 (#3953, #3958).
🆕 profile: new command for profiling a dataset and projecting it into open metadata standards — DCAT-US v3, DCAT-AP v3 and Croissant — through a YAML-driven projection engine, with optional external validation (mlcroissant for Croissant, pyshacl for DCAT-AP/Geoconnex SHACL shapes) and embedded descriptive statistics & frequency tables. Accepts local files, URL inputs and stdin. Available in qsv, qsvmcp and qsvdp; not in qsvlite. (The bundled geoconnex projection profile is the only part gated further — to qsv/qsvdp via the geoconnex feature.) (#3898, #3901, #3904, #3908, #3910, #3911, #3912, #3918).
geocode: new OpenCage online geocoding subcommands for forward and reverse geocoding via the OpenCage API, including %dyncols: support to materialize multiple result fields as new columns. Issue #1295 ([#3876](https://gith...
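The content-addressed cache and conditional revalidation described for get above can be illustrated with a small in-memory Python model. This is a sketch under stated assumptions, not qsv's cache layout: `ContentCache` and `make_origin` are hypothetical names, the store lives in dictionaries rather than on disk, the origin server is simulated, and `hashlib.blake2b` stands in for BLAKE3 because BLAKE3 is not in Python's standard library.

```python
import hashlib


class ContentCache:
    """Toy content-addressed cache with ETag-based revalidation."""

    def __init__(self):
        self.blobs = {}    # digest -> bytes (the content-addressed store)
        self.entries = {}  # name -> {"digest": ..., "etag": ...}

    def put(self, name, body, etag):
        # blake2b is a stand-in here; the release notes name BLAKE3.
        digest = hashlib.blake2b(body).hexdigest()
        self.blobs[digest] = body
        self.entries[name] = {"digest": digest, "etag": etag}
        return digest

    def fetch(self, name, origin):
        """origin(if_none_match) must return (status, body, etag)."""
        entry = self.entries.get(name)
        status, body, etag = origin(entry["etag"] if entry else None)
        if status == 304 and entry is not None:
            return self.blobs[entry["digest"]]  # reuse the cached bytes
        self.put(name, body, etag)
        return body


def make_origin(body, etag):
    """Simulated server that honors If-None-Match; records its statuses."""
    calls = []

    def origin(if_none_match):
        if if_none_match == etag:
            calls.append(304)
            return 304, b"", etag
        calls.append(200)
        return 200, body, etag

    return origin, calls


if __name__ == "__main__":
    origin, calls = make_origin(b"a,b\n1,2\n", '"v1"')
    cache = ContentCache()
    cache.fetch("demo", origin)
    cache.fetch("demo", origin)
    print(calls)
```

Because blobs are keyed by digest, two names that resolve to identical bytes share one stored copy, and a 304 response costs no body transfer at all.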

Assets 16

qsv-21.0.0-aarch64-apple-darwin.zip

sha256:2595cc0361e76e2bfc0cbcb649ce09da3f9cf95d9313ef49f8a28ad4be39d84e

212 MB 2026-06-08T14:28:00Z
qsv-21.0.0-aarch64-pc-windows-msvc.zip

sha256:71fa22e67c7b7886f983a86ab38d96c76d1d31a501248f3fbb957505dec9ae34

46.5 MB 2026-06-07T16:18:05Z
qsv-21.0.0-aarch64-unknown-linux-gnu.zip

sha256:ba514f65fa96be4e39a8c30a85546d29a637895e8475059f0bc33ad0a4fc791e

36.1 MB 2026-06-08T10:59:49Z
qsv-21.0.0-geocode-index.rkyv

sha256:ccfad53ecb5d7be38afc24538c695232bacc72dc0c21ff4c8d71b0c2af4d1a37

22.4 MB 2026-06-07T15:08:24Z
qsv-21.0.0-geocode-index.rkyv.cities15000

sha256:ccfad53ecb5d7be38afc24538c695232bacc72dc0c21ff4c8d71b0c2af4d1a37

22.4 MB 2026-06-07T15:08:23Z
qsv-21.0.0-geocode-index.rkyv.cities15000.sz

sha256:eeb32525f79edf56d3fa7b1a03143c6c132b8e749871bf2292c5ec9a4b67917b

9.02 MB 2026-06-07T15:08:22Z
qsv-21.0.0-powerpc64le-unknown-linux-gnu.zip

sha256:a8cfc55d9ee887210d8ad375a23644233eb8419467129b3b078f26af6da30960

22.1 MB 2026-06-07T15:57:20Z
qsv-21.0.0-s390x-unknown-linux-gnu.zip

sha256:c2d603bcac0b9dac5d09d36d744e855b0a5f47a444f2a5fb654b97eabfc49858

23.8 MB 2026-06-07T15:55:20Z
qsv-21.0.0-x86_64-pc-windows-gnu.zip

sha256:1bd7c83eef7bc48cd67ef69b4477fa02707b6cde97c7bf0cd0f71a5949640cf6

83.8 MB 2026-06-08T12:47:37Z
qsv-21.0.0-x86_64-pc-windows-msvc.zip

sha256:3d94074377be3f95611c8ad038fd7a9cd5906fc7ef6cd33daa8ce90d7b83ce88

249 MB 2026-06-08T15:48:31Z
Source code (zip)

2026-06-07T15:07:35Z
Source code (tar.gz)

2026-06-07T15:07:35Z

18 May 03:45

jqnatividad

20.1.0

8a66ed5

20.1.0

[20.1.0] - 2026-05-18 🤖 The "Synthetic Data" Release 🎲

A feature-packed minor release headlined by a brand-new synthesize command for generating realistic fake CSV data, a much smarter describegpt that can now describe what your columns mean (not just their data types), and new "approximate stats" modes that let stats and frequency keep working on files that are much bigger than your computer's memory. No breaking changes — pipelines built on 20.0.0 will upgrade in place.

Highlights

🆕 synthesize — generate realistic fake CSVs from a real one. Point it at a source file and it produces a new CSV of any size whose columns look and behave like the original — same value mix, same distribution shape, same null rate — but without any of the original records. Useful for sharing test data, populating staging environments, or building demos without leaking real customer data.
- Categorical columns (e.g. country, status) are rebuilt by sampling the real values in the same proportions they appear.
- Numeric and date columns preserve the shape of the distribution, not just the min/max — so the synthetic data has realistic clusters, not a flat random spread.
- Null rates are matched per column.
- --seed makes output fully reproducible — same seed, same file, every time.
- --dictionary / --infer-content-type plugs in the new describegpt Content Types (see next bullet) so columns recognized as e.g. email, phone, city, or credit_card are filled with realistic-looking fakes instead of generic random strings. --locale picks from 14 regional flavors (US, FR, JP, etc.) so the fakes match your audience.
- Cross-column correlations (e.g. keeping city ↔ zip_code consistent within a row) aren't modeled by default — but turning on describegpt's --two-pass option (see next bullet) lets the LLM detect related fields, and synthesize will then keep those relationships consistent in the generated rows.
🧠 describegpt got a lot smarter — it can now label what your columns mean. In addition to qsv's existing type detection (Integer, Float, Date, etc.), describegpt can now ask an LLM to classify each column with a semantic label from a 47-token vocabulary covering people, addresses, companies, technical identifiers, and more — so a column of strings isn't just "String", it's email, street_address, job_title, or credit_card. These labels are what powers synthesize's realistic fakes, but they're also useful on their own as auto-generated data dictionaries.
- --two-pass runs the LLM a second time over the ENTIRE Data Dictionary so it can spot relationships between columns (e.g. "this is a state_abbr because the next column is a zip_code") and fix sloppy first-pass labels. This is also what unlocks cross-column consistency in synthesize (see previous bullet).
- Deterministic unique_id tag — columns where every value is unique (like IDs and UUIDs) are tagged by qsv directly, before the LLM ever sees them. That means the label is 100% reproducible and doesn't drift between LLM versions.
- Smarter time/duration handling — duration columns can carry a realistic upper bound (e.g. "0–1 hour") so synthetic latency or TTL values stay believable instead of ranging out to absurd numbers.
- --markdown-template lets you customize the generated Data Dictionary's Markdown output — add your team's review checklist, restructure the per-column layout, whatever fits your docs.
- Lower LLM costs — the default prompts were restructured to stop re-sending the dictionary on every step, measurably cutting token usage on multi-phase runs.
📊 Approximate stats for huge files — stats and frequency no longer give up when a file is much bigger than your RAM. New opt-in modes use Apache DataSketches algorithms that compute approximate-but-bounded-error answers in a tiny fraction of the memory. Three new modes across two commands:
- For stats: --quantile-method tdigest for approximate percentiles (t-digest) and --cardinality-method hll for approximate distinct counts (HyperLogLog).
- For frequency: --sketch-method misra-gries for approximate top-K most-frequent values (Misra-Gries Frequent Items).
- Automatic when you'd otherwise OOM: if qsv detects the file is too big to fit in memory, it now auto-switches to the approximate modes (and tells you which ones), instead of failing. Pass --quantile-method exact (etc.) to force the precise calculation regardless.
- Cache stays correct: the stats cache key now includes the chosen mode, so switching between exact and approximate modes won't accidentally return stale results.
- Note: these modes require a "little-endian" CPU, which covers all common hardware (Intel, AMD, Apple Silicon, ARM, etc.). Exotic platforms like IBM s390x get a clear error message instead.
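For readers unfamiliar with the Misra-Gries approach behind the approximate frequency mode, the textbook algorithm fits in a few lines of Python. This is a sketch of the classic algorithm, not the Apache DataSketches Frequent Items implementation that qsv uses, which differs in detail. The function name `misra_gries` and the parameter `k` are illustrative.

```python
def misra_gries(stream, k):
    """Textbook Misra-Gries summary using at most k - 1 counters.

    For a stream of n items, every value occurring more than n / k times
    is guaranteed to survive, and each reported count undershoots the
    true count by at most n / k.
    """
    counters = {}
    for value in stream:
        if value in counters:
            counters[value] += 1
        elif len(counters) < k - 1:
            counters[value] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    stream = ["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)]
    print(misra_gries(stream, 4))
```

Memory is bounded by k regardless of how many distinct values the column holds, which is what lets this kind of summary keep working on files far larger than RAM.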

Detailed MCP Server and Cowork Plugin changes are documented in the MCP Server/Cowork Plugin CHANGELOG.

Added

synthesize: new top-level command (see Highlights) #3854
synthesize: --consistent-fakes for stable source→fake mapping #3865
synthesize: --locale option for 14 fake-rs locales #3860
describegpt: --two-pass cross-field Data Dictionary refinement #3863
describegpt: deterministic unique_id Content Type for ALL_UNIQUE fields #3862
describegpt,synthesize: infer Content Type for temporal fields with LLM-hinted duration cap #3861
describegpt,synthesize: 5 new Content Type tokens — street_name, license_plate, industry, profession, ipv6_address
describegpt: --markdown-template for customizable Markdown output #3834
pivotp: --agg quantile@<p> (alias q@<p>) with linear interpolation #3842
stats/frequency: opt-in Apache DataSketches modes — HLL cardinality, Frequent Items top-K #3840
stats: widened BLAKE3 fingerprint to cover all streaming stats #3824
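As an illustration of what "quantile with linear interpolation" usually means for the pivotp quantile@<p> aggregation listed above, the following Python sketch uses the common definition (the same one NumPy calls "linear"). Whether pivotp matches this definition exactly is an assumption; the function name `quantile_linear` is hypothetical.

```python
import math


def quantile_linear(values, p):
    """Quantile at fraction p (0..1), interpolating between neighbors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    pos = p * (len(data) - 1)  # fractional index into the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac


if __name__ == "__main__":
    print(quantile_linear([1, 2, 3, 4], 0.5))
```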

Changed

stats/frequency: auto-enable Apache DataSketches estimators (t-digest + HyperLogLog for stats; Misra-Gries Frequent Items for frequency) when util::mem_file_check reports OOM, in addition to the existing auto-index fallback. A wwarn! is emitted listing the auto-enabled estimators; explicit --quantile-method exact / --cardinality-method exact / --sketch-method exact still suppresses the auto-enable #3843
stats: three opt-in micro-optimizations — simdutf8 output, t-digest quantiles, mode-cardinality cap #3839
synthesize: use string-length stats for unstructured text columns #3864
describegpt: inline {{ dictionary }} in default description/tags prompts; skip redundant chat-message dictionary injection when the template already inlines it
synthesize: handle both describegpt-wrapped and raw dictionary JSON
refactor: adopt Rust 1.95 cfg_select! macro at platform-conditional sites #3846
perf: promote bytes_to_cow_str helper to util and sweep callsites
perf(moarstats): hint rare branches with core::hint::cold_path() #3823
perf(stats): mark non-UTF-8 branch cold
perf(frequency): hint UTF-8 failure as cold in the ignore-case hot loop #3821
refactor(stats): shrink and tidy WhichStats #3822
refactor(publish): fetch tags and enforce SemVer for debian package releases
refactor(benchmarks): harden benchmarks.sh error handling and cross-platform support #3814
deps: bump polars (latest upstream), calamine 0.34→0.35, csvlens fork with bumped arrow, sysinfo 0.38.4→0.39.2, rust_decimal 1.41→1.42, tokio 1.52.1→1.52.3, filetime 0.2.27→0.2.29, jsonschema 0.46.4→0.46.5, rand_xoshiro 0.8.0→0.8.1, redis 1.2.0→1.2.1, qsv-dateparser 0.14→0.15 (adds support for ISO 8601 T-separated datetimes without a timezone suffix — e.g. 2020-01-15T08:00:00, the form produced by Python's datetime.isoformat() without astimezone(); previously misclassified by qsv stats --infer-dates as String)
assorted clippy cleanups across stats, frequency, pivotp, partition

Fixed

stats: preserve length & lex stats when column type widens to String #3856
stats: remove duplicate big-endian TDigestStub/HllSketchStub defs #3857
stats: restore big-endian build by giving slot fallbacks an accessible .0 #3850
stats/frequency: gate Apache DataSketches behind little-endian targets #3847
apply/applydp: thousands negative fractions; scope <NULL> to regex_replace #3845
moarstats: retry on stats coverage mismatch + fsync joined CSV parent dir [#3838](https://github.com/da...

03 May 03:42

jqnatividad

20.0.0

2211ad4

20.0.0

[20.0.0] - 2026-05-03 🧹 The "Spring Cleaning" Release 🌱

Over the past four weeks, we did the first end-to-end pass over the qsv codebase using Claude Code, roborev, Serena, Context7 and GitHub Copilot orchestrated using a multi-agent, adversarial review workflow - systematically auditing every command for correctness, safety, and performance. The result is the largest correctness-and-safety sweep in qsv's history: ALL commands were touched by review-driven cleanups, with dozens of latent bugs, panic paths, and performance cliffs swept out, while adding more than 250 new tests across the board.

This is a major version bump because that sweep also surfaced four user-visible behaviors that were demonstrably wrong and could not be fixed without breaking compatibility:

safenames verify-mode now correctly counts duplicate-suffix renames as unsafe (previously under-reported).
enum --hash is now collision-resistant across multi-column inputs (previously ["ab","c"] and ["a","bc"] hashed identically).
excel --metadata csv column ordering now actually matches its header row (previously the type, visible, and headers columns held each other's values).
util::safe_header_names now enforces its 60-char cap in bytes end-to-end (previously chars-based, allowing UTF-8 names up to 240 bytes — past Postgres' maximum identifier length).
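The enum --hash collision is easy to reproduce in miniature. The Python sketch below shows why hashing concatenated fields collides and how a per-field u64 length prefix removes the ambiguity. It is an illustration only: SHA-256 stands in for whatever hash qsv uses, the little-endian byte order of the prefix is an assumption, and both function names are hypothetical.

```python
import hashlib
import struct


def naive_digest(fields):
    """Hash the fields back to back: field boundaries are lost."""
    h = hashlib.sha256()
    for field in fields:
        h.update(field.encode("utf-8"))
    return h.hexdigest()


def prefixed_digest(fields):
    """Hash each field preceded by its byte length as a u64."""
    h = hashlib.sha256()
    for field in fields:
        data = field.encode("utf-8")
        h.update(struct.pack("<Q", len(data)))  # little-endian is assumed
        h.update(data)
    return h.hexdigest()


if __name__ == "__main__":
    print(naive_digest(["ab", "c"]) == naive_digest(["a", "bc"]))
    print(prefixed_digest(["ab", "c"]) == prefixed_digest(["a", "bc"]))
```

The prefix also changes the bytes fed to the hash for single-column inputs, which is why every previously stored digest stops matching.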

Plus a few smaller but breaking corrections: headers --intersect is renamed to --union (the flag never computed an intersection), luau qsv_loadcsv headers are now 1-indexed per Lua convention, and MSRV is bumped to Rust 1.95.

Beyond the cleanup, this release adds one new top-level command:

NEW implode command: the inverse of explode. Groups rows by key column(s) and joins a value column into a single delimited string per group — useful for collapsing normalized output back into compact form.
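A rough Python model of the implode behavior described above: rows are grouped by key column(s) and one value column is joined into a delimited string per group. The separator, the first-seen ordering of groups, and the grouping of non-adjacent rows are assumptions made for this sketch, not documented qsv behavior, and the function name simply mirrors the command.

```python
def implode(rows, key_cols, value_col, sep="|"):
    """Collapse rows sharing the same key into one row per key."""
    groups = {}  # dicts preserve insertion order: keys stay first-seen
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        groups.setdefault(key, []).append(row[value_col])
    result = []
    for key, values in groups.items():
        out = dict(zip(key_cols, key))
        out[value_col] = sep.join(values)
        result.append(out)
    return result


if __name__ == "__main__":
    rows = [
        {"id": "1", "tag": "a"},
        {"id": "2", "tag": "x"},
        {"id": "1", "tag": "b"},
    ]
    print(implode(rows, ["id"], "tag"))
```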

And a notable performance win:

frequency: parallel tree-reduce of partial frequency tables delivers a ~1.3x speedup on multi-core machines. Smaller per-command perf wins also landed in fill (+22%), datefmt (+9%), cat, dedup, replace, search/searchset, and transpose.
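The tree-reduce idea can be sketched in Python as follows. Each chunk of the input gets its own partial frequency table, and the tables are then merged pairwise, level by level, until one remains. This sketch runs sequentially and only shows the shape of the reduction; in a parallel setting the pair merges within a level are independent of each other. The function names and chunking scheme are illustrative, not qsv's.

```python
from collections import Counter


def partial_tables(values, chunk_size):
    """Build one frequency table per fixed-size chunk of the input."""
    return [
        Counter(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]


def tree_reduce(tables):
    """Merge tables pairwise, halving their number at every level."""
    if not tables:
        return Counter()
    level = list(tables)
    while len(level) > 1:
        merged = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                merged.append(level[i] + level[i + 1])
            else:
                merged.append(level[i])  # odd table out moves up as-is
        level = merged
    return level[0]


if __name__ == "__main__":
    values = list("abracadabra")
    print(tree_reduce(partial_tables(values, 3)))
```

Compared with folding every partial table into a single accumulator one after another, the pairwise shape needs only about log2(n) merge levels for n tables.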

Detailed MCP Server and Cowork Plugin changes are documented in the MCP Server/Cowork Plugin CHANGELOG.

Important

This is a major release with breaking changes. Pipelines that consume qsv excel --metadata csv by column position, store qsv enum --hash digests across versions, parse qsv safenames verify-mode output, or invoke qsv headers --intersect will need updates. See the Changed and Removed sections below for migration notes.

Added

implode: new command — inverse of explode #3733 (closes #917)
generators: mark required options in help markdown and MCP skills #3734
sortcheck: add --numeric and --natural flags; allocation-free streaming loop #3756
exclude: add stdin support and memcheck #3749

Changed

BREAKING excel: --metadata csv column ordering for type, visible, and headers is corrected. Previously the CSV header row declared type, visible, headers but the data rows pushed values in the order headers, typ, visible, so under each named column the wrong values appeared (the type column held the headers list, visible held the type, and headers held the visibility). The CSV output now matches the --metadata json (SheetMetadata struct) field order: index, sheet_name, type, visible, headers, column_count, …. Pipelines that consumed qsv excel --metadata csv and indexed by column position must shift those three columns; consumers that indexed by header name see corrected values automatically.
BREAKING enum: --hash digest values change. The hashed input now carries a u64 length prefix per field (to fix the multi-column collision bug above), so every --hash digest differs from earlier qsv versions — single-column hashes change identity values too, and stored hashes from earlier qsv versions will not match. Same input still hashes deterministically across rows and runs in ≥ this version.
BREAKING luau: qsv_loadcsv now returns the headers table 1-indexed (per Lua convention). Scripts that accessed headers[0] or iterated for i = 0, #headers - 1 must shift to headers[1] and for i = 1, #headers (or ipairs(headers)). Previously headers[1] returned the second header.
BREAKING headers: rename --intersect to --union. The flag has always computed a deduplicated union of headers across inputs, not a true set intersection — the name was a long-standing misnomer. --intersect is removed entirely (no alias) given the surrounding breaking-change window. Migration: replace qsv headers --intersect … with qsv headers --union …; output is unchanged.
BREAKING safenames: verify-mode (--mode v / V / j / J) outputs change. (1) Verify counts now include header positions that would be renamed by the duplicate-suffix pass — inputs containing duplicate column names will report higher unsafe counts than earlier qsv versions; the count now matches what --mode a would actually rewrite. (2) --mode V / j / J displays unsafe-header strings with leading/trailing whitespace and surrounding " already trimmed (matching what the safe-rename pass actually evaluates), and duplicate_headers is now sorted alphabetically rather than appearing in undefined HashMap iteration order. Pipelines that parsed verbose/JSON output and depended on the old ordering or untrimmed strings must update.
BREAKING util::safe_header_names: the 60-length cap is now enforced in bytes on the final name, including any duplicate-disambiguation suffix. Previously the truncation was chars-based (take(60).sum()) and only applied to the base, so non-ASCII headers could produce up to ~240-byte names and duplicate-disambiguated headers added _<n> after truncation, pushing past Postgres' 63-byte identifier limit (NAMEDATALEN - 1). Now the rewrite path lowercases and prepends the leading-_ prefix before truncating, then snaps to a UTF-8 char boundary at ≤60 bytes. ASCII-only inputs see the same output as before for non-suffixed cases. Long ASCII headers that previously generated 61–63-char suffixed variants will be 1–2 chars shorter at the boundary. Headers containing multibyte UTF-8 (CJK, accented chars, emoji) that previously produced names >60 bytes will now be aggressively trimmed to fit. Affects every caller (safenames, applydp, apply, fetch, python); stored mappings keyed on the old over-long forms will not match.
describegpt: split process_phase_output into per-branch helpers (dictionary context-only, full dictionary, JSON, TSV, TOON, Markdown). No behavior change — same output, smaller functions.
luau: qsv_coalesce now stringifies non-string values (numbers and booleans render via to_string; nil / arrays / objects are skipped). Previously, numbers and booleans were silently treated as missing values via as_str().unwrap_or_default(). Scripts relying on qsv_coalesce(some_bool, fallback) to skip booleans will now return "true"/"false" for the boolean.
describegpt: per-phase helper split, widened cache key, ~21% LOC reduction #3720 #3721 #3722
frequency: parallel tree-reduce of partial FTables (~1.3x speedup) #3728
moarstats: collapse duplicated outlier bivariate scan; safety/perf cleanup, unit tests #3718 #3719
validate: use cold_hint (stabilized in Rust 1.95) #3717; correctness, perf cleanup #3743 #3779
frequency: correctness, perf, refactor cleanup #3745
apply: review-driven cleanup, perf #3741
template: subdir bug fix, lookup perf, render-error visibility, helper extraction #3740
dedup: allocation-free ignore-case #3754
datefmt: ~9% perf #3753
fill: ~22% faster hot path #3762
replace: streaming parallel write; dead match-flag tracking #3777
search/searchset: parallel memory streaming; --quick fixes; USAGE alignment #3776
cat: rowskey speedup #3750
transpose: correctness, perf cleanup, polish #3781
cleanup: rename fail_oom_clierror; surface geocode update-check error #3806
applied select clippy lints
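The difference between a chars-based and a bytes-based cap in the util::safe_header_names change above is easiest to see with multibyte text. The Python sketch below covers only the "snap to a UTF-8 char boundary at or below the byte cap" step. It does not model the lowercasing, prefixing, or duplicate-suffix handling, and the function name `truncate_utf8` is hypothetical.

```python
def truncate_utf8(name, max_bytes=60):
    """Trim name to at most max_bytes of UTF-8 without splitting a char."""
    data = name.encode("utf-8")
    if len(data) <= max_bytes:
        return name
    cut = max_bytes
    # UTF-8 continuation bytes look like 0b10xxxxxx. If the byte at the
    # cut point is one, the cut is mid-character, so back up to its start.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8")


if __name__ == "__main__":
    emoji_header = "\U0001F600" * 70
    chars_based = emoji_header[:60]
    print(len(chars_based.encode("utf-8")))
    print(len(truncate_utf8(emoji_header).encode("utf-8")))
```

Sixty 4-byte characters occupy 240 bytes, which is where the ~240-byte figure for the old chars-based cap comes from.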

Fixed

excel: review-driven cleanup of src/cmd/excel.rs — fix four correctness bugs. (1) Negative --sheet indices that overshot the sheet count silently selected a wrong sheet because the abs_diff clamp "bounced" past zero (e.g. --sheet -4 on a 3-sheet workbook returned the 2nd sheet); now errors with usage error: negative sheet index N is out of range for K sheets. (2) get_requested_range l...

13 Apr 02:56

jqnatividad

19.1.0

7702c79

19.1.0

[19.1.0] - 2026-04-13

Note

Self-update for the pre-built binaries was broken in qsv 18.0.0 and 19.0.0. This was caused by a bug in the self-update crate that has since been fixed.
WORKAROUND: Download qsv 17.0.0 which predates the self-update bug, and use its --update or --updatenow options to upgrade to the latest release.

Detailed MCP Server and Claude Cowork Plugin changes are documented in the MCP Server/Cowork Plugin CHANGELOG.

Added

pivotp: add group-by mode #3698 (closes #3697)
pivotp: expand smart aggregation with 7 more statistics #3699

Changed

self_update: show actual error message when available if self_update errors out
moarstats: use fused multiply add for theil_sum (perf)
Switch to crates.io mimalloc, removing git override
Add HTML anchors to some stats definitions

Fixed

Fix 10 documentation-codebase drifts found by audit #3689
Fix 10 documentation-codebase drifts found by audit #3692
Document index support for describegpt and join
Use latest upstream self_update (our PR merged)
Homebrew qsv distribution enables more features now

Dependencies

Bump polars to latest upstream
Ensure all polars_sql features are enabled
Bump jsonschema from 0.45.1 to 0.46.0 #3695
Bump pragmastat from 12.0.1 to 12.1.0 #3693
Bump qsv-stats from 0.48.1 to 0.48.2 #3702
Bump rand from 0.10.0 to 0.10.1 #3700
Bump tokio from 1.51.0 to 1.51.1 #3691
Use nightly-2026-04-01 (same as polars)
bump indirect dependencies

Full Changelog: 19.0.0...19.1.0

06 Apr 15:12

jqnatividad

19.0.0

67e7093

19.0.0

[19.0.0] - 2026-04-07 🔐 The "FAIR Answers" Release 📐

The Reproducibility Crisis in Scientific Research is one of the principal motivators for FAIR Principles in Data Management.

With AI increasingly used in data pipelines, the need for reproducibility and auditability has become even more critical as "hallucinations" and non-deterministic outputs are inherent challenges in Generative AI.

That's why in this release, we instrumented qsv with several features to help users track, audit, and reproduce their AI-assisted data wrangling workflows more effectively. As FAIR Principles do not only apply to data, we also want "FAIR Answers" - with the last R for "Reproducible":

Enhanced Logging: The qsv_log tool now supports structured logging with JSON output, making it easier to parse and analyze logs for reproducibility audits (note that this is only available from the qsv MCP Server).
NEW blake3 Command: A new blake3 command computes BLAKE3 hashes of files or data streams, providing a fast and reliable way to verify data integrity and track file versions in workflows. Unlike the oft-used SHA-256 hash, BLAKE3 is up to 16 times faster without sacrificing security, making it ideal for large datasets and iterative processing.
Cowork Project Reproducibility Manifest: Building on the Cowork Project support released in 18.0.0, the qsv Cowork Plugin now creates a Project Reproducibility Manifest - a structured log of all prompts, commands, and outputs generated during a Cowork session. This manifest can be used for detailed audits of the data wrangling process, helping users understand how specific outputs were derived and enabling them to reproduce or modify the workflow with confidence.
Even Moarstats: The moarstats command gets even "moar" statistical tests and metrics (Trimean, Midhinge, Robust CV, Jarque-Bera, Theil Index, Mean Absolute Deviation and Simpson's Diversity Index), giving users deeper insights into their data distributions and relationships, which can be crucial for reproducibility in data analysis.
To Parquet Improvements: The to parquet command is re-added with a new implementation powered by Polars' LazyFrame API, providing faster and more reliable CSV-to-Parquet conversion with better schema inference and support for complex data types. New options like --infer-len and --try-parse-dates enhance the accuracy of type inference, further improving the fidelity of Parquet outputs for faster downstream analysis and reproducibility.
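Two of the moarstats additions listed above, Trimean and Midhinge, are simple combinations of quartiles: the trimean is (Q1 + 2·Q2 + Q3) / 4 and the midhinge is (Q1 + Q3) / 2. The Python sketch below computes both with the standard library. The quartile method ("inclusive") is an assumption; moarstats may derive its quartiles differently, so values on small samples could differ.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using the inclusive quantile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def trimean(values):
    """Tukey's trimean: the median counted twice, averaged with Q1 and Q3."""
    q1, q2, q3 = quartiles(values)
    return (q1 + 2 * q2 + q3) / 4


def midhinge(values):
    """Midpoint of the first and third quartiles."""
    q1, _, q3 = quartiles(values)
    return (q1 + q3) / 2


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
    print(trimean(data), midhinge(data), statistics.mean(data))
```

Both measures depend only on the quartiles, so a single extreme value such as the 100 in the usage example shifts the mean but leaves them unchanged.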

Detailed MCP Server and Cowork Plugin changes are documented in the MCP CHANGELOG.

Added

blake3: new BLAKE3 hashing command #3658
to parquet: re-add subcommand powered by Polars #3674
to parquet: pschema.json support, --infer-len and --try-parse-dates #3680
pivotp: totals support #3635
moarstats: even moar stats #3654

Changed

to parquet: use LazyFrame for parquet conversion #3679
tojsonl: implement proper JSONL writer instead of abusing CSV writer
Document first-N sampling; use to_string_lossy
help: suppress linebreaks for options by using non-breaking hyphens #3662
Switch default allocator from mimalloc to jemalloc - the default allocator of polars #3684
Add debug_assert! to moarstats map lookups
Remove some unwraps

Fixed

docs: fix 27 stale claims found in documentation audit #3637
docs: correct 5 documentation inaccuracies found during audit
typo: | character not escaped, prematurely truncating content

Dependencies

bump atoi simd and sysinfo #3663
bump cached from 0.58.0 to 0.59.0 by @dependabot[bot] in #3639
bump file-format from 0.28.0 to 0.29.0 by @dependabot[bot] in #3649
bump human-panic from 2.0.6 to 2.0.7 by @dependabot[bot] in #3661
bump human-panic from 2.0.7 to 2.0.8 by @dependabot[bot] in #3670
bump indexmap from 2.13.0 to 2.13.1 by @dependabot[bot] in #3671
bump jaq from 2 to 3; jaq-json from 1 to 2 #3653
bump jsonschema from 0.45.0 to 0.45.1 by @dependabot[bot] in #3685
bump lodash from 4.17.23 to 4.18.1 in /.claude/skills by @dependabot[bot] in #3669
bump minijinja from 2.18.0 to 2.19.0 by @dependabot[bot] in #3666
bump minijinja-contrib from 2.18.0 to 2.19.0 by @dependabot[bot] in #3665
bump path-to-regexp from 8.3.0 to 8.4.0 in /.claude/skills by @dependabot[bot] in #3652
bump polars to latest upstream at the time of release (rev efe654e)
bump pyo3 from 0.28.2 to 0.28.3 by @dependabot[bot] in #3667
bump redis from 1.0.5 to 1.1.0 by @dependabot[bot] in #3636
bump redis from 1.1.0 to 1.2.0 by @dependabot[bot] in #3677
bump rust_decimal from 1.40.0 to 1.41.0 by @dependabot[bot] in #3648
bump rustls-webpki from 0.103.9 to 0.103.10 by @dependabot[bot] in #3632
bump self_update from 0.43.1 to 0.44.0 by @dependabot[bot] in #3683
bump semver from 1.0.27 to 1.0.28 by @dependabot[bot] in #3678
bump tokio from 1.50.0 to 1.51.0 by @dependabot[bot] in #3672
bump toml from 1.0.7 to 1.1.0 by @dependabot[bot] in #3640
bump toml from 1.1.0 to 1.1.1 by @dependabot[bot] in #3660
bump toml from 1.1.1 to 1.1.2 by @dependabot[bot] in #3664

Full Changelog: 18.0.0...19.0.0

Contributors

dependabot

20 Mar 15:04

jqnatividad

18.0.0

cdaf62b

18.0.0

[18.0.0] - 2026-03-20 The "StatsSighting" Cowork Plugin Release

"StatsSighting" is like "VibeCoding" but for iterative, blazing-fast, deep data analysis. "Stats" for Statistics. "Sight" for Insight - doing a comprehensive statistical profile of datasets first to inform the analysis pipeline.

The Claude Cowork Plugin comes with several agents - the "Data Analyst Agent" for deep data exploration and analysis, the "Data Wrangler Agent" for transformation and cleaning, and the "Policy Analyst Agent" for helping with policy evaluation and decision-making. Each agent has a specific role and skill set, with a shared emphasis on leveraging the qsv MCP Server's profiling and querying capabilities to understand the data before acting on it.

The qsv MCP server received major enhancements - including session logging, DuckDB-powered Parquet conversion, SQL translation hardening, and interactive working directory elicitation.

The core qsv suite also gets significant updates in this release, including the new scoresql command for pre-query SQL analysis, smarter pragmastat with stats-cache integration and comparison mode, pivotp optimizations with moarstats awareness, and formatted table output for to.

Major Features

New `scoresql` Command

Analyze SQL queries against CSV file caches (stats, moarstats, frequency) to produce a performance score with actionable optimization suggestions before running the query. Scoring factors include query plan analysis (EXPLAIN), type optimization, join key cardinality, filter selectivity, anti-pattern detection (SELECT *, missing LIMIT, cartesian joins), and infrastructure checks (index files, cache freshness). Supports Polars and DuckDB modes, SQL file input, and JSON output. Integrates with describegpt for AI-assisted query review. #3612, #3616, #3624
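As a rough illustration of the anti-pattern checks listed above, the sketch below flags SELECT *, a missing LIMIT, and a comma join with no WHERE clause using plain regular expressions. It is not scoresql's implementation, which per these notes also draws on EXPLAIN plans and the stats caches; the function name `find_antipatterns` and the finding labels are hypothetical.

```python
import re

def find_antipatterns(sql: str) -> list[str]:
    """Return naive anti-pattern findings for a single SQL query."""
    text = " ".join(sql.split())  # collapse newlines and repeated spaces
    findings = []
    if re.search(r"\bselect\s+\*", text, re.IGNORECASE):
        findings.append("SELECT *")
    if not re.search(r"\blimit\s+\d+", text, re.IGNORECASE):
        findings.append("missing LIMIT")
    # Comma-separated tables in FROM with no WHERE clause: likely cartesian join
    from_clause = re.search(
        r"\bfrom\s+(.+?)(?:\bwhere\b|\bgroup\b|\border\b|\blimit\b|$)",
        text,
        re.IGNORECASE,
    )
    has_where = re.search(r"\bwhere\b", text, re.IGNORECASE)
    if from_clause and "," in from_clause.group(1) and not has_where:
        findings.append("possible cartesian join")
    return findings

print(find_antipatterns("SELECT * FROM orders, customers"))
print(find_antipatterns("select id from orders where id > 5 limit 10"))
```

A regex pass like this only catches textual patterns; it cannot judge join key cardinality or filter selectivity, which is where cached statistics come in.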

Smarter `pragmastat` — Stats-Cache Aware with Comparison Mode

pragmastat now reads the stats cache to automatically skip non-numeric/non-date columns, and writes its own results back to the cache for downstream commands. New --compare1 and --compare2 options let you compare two distributions side-by-side. Multiple performance optimizations make it significantly faster. #3591, #3593, #3596, #3595, #3611

`pivotp` — Smarter Pivoting with moarstats

pivotp now integrates with moarstats to auto-validate pivot column cardinality before execution, preventing overly wide output (>1000 columns) and guiding users toward better pivot strategies. #3606
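The cardinality check can be pictured with a minimal sketch: count the distinct values in the pivot column, since each one becomes an output column, and refuse to pivot when the count exceeds the limit. This version scans the data directly, whereas the notes say pivotp gets its figures from moarstats; the helper names `pivot_cardinality` and `check_pivot` are hypothetical.

```python
import csv
import io

MAX_PIVOT_COLUMNS = 1000  # width limit quoted in the release notes

def pivot_cardinality(csv_text: str, column: str) -> int:
    """Count distinct values in `column`; each one becomes an output column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return len({row[column] for row in reader})

def check_pivot(csv_text: str, column: str, limit: int = MAX_PIVOT_COLUMNS) -> int:
    """Return the cardinality, or raise if the pivot would be too wide."""
    n = pivot_cardinality(csv_text, column)
    if n > limit:
        raise ValueError(
            f"pivoting on {column!r} would create {n} columns (limit {limit})"
        )
    return n

data = "region,product,sales\nNE,apples,10\nNE,pears,5\nSW,apples,7\n"
print(check_pivot(data, "region"))  # two distinct regions: NE and SW
```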

`to` — Named Table Support

The to command gains a --table option for CSV, XLSX and ODS output, letting you write data to a named sheet/table in workbook formats. #3572, #3580

Detailed MCP changes are documented in the MCP CHANGELOG.

Added

scoresql: new command — score SQL queries for safety, complexity and performance #3612
scoresql: SQL file support, DuckDB PATH fallback & QSV_DUCKDB_PATH rename #3616
to: add --table option for CSV, XLSX and ODS output #3572, #3580
searchset: ignore line comments in regexset files #3622
pragmastat: add --compare1 and --compare2 options #3591
pragmastat: use stats cache to only process numeric/date/datetime columns #3593
pragmastat: write results to stats cache #3596
pragmastat: multiple performance optimizations #3595, #3611
pivotp: smarter pivoting with moarstats integration #3606
describegpt: scoresql integration #3624

Changed

stats: reduce day-valued precision to 5 decimals #3607
frequency: use array_windows for pairwise comparisons
Use mul_add for numeric ops across the codebase, so each multiply-add is fused (FMA) and rounded only once
MSRV bumped to latest stable Rust 1.94
Switch csvlens dependency to upstream
Polars bumped to 0.53.0 (py-1.39.x series)

Fixed

stats: fixed big performance regression caused by memory-aware chunking logic error #3598
help: fine-tune markdown generation of docopt usage text #3600

Dependencies

Polars 0.53.0 (py-1.39.3)
pragmastat 11.1.0 → 12.0.0 #3589
qsv-stats 0.47.0 → 0.48.0 #3587
jsonschema 0.44.0 → 0.45.0 #3592
minijinja/minijinja-contrib 2.16.0 → 2.18.0
calamine 0.33 → 0.34
cached 0.58 #3594
Removed patched forks of self_update and pragmastat (upstream releases available)
Various other dependency bumps (toml, toon-format, tempfile, redis, libc, sysinfo, once_cell, spreadsheet-ods)

Full Changelog: 17.0.0...18.0.0

Note

qsv 18.0.0 is not published to crates.io. qsv depends on an unreleased git revision of Polars, and cargo publish strips [patch.crates-io] entries, causing dependency resolution to fail against the published Polars v0.53.0 on crates.io (which caps chrono <=0.4.41, incompatible with chrono 0.4.44). This will be resolved once Polars publishes a new crates.io release with updated chrono support. In the meantime, install qsv via the prebuilt binaries, various package managers, or by building from source.

03 Mar 14:56

jqnatividad

17.0.0

9f11ef1

17.0.0

[17.0.0] - 2026-03-03 "The User 🧑🏻 and Agent 🤖 Experience (UAX) Release"

This release is all about getting Human Users and AI Agents working together in harmony to wrangle data faster and more effectively - whether you're a solo analyst or a data team using Claude Desktop/Cowork/Code or Gemini.

The UAX theme introduced in 16.1.0 reaches full stride — the new qsvmcp binary variant gives AI agents a purpose-built, leaner binary; the MCP server levels up with better tool guidance, TSV output for token efficiency, reproducibility logging, DuckDB-powered Parquet conversion, automatic moarstats enrichment, SQL translation hardening, and interactive working directory elicitation. On the core CLI side, stats cache reliability improves across delimiters and output formats, sniff resolves symlinks correctly, and moarstats gets faster hot-path performance.

Major Features

New `qsvmcp` Binary Variant

A purpose-built binary optimized for use with the qsv MCP server, adding session logging while dispensing with unneeded features (like apply, fetch, fetchpost, foreach, to) for a faster, smaller build. The MCP server now prefers qsvmcp with automatic fallback to the full qsv binary. qsvmcp is now included in release distributions alongside qsv, qsvlite, and qsvdp.

qsv MCP Server: Agent-Native Enhancements

The MCP server (now v17.0.0) receives its biggest update yet, with features designed to make AI agents more effective at data wrangling:

TSV Output Format — Default output switched to TSV for ~30% token reduction in agent responses, configurable via QSV_MCP_OUTPUT_FORMAT
Session Logging — New qsv_log tool and automatic qsvmcp.log audit trail for reproducibility, with configurable log levels via QSV_MCP_LOG_LEVEL
DuckDB Parquet Conversion — When DuckDB is available, CSV-to-Parquet conversion uses DuckDB instead of sqlp for faster, more reliable conversion
Auto-moarstats — moarstats automatically runs after stats execution for richer statistical context at minimal cost
SQL Translation Hardening — Major translateSql overhaul: unique table aliases (_tbl_N), string literal protection, user-provided alias preservation, and pre-scan qualified ref fixing
Working Directory Elicitation — Interactive directory picker via MCP Elicitation protocol for first-time setup
Reserved Cache Filename Guard — Prevents accidental --output overwrites of .stats.csv and .freq.csv cache files
Cache-Aware SQL Guidance — Server instructions now guide agents to leverage stats and frequency caches when composing sqlp, joinp, and pivotp queries
Polars SQL Engine Header — Clear engine indicator differentiates Polars SQL vs DuckDB query results
Absolute Path Resolution — All file-path arguments now resolved to absolute paths for robustness
Cowork CLAUDE.md Auto-Deploy — Automatically deploys project CLAUDE.md to Claude Cowork working folder on session start (cross-platform Node.js implementation)
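To make the Reserved Cache Filename Guard from the list above concrete, here is a minimal sketch of such a check. Only the two suffixes come from these notes; the function name `check_output_path` and the error message are hypothetical, and the MCP server's real guard may behave differently.

```python
import os

# Suffixes named in the release notes; everything else here is illustrative.
RESERVED_SUFFIXES = (".stats.csv", ".freq.csv")

def check_output_path(path: str) -> str:
    """Return `path` unchanged, or raise if it would clobber a cache file."""
    name = os.path.basename(path)
    if name.endswith(RESERVED_SUFFIXES):
        raise ValueError(f"refusing to write to reserved cache file: {name}")
    return path

print(check_output_path("out/result.csv"))
```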

Detailed MCP changes are documented in the MCP CHANGELOG.

Added

feat: qsvmcp binary variant — purpose-built for MCP server usage, included in release distributions

Changed

perf(moarstats): fix outlier key bug and optimize hot-path allocations
perf(stats): optimize to_record() output path and weighted_mad()
refactor(describegpt): simplify code for clarity and reduce redundancy
deps: bump pragmastat from 10.0 to 11.1.0
deps: bump polars to latest upstream (rev 802550b)
deps: bump Luau from 0.708 to 0.709
deps: bump chrono from 0.4.43 to 0.4.44
deps: bump csv-nose from 0.8.0 to 1.0.1
deps: bump jsonschema from 0.42 to 0.44.0
deps: bump strum/strum_macros from 0.27.2 to 0.28.0
deps: bump tempfile from 3.25.0 to 3.26.0
deps: bump serial_test from 3.3.1 to 3.4.0
deps: bump actions/upload-artifact from 6 to 7
deps: switch csvlens to patched fork using csv-nose 1.0.1
deps: update ort dependency to include tls-rustls feature (by @kulnor)
applied select clippy suggestions

Fixed

fix(stats): always write stats cache as CSV regardless of output format (Snappy, TSV, etc.)
fix(stats): decouple Snappy compression from cache — cache files always use comma delimiter
fix(sniff): resolve symlinks before MIME detection and metadata lookup (#3529)
fix(moarstats): harden outlier test assertion and fix comment inconsistency
fix(describegpt): restore error logging in Redis connection failure
docs: fix ~70 false claims found by documentation audits across qsv and MCP server

Full Changelog: 16.1.0...17.0.0

Note

qsv 17.0.0 is not published to crates.io. qsv depends on an unreleased git revision of Polars (rev = 802550b), and cargo publish strips [patch.crates-io] entries, causing dependency resolution to fail against the published Polars v0.53.0 on crates.io (which caps chrono <=0.4.41, incompatible with chrono 0.4.44). This will be resolved once Polars publishes a new crates.io release with updated chrono support. In the meantime, install qsv via the prebuilt binaries, Homebrew, or by building from source.

Contributors

kulnor

15 Feb 16:53

jqnatividad

16.1.0

376266e

16.1.0

[16.1.0] - 2026-02-15 📊 "The Accelerated Civic Intelligence (ACI) Release" 📊

Statistical analysis gets faster and more robust; User & Agent Experience (UAX) improvements keep the CLI parser, docs, shell completions, and MCP tool definitions in sync from a single source; and the qsv MCP Server gets leaner and smarter.

With a properly configured environment, a User can team up with several AI Agents for accelerated analysis of large, real-world, messy data (raw datasets, presentations, reports, spreadsheets, etc.) without uploading it all to the cloud or manually wrangling it into shape first. The result is analysis in a few minutes that would otherwise take a few days, if not a few weeks, to compile.

🌟 Major Features

New `pragmastat` Command

A pragmatic statistical toolkit by @AndreyAkinshin — Compute robust, median-of-pairwise statistics with the Pragmastat library. Designed for messy, heavy-tailed, or outlier-prone data where mean/stddev can mislead. See pragmastat.dev for details on the underlying algorithms and design philosophy.
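To show why pairwise medians resist outliers, the sketch below assumes the two classic estimators of this family: the median of pairwise averages (Hodges-Lehmann) for location and the median of pairwise absolute differences (Shamos) for scale. Whether these match Pragmastat's exact definitions is an assumption, so check pragmastat.dev; the function names are hypothetical and the O(n²) loops are for illustration only.

```python
from itertools import combinations, combinations_with_replacement
from statistics import mean, median

def pairwise_center(xs):
    """Median of all pairwise averages, including each value with itself."""
    return median((a + b) / 2 for a, b in combinations_with_replacement(xs, 2))

def pairwise_spread(xs):
    """Median of all pairwise absolute differences between distinct items."""
    return median(abs(a - b) for a, b in combinations(xs, 2))

xs = [1, 2, 3, 4, 100]      # one extreme outlier
print(mean(xs))             # 22: dragged upward by the outlier
print(pairwise_center(xs))  # 3.0
print(pairwise_spread(xs))  # 2.5
```

The single outlier moves the mean far from the bulk of the data, while both pairwise medians stay close to the four typical values.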

Frequency Cache System

New --frequency-jsonl option for the frequency command creates a JSONL cache (analogous to stats --stats-jsonl) that accelerates repeated frequency analysis. Uses a hybrid strategy for high-cardinality columns with configurable thresholds.

Improved UAX: Unified Documentation & Shell Completions

A new docopt-based parsing system now generates markdown documentation, shell completions, and MCP tool definitions from the same USAGE text that powers qsv's CLI parsing. Everything stays in sync automatically — no more drift between help text, docs, completions and AI tooling.

--generate-help-md flag produces polished markdown docs with section navigation, emoji legends, clickable URLs, and argument/option tables that are both Human and Agent-friendly.
Shell completions are now auto-generated, replacing 68 manually maintained completion files.

qsv MCP Server: Leaner Architecture

The qsv_pipeline tool has been removed in favor of direct sequential command execution. In practice, agents were already calling commands one at a time, and removing the pipeline abstraction made the server simpler, more predictable, and easier to debug. Additional MCP improvements include:

Extended AI agent guidance to take advantage of frequency and stats caches
Seamless support for Google Gemini CLI thanks to @kulnor's continuing contributions
Major codebase refactoring: deduplicated helpers, extracted filesystem tools, fixed any types, and various bug fixes

Detailed MCP changes are documented in the MCP CHANGELOG.

Added

feat: pragmastat command — pragmatic statistical toolkit with parallelism, progress bar, and memcheck (by @AndreyAkinshin)
feat: frequency --frequency-jsonl — JSONL frequency cache with hybrid strategy for high-cardinality columns
feat: --generate-help-md flag — auto-generate markdown docs from USAGE text with section navigation, emoji legends, and clickable URLs
docs: add QSV_FREQ_HIGH_CARD_THRESHOLD and QSV_FREQ_HIGH_CARD_THRESHOLD_PCT env vars

Changed

perf: stats — skip redundant modes tracking, reduce allocations, optimize cache line layout, deterministic antimode sorting
perf: pragmastat — reduce redundant computations, add parallelism
perf: frequency — use sort_unstable_by for faster sorting; parallel computation for high-cardinality columns
refactor: shell completions auto-generated from USAGE text (removed 68 manual files)
refactor: describegpt — disambiguate "Other" bucket from literal "Other" in Data Dictionary Examples column
deps: bump anstream from 0.6.21 to 1.0.0
deps: bump futures to 0.3.32
deps: bump jsonschema from 0.41 to 0.42
deps: bump libc from 0.2.180 to 0.2.181
deps: bump memmap2 from 0.9.9 to 0.9.10
deps: bump polars to latest upstream
deps: bump pyo3 from 0.28.0 to 0.28.1
deps: bump quickcheck from 1.0.3 to 1.1.0
deps: bump rand from 0.9 to 0.10, rand_hc to 0.5, rand_xoshiro to 0.8
deps: bump sysinfo from 0.37.2 to 0.38.2
deps: bump tempfile from 3.24.0 to 3.25.0
deps: bump toml from 0.9.12 to 1.0.1
deps: bump uuid from 1.20.0 to 1.21.0
deps: bump zmij from 1.0.20 to 1.0.21
deps: update csv patched fork MSRV to 1.93

Fixed

fix: frequency — normalize delimiter for cache compatibility; deterministic output with secondary sort key; hybrid cache for high-cardinality columns
fix: stats — remove unsafe block; deterministic antimode sorting
fix(help): section detection, acronym casing, and option word-wrap in markdown generation

Removed

removed 68 manual shell completion files (now auto-generated from USAGE text)

Full Changelog: 16.0.0...16.1.0

Contributors

kulnor and AndreyAkinshin

09 Feb 04:29

jqnatividad

16.0.0

692fa5e

16.0.0

[16.0.0] - 2026-02-08 🤖 "The AI-Native Release" 🤖

This release makes qsv deeply AI-native, from smarter date detection that flows through to Polars schemas to an MCP Plugin layer that lets AI agents wield qsv as a first-class data tool.

Claude Desktop, Code, and Cowork users can now use qsv's powerful data-wrangling capabilities directly within their AI workflows, with intelligent guidance and seamless integration. Google Gemini is now also supported thanks to @kulnor.

🌟 Major Features

Smarter Date/DateTime Detection

qsv can now automatically detect date and datetime columns and carry that knowledge through the entire pipeline:

stats --dates-whitelist sniff is now the default: qsv sniffs the first 1000 rows to identify candidate date/datetime fields, which then go through full date/datetime type inference
schema auto-detects Date/DateTime columns when generating Polars schemas (.pschema.json)
DateTime type support in Polars schema parsing — temporal types are preserved through sqlp, joinp, and Parquet conversion
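The sampling idea behind date sniffing can be sketched as follows: read only the first N rows and keep the columns whose non-empty sampled values all parse as dates. This is an illustration under stated assumptions, not qsv's detection logic: it accepts only ISO 8601 strings through `datetime.fromisoformat`, and the name `sniff_date_columns` is hypothetical.

```python
import csv
import io
from datetime import datetime
from itertools import islice

def looks_like_date(value: str) -> bool:
    """True if `value` parses as an ISO 8601 date or datetime."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False

def sniff_date_columns(csv_text: str, sample_rows: int = 1000) -> list[str]:
    """Return columns whose sampled non-empty values all look like dates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    candidates = set(reader.fieldnames or [])
    seen_value = set()  # columns with at least one parsed date in the sample
    for row in islice(reader, sample_rows):
        for col in list(candidates):
            value = (row.get(col) or "").strip()
            if not value:
                continue  # empty cells neither confirm nor reject a column
            if looks_like_date(value):
                seen_value.add(col)
            else:
                candidates.discard(col)
    return sorted(candidates & seen_value)

sample = "id,created,name\n1,2026-02-08,ann\n2,2026-02-09T10:00:00,bob\n"
print(sniff_date_columns(sample))
```

Because only a sample is inspected, a column can pass the sniff and still contain non-date values further down, which is why sniffed columns are best treated as candidates rather than confirmed types.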

Hardened Stats Cache

The stats cache system that accelerates frequency, schema, tojsonl, sqlp, joinp, pivotp, diff, and sample is now more robust:

Simplified API: Removed dataset_stats from get_stats_records(), streamlining all downstream consumers
Safe fallback: Corrupted or unparsable cache files are gracefully handled instead of erroring out
Auto-regeneration: Stats cache regenerates on parse error rather than failing
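The fallback-and-regenerate behavior described above follows a common caching pattern, sketched here with a JSON file standing in for the cache. The file format, the `load_stats` name, and the `compute` callback are all assumptions for illustration; qsv's stats cache uses its own formats and code paths.

```python
import json

def load_stats(cache_path: str, compute):
    """Load cached stats, or recompute and rewrite the cache if unusable."""
    try:
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, json.JSONDecodeError):
        # A missing, unreadable, or corrupted cache is treated as a miss
        stats = compute()
        with open(cache_path, "w", encoding="utf-8") as fh:
            json.dump(stats, fh)
        return stats
```

A caller would pass the cache location and a function that recomputes the statistics, for example `load_stats("data.stats.json", lambda: {"rows": 3})`; a corrupted file is then replaced transparently on the first call.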

Enhanced MCP Server (16.0.0)

The qsv MCP Server receives its largest update yet — see MCP CHANGELOG for full details.

Breaking Changes

diff command: --force option removed
- Was used for short-circuiting diffs based on dataset_stats
- No longer needed after stats cache API simplification
to command: parquet subcommand removed
- Use dedicated qsv_to_parquet MCP tool or sqlp for Parquet output

Added

feat: stats — add 'sniff' support for --dates-whitelist
feat: schema — auto-detect Date/DateTime columns for Polars schema via sniff
feat: Support DateTime type in Polars schema parsing

Changed

refactor: stats — make --dates-whitelist sniff the default
perf: Use foldhash HashMap/HashSet across codebase for faster hashing
- Replaces std::collections with foldhash in 14 modules
- foldhash is much faster than the default SipHash hasher behind std::collections for non-cryptographic hashing
refactor(stats): remove dataset_stats from stats cache system
- Simplified get_stats_records() API
- Centralized rowcount handling in sample command
- Adapted diff, pivotp, sample, and other commands to new API
refactor(stats): stats cache now regenerates on parse error (improved robustness)
refactor(stats): safe fallback on corrupted stats cache
refactor(pivotp): use sparsity for suggestions and uniqueness_ratio for pivot heuristics
refactor(sample): lazily compute row_count only for sampling methods that need it
deps: bump async-compression to 0.4.39
deps: bump bytes from 1.11.0 to 1.11.1
deps: bump calamine to 0.33
deps: bump csv-nose from 0.7.0 to 0.8.0
deps: bump csvlens to latest upstream (PR merged)
deps: bump geosuggest to latest upstream
deps: bump flate2 from 1.1.8 to 1.1.9
deps: bump jsonschema from 0.40.0 to 0.41 (latest upstream with unreleased perf improvements)
deps: bump polars from 0.52.0 at py-1.38.1 tag to 0.53
deps: bump pyo3 from 0.27.2 to 0.28.0
deps: bump redis from 1.0.2 to 1.0.3
deps: bump regex from 1.12.2 to 1.12.3
deps: bump reqwest from 0.13.1 to 0.13.2
deps: bump zerocopy from 0.8.35 to 0.8.36
deps: bump zip from 6 to 7
deps: bump zmij from 1.0.17 to 1.0.20
deps: bump bundled Luau from 0.706 to 0.708
deps: bump @modelcontextprotocol/sdk (MCP)
applied several clippy lint suggestions
applied several GH Copilot and Claude review suggestions

Fixed

fix(frequency): column selection when using --select option in different order
- Now lookup cardinality by column name instead of index
- Handles user-selected/reordered column subsets correctly
fix(sample): handle missing min weight in stats cache
fix(validate): adapt tests to jsonschema 0.40.2 error message format changes
fix(joinp): switch pschema serialization to serde_json for compound type support
fix(excel): adjust jsonl path usage caused by calamine 0.33 release
fix(stats): return sentinel when sniff finds no date columns
fix: config — QSV_NO_HEADERS environment variable being ignored; split no_headers into explicit setter and CLI flag method
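The frequency fix in this list (looking up cardinality by column name instead of index) addresses a class of bug that is easy to reproduce in miniature. The data and function names below are made up for illustration and are not taken from qsv's code.

```python
# (column name, cardinality) pairs in file order, as a stats cache might list them
stats = [("id", 1000), ("state", 50), ("city", 400)]
selected = ["city", "id"]  # user-selected, reordered subset of columns

def cardinalities_by_index(stats, selected):
    """Buggy: assumes the i-th selected column is the i-th column in the file."""
    return [stats[i][1] for i in range(len(selected))]

def cardinalities_by_name(stats, selected):
    """Correct: resolve each selected column through a name-keyed lookup."""
    lookup = dict(stats)
    return [lookup[name] for name in selected]

print(cardinalities_by_index(stats, selected))  # [1000, 50]: wrong columns
print(cardinalities_by_name(stats, selected))   # [400, 1000]
```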

Removed

removed to parquet subcommand in favor of dedicated qsv_to_parquet MCP tool and sqlp Parquet output support
removed cargo install instructions from README. qsv regularly uses patched forks, and crates published to crates.io cannot depend on git revisions, so qsv is rarely installable with cargo install.

Full Changelog: 15.0.1...16.0.0

Contributors

kulnor

28 Jan 12:38

jqnatividad

15.0.1

5ba35e7

15.0.1

[15.0.1] - 2026-01-28

Ooops, we celebrated color and the magika-powered revamped sniff but forgot to actually enable them in the release prebuilts! 🤦🏻‍♂️
This patch enables the new color command and turns on magika, along with several fixes and dependency bumps.

Changed

deps: bump polars to latest upstream
deps: bump csv-nose from 0.6.0 to 0.7.0
deps: bump mlua from 0.11.5 to 0.11.6
deps: bump minijinja from 2.14.0 to 2.15.1
deps: bump minijinja-contrib from 2.14.0 to 2.15.1
deps: bump siphasher from 1.0.1 to 1.0.2
deps: bump iana-time-zone from 0.1.64 to 0.1.65
deps: bump hono from 4.11.4 to 4.11.7 (MCP)
build: add color feature to build and test workflows
build: add magika feature to publishing workflows
docs: updated luau documentation to reflect bundled Luau 0.706
docs: sniff is now also 🤖-powered with its use of Magika mime-type detection

Fixed

tests: fix flaky color test_get_theme test (now ignored due to environment dependencies)
tests: fix flaky search JSON test by using semantic rather than byte-by-byte compare
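The difference between a semantic and a byte-by-byte JSON comparison can be shown in a few lines. This is a generic Python sketch of the idea, not the qsv test itself (which is written in Rust); `json_equal` is a hypothetical helper.

```python
import json

def json_equal(a: str, b: str) -> bool:
    """Compare two JSON documents by parsed value, ignoring key order and spacing."""
    return json.loads(a) == json.loads(b)

left = '{"a": 1, "b": [1, 2]}'
right = '{"b":[1,2],"a":1}'
print(left == right)            # False: the bytes differ
print(json_equal(left, right))  # True: same content
```

Array order still matters in a semantic comparison; only object key order and whitespace are ignored.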

Full Changelog: 15.0.0...15.0.1

Releases: dathere/qsv
