feat(scoresql): SQL file support, DuckDB PATH fallback & QSV_DUCKDB_PATH rename#3616
Merged
Conversation
jqnatividad
commented
Mar 15, 2026
Collaborator
- SQL script files: scoresql now accepts .sql files as input — strips -- comments and scores the last query in the script
- DuckDB PATH fallback: --duckdb no longer requires QSV_DUCKDB_PATH — automatically finds duckdb in PATH if the env var is unset (cross-platform: which on Unix, where on Windows)
- Env var rename: QSV_DB_ENGINE → QSV_DUCKDB_PATH across all source, docs, and config — the old name suggested an engine selector, but the value is always a filesystem path
- Regex optimization: Alias replacement in both get_polars_plan and get_duckdb_plan now uses a single combined alternation regex (one compilation, one pass) instead of per-alias compilation
- String literal safety: DuckDB alias regex skips matches inside single-quoted SQL strings to prevent double-wrapping read_csv_auto() calls in .sql scripts
- New tests: scoresql_invalid_sql and scoresql_sql_script test cases with improved diagnostics
The old name sounded like an engine selector, but the value is always a filesystem path to the DuckDB binary. QSV_DUCKDB_PATH directly communicates what the variable holds. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add two integration tests for scoresql: 'scoresql_invalid_sql' verifies that malformed SQL yields a helpful stderr error, echoes the invalid query, includes a syntax hint, and returns a non-zero exit. 'scoresql_sql_script' creates a SQL script with comments and multiple queries, ensures the tool scores the last query, validates JSON output contains numeric score/max_score, and checks there is no SELECT * warning when the last query selects specific columns. These tests improve coverage for SQL error handling and multi-statement/script parsing behavior.
…ng regex - `should_use_duckdb()`: check for non-empty env var instead of checking if the path value contains "duckdb", since QSV_DUCKDB_PATH is a path that may not contain the substring "duckdb" - scoresql comment regex: add `(?m)` multiline flag so `^`/`$` match line boundaries, not just string boundaries - scoresql comment regex: use `LazyLock` to avoid recompiling on every invocation, consistent with the pattern in describegpt.rs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- scoresql_invalid_sql: capture output once instead of running the command twice (output_stderr + assert_err), check exit status from the same run - scoresql_sql_script: assert command succeeds before parsing stdout to give clear errors instead of opaque JSON parse panics - Use unwrap_or_default() for suggestion string checks to avoid panics on unexpected non-string elements Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-iteration regex compilation + multi-pass replacement with a single alternation regex and one-pass replacement in both get_polars_plan and get_duckdb_plan. Also hoist comment regex to module-level static and use rfind instead of filter+last. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t lists When `alias_pairs` or `replacements` is empty, the combined regex pattern `\b(?:)\b` matches every zero-width word boundary, causing the replacement closure to panic on a missing key lookup. Skip the regex step entirely when there are no aliases/replacements to process. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When --duckdb is used, try QSV_DUCKDB_PATH first (with full validation), then fall back to probing for "duckdb" in PATH. This lets users run --duckdb without configuring the env var if DuckDB is already installed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pes in SQL strings Resolve the full absolute path of `duckdb` via `which` when falling back to PATH lookup, for consistency with the env-var branch which stores an absolute path. Also extend the string-literal regex to handle backslash escapes (`\'`, `\\`) in addition to standard SQL `''` escaping. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace Unix-only `which` command with platform-conditional logic: `which` on Unix, `where` on Windows. Also handle `where` returning multiple lines by taking the first result. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s not to trigger typos test false positive
Contributor
There was a problem hiding this comment.
Pull request overview
Adds SQL script file support to scoresql, improves DuckDB binary discovery, and standardizes the DuckDB path environment variable name across the codebase and documentation.
Changes:
scoresqlaccepts.sqlfiles (strips full-line--comments and scores the last statement) and improves invalid-SQL diagnostics.- DuckDB path env var renamed to
QSV_DUCKDB_PATH, withscoresql --duckdbadding a PATH-based fallback when unset. - Alias replacement optimized to a single combined regex pass; new tests added for invalid SQL and SQL scripts.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
src/cmd/scoresql.rs |
Implements .sql script support, optimized alias replacement regexes, improved error reporting, and DuckDB PATH fallback resolution. |
tests/test_scoresql.rs |
Adds tests for invalid SQL and SQL script input; minor robustness tweaks to suggestion parsing. |
src/cmd/describegpt.rs |
Renames env var references to QSV_DUCKDB_PATH and updates DuckDB selection/path lookup logic accordingly. |
src/help_markdown_gen.rs |
Updates help-markdown generation test expectations for the renamed env var. |
docs/help/scoresql.md |
Documents .sql script usage and updates --duckdb env var naming. |
docs/help/describegpt.md |
Updates env var naming in generated help docs for describegpt. |
docs/ENVIRONMENT_VARIABLES.md |
Renames and updates the DuckDB env var entry. |
docs/Describegpt.md |
Updates the DuckDB env var name in narrative documentation. |
docs/describegpt/nyc311-describegpt-prompt.md |
Updates example to use QSV_DUCKDB_PATH. |
dotenv.template |
Updates the example DuckDB env var name. |
.claude/skills/qsv/qsv-scoresql.json |
Updates the tool metadata to reference QSV_DUCKDB_PATH and adds a .sql example. |
- ENVIRONMENT_VARIABLES.md: clarify that describegpt auto-switches to DuckDB when env var is set, while scoresql requires --duckdb flag - Describegpt.md: remove stale "path containing duckdb" condition - help/scoresql.md: update --duckdb description for PATH fallback Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.