Skip to content

feat(scoresql): SQL file support, DuckDB PATH fallback & QSV_DUCKDB_PATH rename#3616

Merged
jqnatividad merged 13 commits into
masterfrom
scoresql-sqlfile-support
Mar 15, 2026
Merged

feat(scoresql): SQL file support, DuckDB PATH fallback & QSV_DUCKDB_PATH rename#3616
jqnatividad merged 13 commits into
masterfrom
scoresql-sqlfile-support

Conversation

@jqnatividad

Copy link
Copy Markdown
Collaborator
  • SQL script files: scoresql now accepts .sql files as input — strips -- comments and scores the last query in the script
  • DuckDB PATH fallback: --duckdb no longer requires QSV_DUCKDB_PATH — automatically finds duckdb in PATH if the env var is unset (cross-platform: which on Unix, where on Windows)
  • Env var rename: QSV_DB_ENGINE → QSV_DUCKDB_PATH across all source, docs, and config — the old name suggested an engine selector, but the value is always a filesystem path
  • Regex optimization: Alias replacement in both get_polars_plan and get_duckdb_plan now uses a single combined alternation regex (one compilation, one pass) instead of per-alias compilation
  • String literal safety: DuckDB alias regex skips matches inside single-quoted SQL strings to prevent double-wrapping read_csv_auto() calls in .sql scripts
  • New tests: scoresql_invalid_sql and scoresql_sql_script test cases with improved diagnostics

jqnatividad and others added 11 commits March 15, 2026 10:25
The old name sounded like an engine selector, but the value is always
a filesystem path to the DuckDB binary. QSV_DUCKDB_PATH directly
communicates what the variable holds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add two integration tests for scoresql: 'scoresql_invalid_sql' verifies that malformed SQL yields a helpful stderr error, echoes the invalid query, includes a syntax hint, and returns a non-zero exit. 'scoresql_sql_script' creates a SQL script with comments and multiple queries, ensures the tool scores the last query, validates JSON output contains numeric score/max_score, and checks there is no SELECT * warning when the last query selects specific columns. These tests improve coverage for SQL error handling and multi-statement/script parsing behavior.
…ng regex

- `should_use_duckdb()`: check for non-empty env var instead of checking
  if the path value contains "duckdb", since QSV_DUCKDB_PATH is a path
  that may not contain the substring "duckdb"
- scoresql comment regex: add `(?m)` multiline flag so `^`/`$` match
  line boundaries, not just string boundaries
- scoresql comment regex: use `LazyLock` to avoid recompiling on every
  invocation, consistent with the pattern in describegpt.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- scoresql_invalid_sql: capture output once instead of running the command
  twice (output_stderr + assert_err), check exit status from the same run
- scoresql_sql_script: assert command succeeds before parsing stdout to
  give clear errors instead of opaque JSON parse panics
- Use unwrap_or_default() for suggestion string checks to avoid panics
  on unexpected non-string elements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-iteration regex compilation + multi-pass replacement with
a single alternation regex and one-pass replacement in both
get_polars_plan and get_duckdb_plan. Also hoist comment regex to
module-level static and use rfind instead of filter+last.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t lists

When `alias_pairs` or `replacements` is empty, the combined regex pattern
`\b(?:)\b` matches every zero-width word boundary, causing the replacement
closure to panic on a missing key lookup. Skip the regex step entirely
when there are no aliases/replacements to process.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When --duckdb is used, try QSV_DUCKDB_PATH first (with full validation),
then fall back to probing for "duckdb" in PATH. This lets users run
--duckdb without configuring the env var if DuckDB is already installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pes in SQL strings

Resolve the full absolute path of `duckdb` via `which` when falling back
to PATH lookup, for consistency with the env-var branch which stores an
absolute path. Also extend the string-literal regex to handle backslash
escapes (`\'`, `\\`) in addition to standard SQL `''` escaping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace Unix-only `which` command with platform-conditional logic:
`which` on Unix, `where` on Windows. Also handle `where` returning
multiple lines by taking the first result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds SQL script file support to scoresql, improves DuckDB binary discovery, and standardizes the DuckDB path environment variable name across the codebase and documentation.

Changes:

  • scoresql accepts .sql files (strips full-line -- comments and scores the last statement) and improves invalid-SQL diagnostics.
  • DuckDB path env var renamed to QSV_DUCKDB_PATH, with scoresql --duckdb adding a PATH-based fallback when unset.
  • Alias replacement optimized to a single combined regex pass; new tests added for invalid SQL and SQL scripts.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/cmd/scoresql.rs Implements .sql script support, optimized alias replacement regexes, improved error reporting, and DuckDB PATH fallback resolution.
tests/test_scoresql.rs Adds tests for invalid SQL and SQL script input; minor robustness tweaks to suggestion parsing.
src/cmd/describegpt.rs Renames env var references to QSV_DUCKDB_PATH and updates DuckDB selection/path lookup logic accordingly.
src/help_markdown_gen.rs Updates help-markdown generation test expectations for the renamed env var.
docs/help/scoresql.md Documents .sql script usage and updates --duckdb env var naming.
docs/help/describegpt.md Updates env var naming in generated help docs for describegpt.
docs/ENVIRONMENT_VARIABLES.md Renames and updates the DuckDB env var entry.
docs/Describegpt.md Updates the DuckDB env var name in narrative documentation.
docs/describegpt/nyc311-describegpt-prompt.md Updates example to use QSV_DUCKDB_PATH.
dotenv.template Updates the example DuckDB env var name.
.claude/skills/qsv/qsv-scoresql.json Updates the tool metadata to reference QSV_DUCKDB_PATH and adds a .sql example.

Comment thread docs/ENVIRONMENT_VARIABLES.md Outdated
Comment thread docs/Describegpt.md Outdated
Comment thread src/cmd/scoresql.rs
Comment thread src/cmd/scoresql.rs
Comment thread src/cmd/scoresql.rs
Comment thread docs/help/scoresql.md Outdated
Comment thread .claude/skills/qsv/qsv-scoresql.json
- ENVIRONMENT_VARIABLES.md: clarify that describegpt auto-switches to
  DuckDB when env var is set, while scoresql requires --duckdb flag
- Describegpt.md: remove stale "path containing duckdb" condition
- help/scoresql.md: update --duckdb description for PATH fallback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jqnatividad jqnatividad merged commit 1aa0fe4 into master Mar 15, 2026
27 checks passed
@jqnatividad jqnatividad deleted the scoresql-sqlfile-support branch March 15, 2026 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants