feat(describegpt): scoresql integration #3624

jqnatividad · 2026-03-17T18:46:54Z

No description provided.

…neration

Score LLM-generated SQL queries with `qsv scoresql` before execution, iteratively asking the LLM to improve queries that fall below a quality threshold. This produces better SQL and fewer failed executions.

New flags: --no-score-sql, --score-threshold (default 50), --score-max-retries (default 3).

Adds 8 integration tests covering polars and DuckDB backends.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
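The score-then-refine flow described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in `src/cmd/describegpt.rs`: the function `refine_until_good`, the `Scored` struct, and the two closures are hypothetical stand-ins. The `score` closure takes the place of a call to `qsv scoresql`, and `refine` takes the place of re-prompting the LLM. The threshold of 50 and retry limit of 3 mirror the defaults listed above.

```rust
/// Best SQL seen so far and the score it received.
#[derive(Debug, PartialEq)]
struct Scored {
    sql: String,
    score: u32,
}

/// Hypothetical refinement loop (names are assumptions, not qsv's API).
/// `score` stands in for running `qsv scoresql`; `refine` stands in for
/// asking the LLM to improve a low-scoring query.
fn refine_until_good<S, R>(
    initial_sql: &str,
    threshold: u32,
    max_retries: u32,
    mut score: S,
    mut refine: R,
) -> Scored
where
    S: FnMut(&str) -> u32,
    R: FnMut(&str, u32) -> String,
{
    let mut current = initial_sql.to_string();
    let mut last_score = score(&current);
    let mut best = Scored { sql: current.clone(), score: last_score };

    for _retry in 0..max_retries {
        // Stop as soon as the best query meets the quality threshold.
        if best.score >= threshold {
            break;
        }
        current = refine(&current, last_score);
        last_score = score(&current);
        // Keep the highest-scoring query even if a later retry regresses.
        if last_score > best.score {
            best = Scored { sql: current.clone(), score: last_score };
        }
    }
    best
}

fn main() {
    // Toy scorer: penalize `SELECT *`. A real scorer would be far richer.
    let scorer = |sql: &str| -> u32 { if sql.contains("SELECT *") { 30 } else { 80 } };
    // Toy refiner: replace the wildcard with explicit columns.
    let refiner = |sql: &str, _score: u32| sql.replace("SELECT *", "SELECT id, name");

    let result = refine_until_good("SELECT * FROM t", 50, 3, scorer, refiner);
    println!("{} (score {})", result.sql, result.score);
}
```

With a threshold no score can reach, the loop scores the initial query once and then once per retry, so the retry limit bounds the number of LLM round trips.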

…placement in scoresql

- Track best SQL as a template with {INPUT_TABLE_NAME} placeholder instead of doing reverse replacement which corrupts SQL when file stem is a common word
- Use saturating_add for max_retries loop bound to prevent overflow
- Add explicit table name instructions to LLM refinement/error prompts
- Cap score_max_retries to 100 to prevent unreasonable values
- Add skip messages to scoresql tests for CI visibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
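To illustrate the first, second, and fourth bullets, here is a small sketch under stated assumptions. The helper names (`clamp_retries`, `total_attempts`, `render`) and the choice of `u32` for the retry count are hypothetical; the actual types and structure in `describegpt.rs` may differ. The example uses a file stem of `name` to show how reverse replacement can corrupt a query that has a column with the same name.

```rust
/// Placeholder kept in the SQL template; the table name is filled in later.
const INPUT_TABLE_NAME: &str = "{INPUT_TABLE_NAME}";
/// Upper bound on retries, matching the cap of 100 in the commit message.
const RETRY_CAP: u32 = 100;

/// Clamp a requested retry count (hypothetical helper).
/// The bool reports whether the value was reduced.
fn clamp_retries(requested: u32) -> (u32, bool) {
    let clamped = requested.min(RETRY_CAP);
    (clamped, clamped != requested)
}

/// One initial attempt plus the retries, without overflowing at u32::MAX.
fn total_attempts(max_retries: u32) -> u32 {
    max_retries.saturating_add(1)
}

/// Forward substitution: fill the placeholder with a concrete table name.
fn render(template: &str, table: &str) -> String {
    template.replace(INPUT_TABLE_NAME, table)
}

fn main() {
    // The file stem "name" collides with a column called `name`.
    let template = "SELECT name FROM {INPUT_TABLE_NAME}";
    let concrete = render(template, "name");

    // Reverse replacement cannot tell the column apart from the table.
    let corrupted = concrete.replace("name", INPUT_TABLE_NAME);

    println!("template : {}", template);
    println!("concrete : {}", concrete);
    println!("reversed : {}", corrupted);
    println!("clamped  : {:?}", clamp_retries(250));
    println!("attempts : {}", total_attempts(u32::MAX));
}
```

Keeping the template as the source of truth means substitution only ever runs in the forward direction, so the ambiguity never arises.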

…eplacement

Replace blind `scoring_sql.replace(file_stem, INPUT_TABLE_NAME)` with a regex that only substitutes `file_stem` after FROM/JOIN keywords, preventing corruption of column names or literals that contain the file stem as a substring.

Also warn when --score-max-retries is silently clamped to 100.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… in scoresql

- Move Regex::new() before the retry loop since file_stem is invariant
- Extend pattern to match INTO/UPDATE keywords (not just FROM/JOIN)
- Handle quoted/backtick-delimited table names in the replacement regex
- Add safety comment about INPUT_TABLE_NAME and regex replacement chars

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
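The idea of substituting the table name only in table position can be sketched as below. Note the swap: the PR uses a regex compiled once before the retry loop, whereas this sketch uses a hand-rolled whitespace tokenizer so that it runs with the standard library alone. The function names `templatize` and `unquote` are hypothetical, and the sketch makes simplifying assumptions: it normalizes whitespace, drops the quotes around a replaced table name, and does not parse multi-word string literals.

```rust
const INPUT_TABLE_NAME: &str = "{INPUT_TABLE_NAME}";

/// Strip one layer of matching quote characters (", ', or `) from a token.
fn unquote(tok: &str) -> &str {
    for q in ['"', '\'', '`'] {
        if tok.len() >= 2 && tok.starts_with(q) && tok.ends_with(q) {
            return &tok[1..tok.len() - 1];
        }
    }
    tok
}

/// Replace `file_stem` with the placeholder only when it directly follows
/// FROM, JOIN, INTO, or UPDATE. Std-only stand-in for the PR's regex.
fn templatize(sql: &str, file_stem: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut after_keyword = false;

    for tok in sql.split_whitespace() {
        // Split off trailing punctuation such as `;`, `,`, or `)`.
        let core = tok.trim_end_matches(|c: char| matches!(c, ';' | ',' | ')'));
        let tail = &tok[core.len()..];

        if after_keyword && unquote(core) == file_stem {
            out.push(format!("{}{}", INPUT_TABLE_NAME, tail));
        } else {
            out.push(tok.to_string());
        }

        // Keywords are matched case-insensitively.
        after_keyword = matches!(
            tok.to_ascii_uppercase().as_str(),
            "FROM" | "JOIN" | "INTO" | "UPDATE"
        );
    }
    out.join(" ")
}

fn main() {
    let sql = "SELECT data_id FROM data WHERE note = 'data'";
    // Only the table reference changes; the column and literal survive.
    println!("{}", templatize(sql, "data"));
    println!("{}", templatize("select * from `data`;", "data"));
}
```

A blind `replace` on the first query would also rewrite `data_id` and the `'data'` literal, which is the corruption the commit message describes.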

Copilot

Pull request overview

Adds scoresql-based validation/refinement to describegpt’s SQL-RAG execution path, with new CLI flags and integration tests to exercise the behavior.

Changes:

- Introduces --no-score-sql, --score-threshold, and --score-max-retries options and wires them into the --prompt + --sql-results SQL execution flow.
- Adds a scoring loop that runs qsv scoresql --json and optionally re-prompts the LLM to iteratively improve low-scoring SQL.
- Adds new integration tests covering default scoring, disabling scoring, threshold/retry behavior, and DuckDB-backed scoring.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| `src/cmd/describegpt.rs` | Adds CLI flags plus SQL scoring/refinement logic using the `scoresql` subcommand before SQL execution. |
| `tests/test_describegpt.rs` | Adds integration tests for the new scoresql/scoring flags and retry/threshold behavior (including DuckDB cases). |

- Add success assertions to all scoresql integration tests to catch early failures before checking stderr
- Change threshold from 100 to 101 in high-threshold tests to eliminate flakiness (a perfect 100/100 score is possible)
- Fix misleading "Attempt" wording to "Retry" in refinement prompt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds scoresql validation/iteration to describegpt so LLM-generated SQL can be scored and optionally refined before execution when --prompt is used with --sql-results.

Changes:

- Introduces new CLI flags to control SQL scoring (--no-score-sql, --score-threshold, --score-max-retries) and wires them into the SQL-execution path.
- Implements a scoring loop that calls the scoresql subcommand, logs score/attempts, and optionally re-prompts the LLM to improve low-scoring SQL.
- Adds integration tests covering default scoring behavior, disabling scoring, thresholds, retry limits, and DuckDB scoring scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/cmd/describegpt.rs` | Adds CLI options + implements `scoresql`-based scoring/refinement before executing generated SQL. |
| `tests/test_describegpt.rs` | Adds integration tests validating the new scoring flags and retry/threshold behavior (including DuckDB cases). |

jqnatividad and others added 4 commits March 17, 2026 13:50

jqnatividad requested a review from Copilot March 17, 2026 18:47

Copilot started reviewing on behalf of jqnatividad March 17, 2026 18:47

Copilot AI reviewed Mar 17, 2026

jqnatividad requested a review from Copilot March 17, 2026 19:09

Copilot started reviewing on behalf of jqnatividad March 17, 2026 19:10

Copilot AI reviewed Mar 17, 2026

Comment thread tests/test_describegpt.rs

Comment thread tests/test_describegpt.rs

Comment thread src/cmd/describegpt.rs

jqnatividad merged commit e7ed7f0 into master Mar 17, 2026
20 of 21 checks passed

jqnatividad deleted the describegpt-scoresql-integration branch March 17, 2026 19:22

BrewTestBot mentioned this pull request Mar 23, 2026

qsv 18.0.0 Homebrew/homebrew-core#273698

Merged
