Skip to content

describegpt: split run_inference_options into per-phase helpers#3722

Merged
jqnatividad merged 5 commits into
masterfrom
refactor-describegpt-run-inference-options
Apr 20, 2026
Merged

describegpt: split run_inference_options into per-phase helpers#3722
jqnatividad merged 5 commits into
masterfrom
refactor-describegpt-run-inference-options

Conversation

@jqnatividad

Copy link
Copy Markdown
Collaborator

Summary

Continuation of the describegpt decomposition work (follows merged #3720 / #3721). The ~875-line run_inference_options is now a thin orchestrator over eight focused helpers:

  • build_inference_messages (was a nested get_messages closure) — promoted to file-scope free fn.
  • run_dictionary_phase, run_description_phase, run_tags_phase — one per inference kind.
  • run_prompt_phase — handles session loading, relevance-check, sliding-window, LLM call, and session-state mutation. Returns a PromptPhaseOutcome struct carrying the completion, has_sql_query flag, mutated session state, and the system prompt that the SQL phase reuses.
  • execute_sql_query_phase — extracts the ~400-line SQL-results block (file validation, fence extraction, scoresql retry loop, DuckDB/Polars dispatch, session tracking).
  • finalize_structured_output — JSON/TOON emission at end of run.

run_inference_options is now: create client, check_model, dispatch phases in order, apply max-tokens gate against the last phase's tokens, route SQL, finalize output, persist session.

Minor consolidations

  • get_prompt_file(args)?.base_url is now computed once at the orchestrator instead of 4 times (once per phase). One visible behavioral delta: a malformed prompt-file now errors before the first LLM call rather than after the Dictionary phase — strictly better UX (no tokens wasted on failure).
  • get_output_format(args)? is computed once and passed to each phase instead of re-derived per phase.

Test plan

  • cargo build --locked --bin qsv -F all_features
  • cargo clippy -F all_features --bin qsv
  • cargo t describegpt -F all_features — 64 integration tests pass
  • cargo test --locked -F all_features --bin qsv describegpt::tests — 13 unit tests pass
  • Manual smoke: run describegpt --all --prompt "..." --sql-results out.csv against a sample CSV to exercise Dictionary → Description → Tags → Prompt → SQL pipeline end-to-end.

Deferred (from the handoff plan)

Still open: run decomposition (~700 lines), cache.rs submodule extraction (gated on #[io_cached] proc-macro requalification), Tier 3 polish bundle (triplicated dictionary formatters, SQL-fence regex hoist, named constants, .clone() trims, parser unit tests), try_fix_json edge cases.

🤖 Generated with Claude Code

…pers

Decompose the ~875-line run_inference_options into eight focused helpers
while preserving all behavior:

- build_inference_messages (was nested get_messages closure) promoted
  to file-scope free fn.
- run_dictionary_phase
- run_description_phase
- run_tags_phase
- run_prompt_phase (returns a PromptPhaseOutcome struct carrying the
  completion, has_sql_query flag, mutated session state, and the
  system_prompt that execute_sql_query_phase reuses for refinement calls)
- execute_sql_query_phase — extracts the ~400-line SQL-results block
  (file validation, fence extraction, scoresql retry loop, DuckDB/Polars
  dispatch, session tracking)
- finalize_structured_output — JSON/TOON emission at end of run

run_inference_options is now a thin orchestrator: create client, check
model, dispatch phases in order, apply max-tokens gate, route SQL,
finalize output, persist session.

Minor consolidation along the way: get_prompt_file(args)? is called
once at the orchestrator (for base_url) instead of 4 times across the
phases; get_output_format(args)? is called once and passed through
instead of re-computed per phase. Error-ordering is preserved within
the phases; the only visible timing change is that a bad prompt-file
now errors before the first LLM call rather than after the dictionary
phase — arguably better UX (no tokens wasted on failure).

64 integration + 13 unit tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codacy-production

codacy-production Bot commented Apr 20, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

jqnatividad and others added 2 commits April 20, 2026 11:01
Three findings from the review of 514cacc:

1. **Medium — max-tokens regression for dictionary-only runs.**
   The original code left `completion_response` as `CompletionResponse::default()`
   when only the dictionary ran, so `--dictionary --max-tokens=N` with dictionary
   tokens ≥ N never tripped the gate. My `last_completion = data_dict.clone()`
   after the dictionary phase broke that invariant. Dropped the assignment and
   added a comment documenting why — the description/tags/prompt branches
   still overwrite `last_completion` correctly, and dictionary-only runs
   pass through as before.

2. **Low — 4 redundant String allocations per error path.**
   `execute_sql_query_phase` was calling
   `normalized_session_path.map(String::from).as_ref()` five times in error
   branches, each allocating a fresh String just to get `Option<&String>`.
   Bind once at the top of the function and pass `session_path_owned.as_ref()`
   everywhere.

3. **Low — session_len sentinel documentation.**
   `flag_session_len == 0` collapses "user did not set it" and "user set 0"
   into the same default (10). Preserved from pre-refactor behavior but
   worth a comment; added one in `run_prompt_phase`.

13 unit + 64 integration tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors describegpt’s large run_inference_options function into a thin orchestrator that delegates to focused per-phase helpers, while consolidating repeated prompt-file/output-format lookups.

Changes:

  • Promotes message-construction into build_inference_messages and splits inference into per-phase helpers (run_dictionary_phase, run_description_phase, run_tags_phase, run_prompt_phase).
  • Extracts the SQL execution/refinement path into execute_sql_query_phase and adds PromptPhaseOutcome to carry prompt-phase results forward.
  • Centralizes structured output emission via finalize_structured_output and computes base_url / output_format once at the orchestrator.

Comment thread src/cmd/describegpt.rs Outdated
Comment thread src/cmd/describegpt.rs Outdated
Comment thread src/cmd/describegpt.rs Outdated
Comment thread src/cmd/describegpt.rs
Comment thread src/cmd/describegpt.rs
Five findings, four of them variants of the same issue:

Four cache-invalidation sites in execute_sql_query_phase skipped
invalidation when cache_type == Fresh. But CacheType::Fresh still
writes the completion into the underlying Disk/Redis cache (see the
second get_*_completion call inside get_cached_completion at ~L2102,
comment on CacheType enum confirms: "Forces fresh API call but still
updates cache"). So a --fresh run that produced unusable SQL (failed
fence extraction, DuckDB execution error, Polars error) left the bad
Prompt response in the cache, to be reused by the next non-fresh run.

Changed the guard at all five sites from
    cache_type != &Fresh && cache_type != &None
to
    cache_type != &None
so invalidation fires whenever caching is active. This is a
pre-existing bug that became more visible after the phase split —
bundled here since it's a one-line correctness fix at call sites
the reviewer pointed to. None of this PR's structural changes affect
when or whether these invalidations fire.

Also taking a perf micro-fix from the review in build_inference_messages:
replaced `json!(messages)` (which re-serializes through the json! macro
even though messages is already Vec<serde_json::Value>) with
serde_json::Value::Array(messages), which is a direct constructor.

13 unit + 64 integration tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

Comment thread src/cmd/describegpt.rs Outdated
Comment thread src/cmd/describegpt.rs Outdated
Comment thread src/cmd/describegpt.rs Outdated
Three new findings from the latest Copilot review:

1. **dictionary_cache_type respects --no-cache.** In run_dictionary_phase,
   `--prompt` + `--fresh` was forcing Disk/Redis for the dictionary
   phase even when cache_type is None (user passed --no-cache). That
   silently re-enabled caching. Now gated on
   `cache_type != &CacheType::None` so --no-cache remains authoritative.
   Pre-existing behavior.

2. **Polars .csv sibling writability probe.** execute_sql_query_phase
   validated only the primary sql_results path, but the Polars path
   renames to `{sql_results}.csv` post-run. A pre-existing read-only
   `.csv` sibling would silently break the rename. Loop over
   `[sql_results_path, sql_results_path.with_extension("csv")]` and
   fail early if either exists read-only. DuckDB writes to the primary
   path so the .csv check is no-op there.

3. **TSV-aware "Output written to ..." status message.** Per-phase TSV
   helpers derive `{filestem}.{kind}.tsv` paths via get_tsv_output_path,
   so "Output written to {output}" is misleading for --format tsv.
   Surface the derivation pattern in the status message instead.

13 unit + 64 integration tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

@jqnatividad jqnatividad merged commit 13a6c5e into master Apr 20, 2026
21 checks passed
@jqnatividad jqnatividad deleted the refactor-describegpt-run-inference-options branch April 20, 2026 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants