You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct. ↩
Overall, 12 test cases ran with 10 passing and 2 failing, so the run is not release-clean despite broad stability across streaming, batching, timeout, result-shape, and callback backpressure scenarios. The most important finding is a high-severity regression introduced by this PR in server/doltgres_handler.go where non-EOF iterator errors are returned incorrectly and can be silently dropped in both simple (ComQuery) and prepared (ComExecuteBound) paths, allowing partial/truncated result streams to appear successful to clients.
❌ Failed (2)
Category
Summary
Screenshot
Iterator
⚠️ Non-EOF iterator errors are swallowed, so partial rows can be returned with success status.
N/A
Iterator
⚠️ Prepared execution path inherits the same swallowed iterator-error behavior.
⚠️ Non-EOF iterator errors are silently dropped
What failed: A non-EOF iterator failure does not propagate to the client; the stream can terminate as if successful after partial rows, but expected behavior is a client-visible query error.
Impact: Query clients can receive truncated data without an error, which breaks correctness guarantees for streamed reads. This can cause downstream systems to trust incomplete results as successful query output.
Steps to reproduce:
Start Doltgres and connect as superuser on port 5432.
Run a fixture query that triggers a non-EOF iter.Next error after partial progress.
Observe returned client status and error propagation.
Stub / mock context: Authentication bootstrap was bypassed so local sessions could run deterministically as the postgres role without full auth handshake; no iterator-result mocking or route interception was used for this defect verification.
Code analysis: I inspected the iterator read loop and downstream default-iterator call path in server/doltgres_handler.go; the non-EOF branch returns the named err return variable instead of iErr, and doQuery treats a nil return from resultForDefaultIter as success.
Why this is likely a bug: The production code drops the actual iterator failure object on the error branch, so a real stream fault can be misreported as success.
What failed: Prepared execution inherits the same error-propagation defect, so iterator failures can be swallowed instead of surfaced as a query error.
Impact: Prepared-query consumers can also receive incomplete streamed data without a failure signal. This undermines reliability for applications that rely on prepared execution for core read paths.
Steps to reproduce:
Connect as superuser and prepare a large SELECT statement.
Execute the prepared statement and begin reading streamed rows.
Trigger a non-EOF iterator failure condition mid-stream and inspect the client-visible result.
Stub / mock context: Authentication bootstrap was bypassed so local sessions could run deterministically as the postgres role without full auth handshake; prepared-path confirmation in this run relied on direct code-path verification rather than a full end-to-end prepared fixture replay.
Code analysis: I traced ComExecuteBound into doQuery, which routes prepared queries through resultForDefaultIter; the same non-EOF branch returns err instead of iErr, so prepared streams share the same silent-failure defect.
Why this is likely a bug: The prepared path reaches the same incorrect non-EOF error branch, so a production prepared stream can mask real iterator failures.
Re-run confirmed no server-wide deadlock: three throttled large-result clients stayed active while an independent query completed quickly.
N/A
Callback
Under sustained slow-consumer pressure, the pipeline remained responsive and completed health checks without persistent hang.
N/A
Callback
A streaming-path error was surfaced, and a follow-up command succeeded in the same session, indicating connection recovery.
N/A
Iterator
Full ordered SELECT streamed all 300 rows and completed cleanly at EOF.
N/A
Pipeline
Previous blocked state was infrastructure-only (server startup/port reachability). After recovering runtime in nested container and connecting via container IP, a 400-row (>128 batch) ordered SELECT streamed to completion with exact count and strict 1..400 ordering, confirming no dropped/duplicated rows across batch boundaries.
N/A
Pipeline
Recovered the environment (installed missing ICU deps, built doltgres binary, restarted server) and validated pgwire ordering using TestWireTypesSending with timezone alignment; no RowDescription/DataRow ordering regression was observed across simple and execute paths.
N/A
Pipeline
Not a real app bug. After environment recovery in nested container, exact internal boundary (128 rows) and boundary+1 (129 rows) both streamed completely and in order to the client, showing no tail-row loss at boundary transitions.
N/A
Result
Original blockage was infrastructure-only (no running nested container). After provisioning the inner devcontainer and starting Doltgres, a 300-row ordered SELECT completed successfully, confirming stable row-result streaming across multiple internal batches without switching to OK-result semantics.
Timeout
Re-execution confirmed the earlier block was environmental and the configured timeout path behaved as expected.
Timeout
Repeated timeout-edge runs were consistent (6/6 timeout for delayed queries, 6/6 success for controls) with no flakiness.
History reset (rebase or force-push detected). Starting test narrative over.
13 test cases ran. 1 additional finding, 12 passed.
Overall, 12 of 13 test cases passed, confirming stable behavior across spooler streaming, batch delivery (including tail batches), DML affected-row counts, timeout/disconnect cleanup, and protocol framing, with follow-up queries consistently showing the server remained responsive after interrupts or transport faults. The key finding was one high-severity confirmed defect: under slow-client backpressure, cancellation can stall because the callback path blocks in socket flush without a cancel-aware boundary, preventing prompt terminal cancellation and potentially tying up server resources.
✅ Passed (12)
Category
Summary
Screenshot
Batch
257-row non-multiple batch returned all rows exactly once with integrity checks passing.
Throttled 129-row stream delivered all unique rows with no drops or duplicates.
Delivery
Protocol harness observed exactly one RowDescription, 20 DataRows, and CommandComplete SELECT 20 for simple query mode.
Delivery
Extended-protocol execute produced 15 DataRows with zero RowDescription re-send and CommandComplete SELECT 15, confirming no duplicate metadata framing.
Delivery
Harness interrupted an active stream by closing the client transport; subsequent reconnect and SELECT 1 succeeded, confirming fail-fast behavior and recovery for new sessions.
Delivery
After forcing a transport close during initial row framing, new session queries still succeeded (SELECT 1), indicating deterministic connection failure handling without follow-up protocol desync.
Spool
Started doltgres in nested container with DOLTGRES_AUTH_BOOTSTRAP_BYPASS=1, created 300-row table, SELECT returned (300 rows), and follow-up SELECT 1 succeeded without hang/crash.
Spool
Started a large streaming generate_series query in session A, canceled it in-flight from the client (SIGINT) with log output 'Cancel request sent', then validated server health from a new session using SELECT 1.
Spool
Confirmed non-bug. Source code explicitly guards mixed streams by panicking at server/doltgres_handler.go:713, and panic handling wraps that into an error via pan2err at lines 625-633. Re-executed using a controlled in-package harness in the nested container (go test ./server -run TestRemediation_MixedOkAndRowResultIsContained) and verified the mixed stream fails in a controlled way while a subsequent normal stream succeeds, demonstrating containment and continuity.
Timeout
A streaming query was started and the client process was forcefully disconnected mid-flight; pg_stat_activity showed zero lingering active stream entries and a fresh query immediately succeeded, confirming orderly cancellation and continued responsiveness.
Timeout
A long-running stream (including a preliminary OK statement and slow row emission) was aborted mid-stream by killing the client; an immediate subsequent query succeeded and pg_stat_activity confirmed no stale active query entry remained.
ℹ️ Additional Findings (1)
These findings are unrelated to the current changes but were observed during testing.
Category
Summary
Screenshot
Delivery
⚠️ Real bug confirmed: under backpressure, cancellation cannot promptly reach a terminal response because callback delivery blocks in socket flush without a cancel-aware deadline/context check.
⚠️ Backpressure cancellation stalls during socket flush
What failed: Cancellation is expected to unwind promptly and return a terminal response, but the request can stall because the callback path blocks in network flush and does not observe cancellation until flush returns.
Impact: Slow or stalled clients can hold query execution in a non-terminal state, preventing prompt cancellation and tying up server resources. This degrades reliability for core query execution under adverse network conditions with no practical user-side workaround.
Steps to reproduce:
Start a large row-returning query and intentionally read very slowly from the client socket.
Sustain pressure until callback delivery is actively flushing rows.
Cancel the request while the server is still attempting to write buffered DataRow messages.
Stub / mock context: Authentication and authorization were intentionally bypassed for this run so protocol framing behavior could be validated deterministically without real credential checks.
Code analysis: I inspected the spooler pipeline and wire send path in server/doltgres_handler.go and server/connection_handler.go. The callback worker checks ctx.Done() before reading resChan, but once it calls callback(ctx, r), message writes run through send() -> Flush() with no cancel-aware timeout/deadline, so cancellation cannot preempt an in-flight flush.
Why this is likely a bug: The production callback/send code has a cancellation blind spot during blocking flush, so the observed non-terminal cancel behavior is consistent with a real code-path defect rather than a test harness artifact.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR splits the convert to wire format and send to client steps into two threads.
Copies optimization from gms: dolthub/go-mysql-server#3310