[fix](load) fix missing error url return for stream load #54115

Merged
dataroaring merged 1 commit into apache:master from liaoxin01:fix_error_url
Aug 4, 2025

Conversation

@liaoxin01
Contributor

liaoxin01 commented Jul 30, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:
When the pipe is cancelled, stream load execution may return early without the error URL, as in the result below (52 filtered rows, but no error URL in the response):
(test_stream_load_with_filtered_rows.groovy:64) - Stream load result: {
    "TxnId": 11,
    "Label": "2bbde37b-0589-4cce-8497-58e25af46590",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[CANCELLED]cancelled: [DATA_QUALITY_ERROR]Encountered unqualified data, stop processing. Please check if the source data matches the schema, and consider disabling strict mode or increasing max_filter_ratio.. cur path: ",
    "NumberTotalRows": 32512,
    "NumberLoadedRows": 32460,
    "NumberFilteredRows": 52,
    "NumberUnselectedRows": 0,
    "LoadBytes": 29494781,
    "LoadTimeMs": 2138,
    "BeginTxnTimeMs": 1,
    "StreamLoadPutTimeMs": 7,
    "ReadDataTimeMs": 24,
    "WriteDataTimeMs": 0,
    "ReceiveDataTimeMs": 1836,
    "CommitAndPublishTimeMs": 0
}
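
The symptom above can be checked mechanically. The following is a minimal sketch, not the project's regression test: it flags a stream load result that reports filtered rows but carries no error URL. The helper name `missing_error_url` is hypothetical, and the response key `ErrorURL` is an assumption about how the error URL is named in the result JSON; adjust it if the actual key differs.

```python
import json


def missing_error_url(result: dict, url_key: str = "ErrorURL") -> bool:
    """Return True when rows were filtered but no error URL is present.

    `url_key` is an assumed field name, not confirmed by this PR.
    """
    filtered = int(result.get("NumberFilteredRows", 0))
    return filtered > 0 and not result.get(url_key)


# Trimmed copy of the failing result quoted in the problem summary.
raw = """
{
  "Status": "Fail",
  "NumberTotalRows": 32512,
  "NumberLoadedRows": 32460,
  "NumberFilteredRows": 52
}
"""
result = json.loads(raw)
print(missing_error_url(result))
```

With the fix applied, the same check on a failed load with filtered rows would be expected to return False, because the response should then include the error URL.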

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@liaoxin01
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33931 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f92eac0bb19e29256f4b5e341b3c606df0de9f1f, data reload: false

------ Round 1 ----------------------------------
q1	17667	5665	5699	5665
q2	1940	293	191	191
q3	10261	1327	692	692
q4	10223	999	502	502
q5	7530	2394	2214	2214
q6	171	162	129	129
q7	896	755	592	592
q8	9303	1311	978	978
q9	6748	5073	5053	5053
q10	6879	2342	1972	1972
q11	453	268	264	264
q12	337	361	218	218
q13	17806	3588	3053	3053
q14	237	253	225	225
q15	517	482	457	457
q16	431	433	385	385
q17	564	823	374	374
q18	7441	7130	6940	6940
q19	2277	979	541	541
q20	325	308	213	213
q21	3342	3011	2285	2285
q22	1016	1037	988	988
Total cold run time: 106364 ms
Total hot run time: 33931 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5960	5798	5758	5758
q2	238	302	221	221
q3	2093	2539	2154	2154
q4	1288	1720	1324	1324
q5	4265	4427	4414	4414
q6	218	187	135	135
q7	1828	1918	1986	1918
q8	2605	2553	2371	2371
q9	7216	7279	7394	7279
q10	3148	3213	2932	2932
q11	581	594	484	484
q12	677	756	640	640
q13	3347	3675	3289	3289
q14	308	329	323	323
q15	509	459	455	455
q16	438	502	431	431
q17	1171	1480	1394	1394
q18	8186	7897	7834	7834
q19	11108	910	1001	910
q20	3809	1954	1757	1757
q21	14637	4163	4233	4163
q22	1119	1007	968	968
Total cold run time: 74749 ms
Total hot run time: 51154 ms
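
The column layout of the rows above is not labelled. The Round 1 figures are consistent with reading each row as query name, cold run, two hot runs, and the smaller of the two hot runs, with the totals being column sums. The sketch below recomputes the totals under that assumed reading; the function name `summarize` is hypothetical and not part of the benchmark scripts.

```python
def summarize(rows: str) -> tuple[int, int]:
    """Sum cold times and best-of-two hot times from whitespace-separated rows."""
    cold_total = 0
    hot_total = 0
    for line in rows.strip().splitlines():
        # Assumed layout: name, cold run, hot run 1, hot run 2, best hot run.
        _name, cold, hot1, hot2, *_rest = line.split()
        cold_total += int(cold)
        hot_total += min(int(hot1), int(hot2))
    return cold_total, hot_total


# First three rows of Round 1, copied from the report above.
sample = """
q1 17667 5665 5699 5665
q2 1940 293 191 191
q3 10261 1327 692 692
"""
print(summarize(sample))
```

Feeding all 22 Round 1 rows through the same function reproduces the reported totals of 106364 ms (cold) and 33931 ms (hot).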

@doris-robot

TPC-DS: Total hot run time: 161866 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f92eac0bb19e29256f4b5e341b3c606df0de9f1f, data reload: false

============================================
query1	1019	379	412	379
query2	6539	1696	1674	1674
query3	6736	218	223	218
query4	27114	23811	22784	22784
query5	4338	612	507	507
query6	310	248	228	228
query7	4622	517	295	295
query8	281	236	225	225
query9	8607	3114	3103	3103
query10	491	344	288	288
query11	15777	14981	14809	14809
query12	182	132	136	132
query13	1664	544	414	414
query14	9940	7625	7737	7625
query15	217	200	174	174
query16	7127	631	448	448
query17	976	782	639	639
query18	2003	438	326	326
query19	226	215	186	186
query20	157	144	154	144
query21	220	128	108	108
query22	3941	4175	3851	3851
query23	34441	33953	34028	33953
query24	7099	2383	2417	2383
query25	615	514	435	435
query26	723	289	160	160
query27	2164	479	359	359
query28	2857	2261	2259	2259
query29	670	615	475	475
query30	291	232	197	197
query31	860	790	716	716
query32	86	78	74	74
query33	534	413	360	360
query34	823	830	507	507
query35	799	826	777	777
query36	1019	1022	962	962
query37	137	108	93	93
query38	4024	4003	3868	3868
query39	1432	1379	1373	1373
query40	235	147	133	133
query41	61	57	53	53
query42	144	123	127	123
query43	501	502	466	466
query44	1415	863	859	859
query45	191	183	186	183
query46	1044	1052	675	675
query47	1813	1849	1731	1731
query48	389	440	309	309
query49	674	520	418	418
query50	654	680	400	400
query51	4134	4287	4125	4125
query52	128	126	117	117
query53	254	285	217	217
query54	641	658	551	551
query55	93	87	93	87
query56	346	354	350	350
query57	1169	1222	1159	1159
query58	338	324	325	324
query59	2616	2586	2552	2552
query60	398	377	410	377
query61	124	121	120	120
query62	788	748	657	657
query63	245	222	214	214
query64	2634	1087	842	842
query65	4237	4103	4118	4103
query66	1027	449	328	328
query67
query68	17130	867	853	853
query69	1172	296	279	279
query70	1360	1108	1178	1108
query71	720	316	321	316
query72	9261	2242	2115	2115
query73	3607	684	349	349
query74	9183	9060	8980	8980
query75	7571	3145	2641	2641
query76	8753	1209	836	836
query77	1159	409	337	337
query78
query79	18204	608	583	583
query80	1742	636	478	478
query81	539	279	246	246
query82	530	151	115	115
query83	322	277	271	271
query84	302	103	83	83
query85	1651	390	347	347
query86	350	336	300	300
query87	4311	4236	4110	4110
query88	5510	2210	2206	2206
query89	521	381	312	312
query90	2555	240	235	235
query91	141	135	109	109
query92	96	74	65	65
query93	6739	998	657	657
query94	1126	377	282	282
query95	423	328	321	321
query96	495	579	278	278
query97	2654	2751	2583	2583
query98	260	238	230	230
query99	1476	1420	1260	1260
Total cold run time: 297217 ms
Total hot run time: 161866 ms

@doris-robot

ClickBench: Total hot run time: 32.48 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f92eac0bb19e29256f4b5e341b3c606df0de9f1f, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.04	0.04
query3	0.24	0.08	0.07
query4	1.60	0.10	0.11
query5	0.45	0.40	0.41
query6	1.14	0.66	0.67
query7	0.03	0.02	0.02
query8	0.05	0.04	0.03
query9	0.55	0.49	0.46
query10	0.53	0.52	0.52
query11	0.16	0.10	0.10
query12	0.14	0.10	0.10
query13	0.65	0.63	0.62
query14	0.87	1.08	1.06
query15	0.97	0.88	0.92
query16	0.39	0.39	0.39
query17	1.06	1.02	1.06
query18	0.22	0.20	0.20
query19	1.95	1.79	1.91
query20	0.02	0.01	0.02
query21	15.39	0.85	0.57
query22	0.80	1.25	0.69
query23	14.80	1.12	0.64
query24	7.22	1.18	0.41
query25	0.48	0.20	0.09
query26	0.65	0.15	0.13
query27	0.09	0.05	0.05
query28	9.11	0.84	0.44
query29	12.64	3.84	3.29
query30	3.05	2.95	2.94
query31	2.81	0.56	0.39
query32	3.25	0.56	0.49
query33	3.02	3.14	3.15
query34	16.13	5.28	4.91
query35	4.86	4.95	4.95
query36	0.70	0.51	0.50
query37	0.10	0.07	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.04
query40	0.17	0.13	0.14
query41	0.08	0.04	0.03
query42	0.03	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 106.65 s
Total hot run time: 32.48 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.79% (16076/27816)
Line Coverage 46.53% (144606/310794)
Region Coverage 35.79% (108874/304166)
Branch Coverage 38.43% (48050/125023)

@github-actions
Contributor

PR approved by anyone and no changes requested.

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 81.20% (22171/27305)
Line Coverage 73.91% (229423/310421)
Region Coverage 61.67% (192334/311883)
Branch Coverage 65.37% (82622/126395)

@dataroaring
Contributor

dataroaring left a comment

LGTM

@github-actions
Contributor

github-actions bot commented Aug 4, 2025

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved label Aug 4, 2025
@dataroaring dataroaring merged commit 3af7aab into apache:master Aug 4, 2025
32 of 34 checks passed
github-actions bot pushed a commit that referenced this pull request Aug 4, 2025
when pipe is cancelled, stream load execution may return early without
error url.
(test_stream_load_with_filtered_rows.groovy:64) - Stream load result: {
    "TxnId": 11,
    "Label": "2bbde37b-0589-4cce-8497-58e25af46590",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[CANCELLED]cancelled: [DATA_QUALITY_ERROR]Encountered unqualified data, stop processing. Please check if the source data matches the schema, and consider disabling strict mode or increasing max_filter_ratio.. cur path: ",
    "NumberTotalRows": 32512,
    "NumberLoadedRows": 32460,
    "NumberFilteredRows": 52,
    "NumberUnselectedRows": 0,
    "LoadBytes": 29494781,
    "LoadTimeMs": 2138,
    "BeginTxnTimeMs": 1,
    "StreamLoadPutTimeMs": 7,
    "ReadDataTimeMs": 24,
    "WriteDataTimeMs": 0,
    "ReceiveDataTimeMs": 1836,
    "CommitAndPublishTimeMs": 0
}
morrySnow pushed a commit that referenced this pull request Aug 4, 2025
…54115 (#54267)

Cherry-picked from #54115

Co-authored-by: Xin Liao <liaoxin@selectdb.com>
dataroaring pushed a commit that referenced this pull request Aug 12, 2025
…54115 (#54266)

Cherry-picked from #54115

Co-authored-by: Xin Liao <liaoxin@selectdb.com>
Labels

approved (Indicates a PR has been approved by one committer.), dev/3.0.8-merged, dev/3.1.0-merged, p0_easy, reviewed
