[feature](agg) Add a knob to shuffled streaming agg #56956

Merged
Gabriel39 merged 2 commits into apache:master from Gabriel39:dev_1014 on Oct 15, 2025

Gabriel39 (Contributor) commented Oct 14, 2025

What problem does this PR solve?

Add a knob to control whether data fed into a streaming aggregation should be shuffled first.
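
As a rough illustration of what such a knob toggles, here is a toy Python sketch. It is not Doris code: the `shuffle_first` flag, the function names, and the sum aggregate are all hypothetical stand-ins. It assumes the usual two-phase shape, where each instance pre-aggregates its own input and a final step merges the partial results.

```python
from collections import defaultdict


def local_preaggregate(rows):
    """Sum values per key within a single instance's input."""
    acc = defaultdict(int)
    for key, value in rows:
        acc[key] += value
    return dict(acc)


def shuffle_by_key(instances, num_partitions):
    """Hash-partition all rows so that each key lands in exactly one partition."""
    partitions = [[] for _ in range(num_partitions)]
    for rows in instances:
        for key, value in rows:
            partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def aggregate(instances, shuffle_first):
    """Two-phase aggregation; `shuffle_first` is a hypothetical stand-in for the knob."""
    inputs = shuffle_by_key(instances, len(instances)) if shuffle_first else instances
    partials = [local_preaggregate(rows) for rows in inputs]
    final = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            final[key] += value
    return dict(final), partials


if __name__ == "__main__":
    data = [[("a", 1), ("b", 2)], [("a", 3), ("b", 4)]]
    print(aggregate(data, shuffle_first=False))
    print(aggregate(data, shuffle_first=True))
```

In this sketch the final result is the same either way. The difference is that with the shuffle each key is pre-aggregated in exactly one place, at the cost of moving every row first.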

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Thearas (Contributor) commented Oct 14, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Gabriel39 (Contributor, Author) commented

run buildall

hello-stephen (Contributor) commented

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.26% (1636/1965)
Line Coverage 67.75% (28902/42660)
Region Coverage 68.04% (14247/20939)
Branch Coverage 58.31% (7581/13002)

hello-stephen (Contributor) commented

FE UT Coverage Report

Increment line coverage 20.00% (4/20) 🎉
Increment coverage report
Complete coverage report

doris-robot commented

TPC-DS: Total hot run time: 190166 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2da6e43225d021ca0996dc0ab02efc833cf78080, data reload: false

query1	1062	433	408	408
query2	6549	1703	1687	1687
query3	6750	221	218	218
query4	26149	23455	23738	23455
query5	4462	680	476	476
query6	324	241	221	221
query7	4651	507	304	304
query8	305	259	245	245
query9	8719	2570	2556	2556
query10	515	337	289	289
query11	15718	15149	14767	14767
query12	181	126	116	116
query13	1691	565	445	445
query14	11117	9226	9154	9154
query15	216	192	173	173
query16	7762	689	572	572
query17	1355	829	692	692
query18	2071	450	352	352
query19	241	223	203	203
query20	153	138	171	138
query21	220	144	129	129
query22	4675	4602	4837	4602
query23	35091	33736	33987	33736
query24	8690	2585	2540	2540
query25	677	588	517	517
query26	1255	286	181	181
query27	2957	538	373	373
query28	4429	2239	2209	2209
query29	858	683	497	497
query30	381	257	219	219
query31	955	866	752	752
query32	90	82	76	76
query33	604	389	348	348
query34	834	881	543	543
query35	826	902	779	779
query36	1046	1004	919	919
query37	114	106	82	82
query38	3565	3564	3522	3522
query39	1475	1410	1408	1408
query40	226	127	116	116
query41	61	61	60	60
query42	126	119	116	116
query43	485	481	473	473
query44	1355	838	831	831
query45	186	174	173	173
query46	833	989	631	631
query47	1766	1806	1720	1720
query48	385	411	316	316
query49	761	493	408	408
query50	632	682	417	417
query51	3872	3903	3875	3875
query52	109	107	98	98
query53	240	259	197	197
query54	616	608	542	542
query55	88	88	79	79
query56	304	344	296	296
query57	1190	1207	1138	1138
query58	290	281	295	281
query59	2590	2629	2523	2523
query60	362	339	340	339
query61	156	156	158	156
query62	812	720	657	657
query63	239	202	193	193
query64	4596	1156	827	827
query65	4066	3963	3980	3963
query66	1109	424	324	324
query67	15513	15590	15126	15126
query68	8858	946	599	599
query69	477	317	285	285
query70	1383	1306	1281	1281
query71	501	338	311	311
query72	5764	4936	4882	4882
query73	715	585	352	352
query74	9270	9120	8781	8781
query75	4258	3336	2879	2879
query76	3744	1157	722	722
query77	814	406	307	307
query78	9571	9809	8920	8920
query79	1900	803	590	590
query80	648	579	507	507
query81	481	261	226	226
query82	396	160	133	133
query83	287	259	243	243
query84	248	112	102	102
query85	894	475	420	420
query86	341	314	326	314
query87	3764	4057	3598	3598
query88	3028	2215	2214	2214
query89	386	333	302	302
query90	2008	213	211	211
query91	174	164	137	137
query92	88	70	65	65
query93	1176	989	646	646
query94	685	454	330	330
query95	405	325	325	325
query96	492	591	275	275
query97	2966	2991	2890	2890
query98	229	219	216	216
query99	1467	1387	1325	1325
Total cold run time: 279119 ms
Total hot run time: 190166 ms

doris-robot commented

ClickBench: Total hot run time: 30.16 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2da6e43225d021ca0996dc0ab02efc833cf78080, data reload: false

query1	0.06	0.05	0.04
query2	0.10	0.06	0.05
query3	0.25	0.08	0.08
query4	1.60	0.12	0.13
query5	0.27	0.27	0.25
query6	1.21	0.65	0.64
query7	0.04	0.03	0.03
query8	0.06	0.05	0.06
query9	0.63	0.53	0.52
query10	0.58	0.62	0.58
query11	0.16	0.11	0.11
query12	0.16	0.12	0.12
query13	0.63	0.62	0.61
query14	1.03	1.04	1.04
query15	0.86	0.85	0.85
query16	0.40	0.40	0.38
query17	1.03	1.04	1.06
query18	0.22	0.20	0.20
query19	1.97	1.87	1.86
query20	0.02	0.02	0.01
query21	15.43	0.92	0.59
query22	0.76	1.14	0.66
query23	15.04	1.40	0.62
query24	7.19	0.75	0.38
query25	0.47	0.11	0.10
query26	0.64	0.17	0.15
query27	0.07	0.06	0.06
query28	9.52	1.39	0.92
query29	12.56	3.92	3.23
query30	0.28	0.13	0.11
query31	2.82	0.58	0.38
query32	3.23	0.55	0.47
query33	3.07	3.22	3.21
query34	16.18	5.51	4.92
query35	4.97	4.91	4.91
query36	0.72	0.51	0.50
query37	0.10	0.07	0.08
query38	0.08	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.14	0.15
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 104.79 s
Total hot run time: 30.16 s

hello-stephen (Contributor) commented

BE UT Coverage Report

Increment line coverage 11.11% (9/81) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.53% (17753/33793)
Line Coverage 37.74% (161445/427779)
Region Coverage 32.21% (123221/382607)
Branch Coverage 33.62% (54111/160945)

hello-stephen (Contributor) commented

BE Regression && UT Coverage Report

Increment line coverage 86.42% (70/81) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.10% (23553/33128)
Line Coverage 57.50% (245825/427525)
Region Coverage 52.53% (203599/387557)
Branch Coverage 54.41% (88048/161834)

hello-stephen (Contributor) commented

FE Regression Coverage Report

Increment line coverage 20.00% (4/20) 🎉
Increment coverage report
Complete coverage report

hello-stephen (Contributor) commented

BE Regression && UT Coverage Report

Increment line coverage 86.42% (70/81) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.09% (23550/33128)
Line Coverage 57.49% (245769/427525)
Region Coverage 52.55% (203649/387557)
Branch Coverage 54.39% (88019/161834)

github-actions (Contributor) commented

PR approved by at least one committer and no changes requested.

github-actions bot added the approved and reviewed labels Oct 15, 2025

github-actions (Contributor) commented

PR approved by anyone and no changes requested.

Gabriel39 merged commit 9bdc969 into apache:master Oct 15, 2025
26 of 28 checks passed
Gabriel39 added a commit to Gabriel39/incubator-doris that referenced this pull request Feb 6, 2026
Add a knob to control whether data fed into a streaming aggregation should be shuffled first.
Gabriel39 mentioned this pull request Feb 6, 2026
16 tasks
yiguolei pushed a commit that referenced this pull request Feb 7, 2026
### What problem does this PR solve?

Pick #60253 #60393 #60481 #56956 #60334 #60494

Labels

approved, dev/4.0.4-merged, reviewed