[fix](nereids) stop merge project when generating huge expression #55293

Merged
924060929 merged 4 commits into apache:master from yujun777:fix-merge-project-with-huge-expression
Sep 1, 2025

@yujun777
Contributor

@yujun777 yujun777 commented Aug 26, 2025

What problem does this PR solve?

When merging projects, the parent project's slot references are replaced with the child project's original expressions.

For example:

LogicalProject ( k + k * 2 + k * 3)
|
LogicalProject ( a + b as k)

After replacing k with a + b, we get the merged project:

LogicalProject( (a + b) + (a + b) * 2 + (a + b) * 3)

Suppose the original parent project contains n occurrences of k; the merged project will then contain 3n related expressions, because a + b consists of 3 expressions (a, b, and the addition).
If there is a continuous chain of projects, the size of the final merged expression may grow exponentially as the projects are merged. It may eventually exceed the expression size limit and throw the exception "Exceeded the maximum children of an expression tree".

So when merging projects, if the merge would generate an expression whose size exceeds the limit, merging must stop.
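
The growth and the guard described above can be illustrated with a small standalone sketch. This is not the Doris implementation: the tuple-based expression encoding, the function names `size`, `substitute`, and `merge_projects`, and the `limit` parameter are all hypothetical and chosen only for illustration.

```python
# Hypothetical model, not Nereids code.
# An expression is a slot name (str), a literal (int),
# or a binary operation encoded as (op, left, right).

def size(expr):
    """Count the nodes in an expression tree."""
    if isinstance(expr, tuple):
        return 1 + size(expr[1]) + size(expr[2])
    return 1

def substitute(expr, bindings):
    """Replace every slot found in `bindings` with its defining expression."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, bindings), substitute(right, bindings))
    return bindings.get(expr, expr)

def merge_projects(chain, limit):
    """Merge a chain of projects, ordered from topmost parent to lowest child.

    Each project maps an output name to its expression. A parent/child pair
    is left unmerged when the merged expression would exceed `limit` nodes.
    """
    result = [chain[0]]
    for child in chain[1:]:
        parent = result[-1]
        merged = {name: substitute(e, child) for name, e in parent.items()}
        if any(size(e) > limit for e in merged.values()):
            result.append(child)  # too big: keep the child as its own project
        else:
            result[-1] = merged
    return result

# The example from the description: k + k * 2 + k * 3 over a + b as k.
parent = {"r": ("+", ("+", "k", ("*", "k", 2)), ("*", "k", 3))}
child = {"k": ("+", "a", "b")}
merged = merge_projects([parent, child], limit=100)[0]
print(size(parent["r"]), size(merged["r"]))

# A chain where every level doubles its input: k0 = k1 + k1, k1 = k2 + k2, ...
chain = [{f"k{i}": ("+", f"k{i + 1}", f"k{i + 1}")} for i in range(5)]
unbounded = merge_projects(chain, limit=10**9)
bounded = merge_projects(chain, limit=20)
print(len(unbounded), len(bounded))
```

In this model the doubling chain roughly doubles the merged expression at every level, so without a cap the size is exponential in the chain length, while a cap leaves the plan as several smaller projects instead of failing.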

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Aug 26, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777 yujun777 force-pushed the fix-merge-project-with-huge-expression branch from ae3c17f to 331b2f5 Compare August 27, 2025 07:02
@yujun777 yujun777 changed the title [draft](nereids) fix merge project with huge expression [fix](nereids) fix merge project generate huge expression Aug 27, 2025
@yujun777 yujun777 marked this pull request as ready for review August 27, 2025 10:18
@yujun777
Contributor Author

run buildall

1 similar comment
@yujun777
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34197 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 774ecbd9f5054e87d8b85079cbd0265f9f29ee75, data reload: false

------ Round 1 ----------------------------------
q1	17612	5258	5127	5127
q2	1973	352	209	209
q3	10233	1266	733	733
q4	10226	1029	517	517
q5	7542	2406	2358	2358
q6	181	166	138	138
q7	931	759	654	654
q8	9348	1351	1105	1105
q9	6904	5101	5081	5081
q10	6949	2367	1959	1959
q11	495	302	272	272
q12	367	356	227	227
q13	17782	3709	3043	3043
q14	238	254	218	218
q15	578	500	485	485
q16	439	431	404	404
q17	606	857	350	350
q18	7369	7107	7149	7107
q19	1289	959	579	579
q20	357	357	236	236
q21	3852	3230	2387	2387
q22	1058	1008	1008	1008
Total cold run time: 106329 ms
Total hot run time: 34197 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5245	5185	5199	5185
q2	253	332	227	227
q3	2183	2702	2334	2334
q4	1382	1754	1366	1366
q5	4230	4461	4519	4461
q6	220	179	136	136
q7	2035	1979	1900	1900
q8	2729	2586	2571	2571
q9	7281	7505	7340	7340
q10	3073	3355	2875	2875
q11	581	527	512	512
q12	703	820	657	657
q13	3704	3914	3371	3371
q14	284	301	279	279
q15	522	468	481	468
q16	456	497	444	444
q17	1201	1589	1394	1394
q18	7930	7672	7632	7632
q19	901	961	983	961
q20	2068	2097	1849	1849
q21	4865	4432	4396	4396
q22	1089	1059	1020	1020
Total cold run time: 52935 ms
Total hot run time: 51378 ms

924060929 previously approved these changes Aug 27, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 27, 2025
@github-actions
Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-DS: Total hot run time: 186206 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 774ecbd9f5054e87d8b85079cbd0265f9f29ee75, data reload: false

query1	1069	477	405	405
query2	6569	1753	1759	1753
query3	6752	235	222	222
query4	26268	23310	23200	23200
query5	4409	677	539	539
query6	349	256	218	218
query7	4663	527	298	298
query8	292	260	236	236
query9	8598	2882	2893	2882
query10	549	350	305	305
query11	15911	15427	14794	14794
query12	177	127	124	124
query13	1671	559	456	456
query14	8587	6014	5824	5824
query15	207	198	178	178
query16	7185	684	480	480
query17	1257	778	664	664
query18	2012	447	352	352
query19	216	212	186	186
query20	139	129	135	129
query21	221	129	123	123
query22	4007	4103	3964	3964
query23	33699	32663	32756	32663
query24	8196	2437	2453	2437
query25	609	548	471	471
query26	1250	280	174	174
query27	2723	523	355	355
query28	4403	2255	2237	2237
query29	823	617	494	494
query30	294	234	206	206
query31	891	818	785	785
query32	91	86	81	81
query33	583	399	370	370
query34	796	853	524	524
query35	810	835	756	756
query36	968	1036	862	862
query37	127	111	96	96
query38	4130	4020	3958	3958
query39	1496	1482	1420	1420
query40	234	139	126	126
query41	72	71	66	66
query42	137	117	119	117
query43	535	535	484	484
query44	1369	869	860	860
query45	185	173	176	173
query46	858	1045	648	648
query47	1778	1820	1725	1725
query48	392	441	329	329
query49	741	523	419	419
query50	665	688	395	395
query51	4048	4120	4044	4044
query52	119	116	107	107
query53	253	288	207	207
query54	611	600	538	538
query55	96	96	92	92
query56	353	337	324	324
query57	1184	1201	1142	1142
query58	300	296	287	287
query59	2718	2733	2563	2563
query60	363	353	353	353
query61	163	156	157	156
query62	855	725	660	660
query63	233	213	203	203
query64	4544	1121	825	825
query65	4349	4252	4234	4234
query66	1182	434	355	355
query67	15297	15159	14994	14994
query68	8760	940	601	601
query69	490	337	303	303
query70	1288	1130	1177	1130
query71	457	362	339	339
query72	5605	4879	4934	4879
query73	732	577	360	360
query74	8911	9114	8889	8889
query75	4094	3090	2646	2646
query76	3690	1238	780	780
query77	803	403	345	345
query78	9682	9799	8810	8810
query79	1837	818	593	593
query80	677	575	523	523
query81	471	259	227	227
query82	434	152	120	120
query83	294	267	253	253
query84	302	109	97	97
query85	866	463	423	423
query86	347	322	314	314
query87	4281	4329	4244	4244
query88	2795	2230	2225	2225
query89	403	330	294	294
query90	1926	232	229	229
query91	199	154	134	134
query92	91	75	73	73
query93	1133	973	665	665
query94	700	419	323	323
query95	406	330	322	322
query96	486	596	281	281
query97	2669	2714	2616	2616
query98	243	222	212	212
query99	1450	1394	1309	1309
Total cold run time: 273153 ms
Total hot run time: 186206 ms

@doris-robot

ClickBench: Total hot run time: 32.4 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 774ecbd9f5054e87d8b85079cbd0265f9f29ee75, data reload: false

query1	0.06	0.04	0.04
query2	0.09	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.12	0.12
query5	0.46	0.43	0.42
query6	1.18	0.64	0.66
query7	0.03	0.02	0.02
query8	0.06	0.04	0.05
query9	0.60	0.53	0.52
query10	0.58	0.58	0.59
query11	0.17	0.12	0.12
query12	0.15	0.12	0.12
query13	0.64	0.64	0.62
query14	0.80	0.83	0.85
query15	0.89	0.85	0.87
query16	0.39	0.40	0.40
query17	1.05	1.04	1.02
query18	0.22	0.20	0.22
query19	1.92	1.80	1.87
query20	0.01	0.01	0.02
query21	15.40	0.97	0.60
query22	0.79	1.17	0.76
query23	14.90	1.38	0.63
query24	7.13	1.17	0.34
query25	0.46	0.33	0.07
query26	0.55	0.17	0.14
query27	0.06	0.06	0.05
query28	9.82	0.95	0.43
query29	12.54	3.99	3.25
query30	3.14	3.08	3.05
query31	2.86	0.59	0.39
query32	3.24	0.56	0.47
query33	3.06	3.12	3.10
query34	16.05	5.45	4.85
query35	4.95	4.94	4.94
query36	0.70	0.50	0.50
query37	0.11	0.08	0.07
query38	0.06	0.05	0.04
query39	0.03	0.02	0.02
query40	0.17	0.16	0.14
query41	0.09	0.03	0.02
query42	0.04	0.03	0.02
query43	0.05	0.04	0.04
Total cold run time: 107.36 s
Total hot run time: 32.4 s

@yujun777 yujun777 force-pushed the fix-merge-project-with-huge-expression branch from 774ecbd to 4de2527 Compare August 28, 2025 06:57
@yujun777
Contributor Author

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Aug 28, 2025
@doris-robot

TPC-H: Total hot run time: 34087 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4de252711a4a2e1857b55465009a6612881118d2, data reload: false

------ Round 1 ----------------------------------
q1	17680	5308	5056	5056
q2	2037	324	206	206
q3	10217	1271	729	729
q4	10232	1024	538	538
q5	7551	2441	2311	2311
q6	190	179	140	140
q7	933	760	629	629
q8	9338	1315	1129	1129
q9	6930	5157	5107	5107
q10	6964	2399	1988	1988
q11	493	310	293	293
q12	359	362	232	232
q13	17773	3676	3075	3075
q14	234	241	223	223
q15	573	512	508	508
q16	430	428	393	393
q17	586	905	364	364
q18	7425	7081	7001	7001
q19	1420	961	568	568
q20	344	332	232	232
q21	3855	3224	2408	2408
q22	1081	1019	957	957
Total cold run time: 106645 ms
Total hot run time: 34087 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5245	5154	5166	5154
q2	256	333	226	226
q3	2175	2672	2336	2336
q4	1365	1769	1371	1371
q5	4240	4328	4614	4328
q6	224	176	134	134
q7	2015	2007	1880	1880
q8	2660	2592	2666	2592
q9	7383	7413	7291	7291
q10	3178	3327	2908	2908
q11	612	506	505	505
q12	707	806	689	689
q13	3625	4005	3324	3324
q14	280	306	275	275
q15	529	499	473	473
q16	470	506	461	461
q17	1155	1568	1452	1452
q18	7777	7823	7667	7667
q19	853	799	928	799
q20	1974	1992	1834	1834
q21	4846	4332	4404	4332
q22	1065	1033	1020	1020
Total cold run time: 52634 ms
Total hot run time: 51051 ms

@doris-robot

TPC-DS: Total hot run time: 188286 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4de252711a4a2e1857b55465009a6612881118d2, data reload: false

query1	1069	438	424	424
query2	6566	1801	1781	1781
query3	6746	236	239	236
query4	26601	23461	24219	23461
query5	4935	655	540	540
query6	337	261	242	242
query7	4838	511	310	310
query8	316	251	235	235
query9	9552	3021	3071	3021
query10	518	398	294	294
query11	16042	15198	14731	14731
query12	171	121	116	116
query13	1671	576	440	440
query14	9355	5818	5789	5789
query15	208	190	180	180
query16	7353	678	501	501
query17	1202	769	636	636
query18	2030	451	351	351
query19	210	206	182	182
query20	132	129	127	127
query21	234	147	126	126
query22	4245	4254	4100	4100
query23	34351	34106	33447	33447
query24	8293	2395	2426	2395
query25	637	557	489	489
query26	1257	282	177	177
query27	2740	513	417	417
query28	4374	2259	2237	2237
query29	822	624	493	493
query30	290	230	192	192
query31	920	793	782	782
query32	86	82	83	82
query33	587	393	362	362
query34	811	857	521	521
query35	830	843	774	774
query36	984	1012	905	905
query37	120	109	91	91
query38	4003	4057	4052	4052
query39	1510	1463	1427	1427
query40	226	137	124	124
query41	72	65	64	64
query42	126	115	117	115
query43	527	532	483	483
query44	1395	862	871	862
query45	183	170	178	170
query46	882	1025	649	649
query47	1827	1870	1796	1796
query48	388	439	321	321
query49	742	529	427	427
query50	634	697	409	409
query51	4129	4179	4150	4150
query52	111	111	103	103
query53	244	270	197	197
query54	632	617	549	549
query55	95	99	94	94
query56	334	333	310	310
query57	1215	1215	1143	1143
query58	297	283	287	283
query59	2773	2766	2614	2614
query60	364	357	349	349
query61	170	199	158	158
query62	807	739	688	688
query63	228	198	198	198
query64	4551	1145	833	833
query65	4299	4240	4228	4228
query66	1156	438	357	357
query67	15494	15380	15276	15276
query68	9098	953	609	609
query69	492	335	310	310
query70	1265	1128	1144	1128
query71	473	347	321	321
query72	5829	4931	5008	4931
query73	747	592	361	361
query74	8986	9125	8673	8673
query75	4297	3133	2726	2726
query76	3796	1140	762	762
query77	826	416	423	416
query78	9617	9806	8950	8950
query79	1980	876	602	602
query80	665	594	523	523
query81	483	258	239	239
query82	462	141	116	116
query83	276	266	258	258
query84	259	111	91	91
query85	894	464	427	427
query86	347	306	308	306
query87	4225	4327	4157	4157
query88	3167	2250	2256	2250
query89	401	321	310	310
query90	1946	238	234	234
query91	167	164	132	132
query92	91	79	76	76
query93	1117	989	662	662
query94	710	412	336	336
query95	414	334	335	334
query96	491	588	287	287
query97	2724	2677	2623	2623
query98	258	230	269	230
query99	1420	1422	1290	1290
Total cold run time: 278770 ms
Total hot run time: 188286 ms

@doris-robot

ClickBench: Total hot run time: 32.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4de252711a4a2e1857b55465009a6612881118d2, data reload: false

query1	0.06	0.06	0.05
query2	0.09	0.05	0.06
query3	0.25	0.09	0.09
query4	1.61	0.11	0.12
query5	0.47	0.42	0.43
query6	1.18	0.65	0.65
query7	0.03	0.02	0.02
query8	0.06	0.05	0.04
query9	0.60	0.55	0.54
query10	0.58	0.57	0.58
query11	0.17	0.11	0.11
query12	0.15	0.13	0.13
query13	0.64	0.63	0.62
query14	0.81	0.84	0.85
query15	0.87	0.86	0.88
query16	0.38	0.40	0.40
query17	1.06	1.08	1.04
query18	0.21	0.20	0.20
query19	1.99	1.80	1.82
query20	0.01	0.02	0.01
query21	15.39	0.95	0.59
query22	0.79	1.27	0.68
query23	14.78	1.43	0.67
query24	6.53	1.78	0.66
query25	0.50	0.21	0.15
query26	0.61	0.16	0.14
query27	0.06	0.06	0.06
query28	9.48	0.93	0.42
query29	12.57	3.90	3.23
query30	3.05	3.07	3.04
query31	2.82	0.61	0.39
query32	3.24	0.54	0.48
query33	3.08	3.21	3.09
query34	16.10	5.46	4.85
query35	4.96	4.94	4.97
query36	0.71	0.51	0.51
query37	0.10	0.07	0.07
query38	0.07	0.05	0.04
query39	0.03	0.03	0.04
query40	0.19	0.16	0.14
query41	0.09	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 106.45 s
Total hot run time: 32.8 s

@yujun777 yujun777 changed the title [fix](nereids) fix merge project generate huge expression [fix](nereids) stop merge project when generating huge expression Aug 29, 2025
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 29, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@924060929 924060929 merged commit 758059f into apache:master Sep 1, 2025
31 of 33 checks passed
morrySnow pushed a commit that referenced this pull request Sep 4, 2025
@morrySnow morrySnow mentioned this pull request Sep 22, 2025

Labels

approved Indicates a PR has been approved by one committer. dev/3.1.1-merged reviewed
