[opt](catalog) Reduce the memory footprint of Column #57401

Merged
morrySnow merged 1 commit into apache:master from morrySnow:compress_column_mem
Nov 6, 2025

Conversation

@morrySnow
Contributor

@morrySnow morrySnow commented Oct 28, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Remove the unused ColumnStats field from Column and store empty attributes (for example, a blank comment) as null, reducing the memory footprint of each Column object.
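As a rough illustration of the idea in the summary, the sketch below normalizes a blank attribute to null in the constructor and exposes a null-safe accessor. `ColumnSketch`, `getRawComment`, and `getCommentOrEmpty` are hypothetical names for this example only; this is not the actual Doris `Column` class, and it assumes Java 11+ for `String.isBlank()`.

```java
// Hypothetical stand-in for a catalog column; not the real Doris Column class.
final class ColumnSketch {
    private final String name;
    // null means "no comment": blank input is normalized so no empty String is retained.
    private final String comment;

    ColumnSketch(String name, String comment) {
        this.name = name;
        // Treat null, "" and whitespace-only input the same way.
        this.comment = (comment == null || comment.isBlank()) ? null : comment;
    }

    String getName() {
        return name;
    }

    // Returns the stored reference, which may be null.
    String getRawComment() {
        return comment;
    }

    // Null-safe accessor for callers that expect a String.
    String getCommentOrEmpty() {
        return comment == null ? "" : comment;
    }

    public static void main(String[] args) {
        ColumnSketch c = new ColumnSketch("k1", "   ");
        System.out.println(c.getRawComment() == null);
        System.out.println(c.getCommentOrEmpty().isEmpty());
    }
}
```

With this shape, callers that previously relied on a non-null value would switch to the null-safe accessor; whether the real change does the same is not shown in this thread.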

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow morrySnow force-pushed the compress_column_mem branch from 184d6e7 to 26a90d5 on October 28, 2025 04:07
@morrySnow
Contributor Author

run buildall

@doris-robot

TPC-DS: Total hot run time: 190662 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 26a90d541bdb190cbc45eb2a7f7fe011341f3516, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1058	435	405	405
query2	6603	1767	1704	1704
query3	6759	228	233	228
query4	26160	23687	23467	23467
query5	5484	635	529	529
query6	372	238	225	225
query7	4643	513	307	307
query8	304	273	269	269
query9	8744	2597	2631	2597
query10	558	364	295	295
query11	16205	15137	14897	14897
query12	184	120	110	110
query13	1668	569	432	432
query14	11605	9323	9398	9323
query15	216	199	176	176
query16	7687	680	524	524
query17	1620	805	680	680
query18	2138	466	365	365
query19	257	243	208	208
query20	147	139	148	139
query21	231	146	123	123
query22	4766	4613	4528	4528
query23	35200	33622	33991	33622
query24	8465	2546	2546	2546
query25	643	570	495	495
query26	1273	296	179	179
query27	2731	546	369	369
query28	4546	2297	2274	2274
query29	803	662	521	521
query30	325	233	214	214
query31	925	915	776	776
query32	88	77	83	77
query33	614	431	349	349
query34	845	893	556	556
query35	837	902	810	810
query36	1004	1043	927	927
query37	134	118	117	117
query38	3689	3736	3492	3492
query39	1531	1448	1438	1438
query40	216	132	117	117
query41	61	62	63	62
query42	122	118	114	114
query43	494	518	469	469
query44	1260	765	745	745
query45	181	177	180	177
query46	926	1027	667	667
query47	1732	1792	1743	1743
query48	394	429	321	321
query49	774	523	418	418
query50	690	708	413	413
query51	3952	3931	3919	3919
query52	115	115	102	102
query53	253	270	201	201
query54	620	602	540	540
query55	88	85	87	85
query56	328	331	320	320
query57	1182	1190	1110	1110
query58	290	289	269	269
query59	2596	2646	2567	2567
query60	363	342	346	342
query61	161	170	151	151
query62	828	777	673	673
query63	245	204	204	204
query64	4530	1245	882	882
query65	4028	3958	3955	3955
query66	1063	423	331	331
query67	15348	15022	14911	14911
query68	8323	966	606	606
query69	511	338	307	307
query70	1313	1348	1322	1322
query71	538	352	331	331
query72	5896	4917	4793	4793
query73	702	579	371	371
query74	9244	9070	8825	8825
query75	4095	3338	2837	2837
query76	3787	1182	775	775
query77	834	418	320	320
query78	9579	9596	8854	8854
query79	2755	799	612	612
query80	711	576	510	510
query81	492	254	235	235
query82	456	175	135	135
query83	301	280	254	254
query84	299	118	100	100
query85	898	483	439	439
query86	336	324	300	300
query87	3779	3763	3587	3587
query88	3034	2282	2282	2282
query89	406	321	303	303
query90	2067	224	222	222
query91	170	172	136	136
query92	85	71	70	70
query93	1265	984	648	648
query94	708	458	337	337
query95	409	336	318	318
query96	504	574	281	281
query97	2946	2976	2880	2880
query98	229	213	220	213
query99	1420	1426	1358	1358
Total cold run time: 281994 ms
Total hot run time: 190662 ms
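The two totals can be reproduced from the rows if each row is read as query name, cold run, two hot runs, and the best hot run. That column reading is an inference from the numbers (the bot output does not label the columns), so the sketch below is only a plausible reconstruction of the arithmetic; `BenchTotals` and its methods are hypothetical names.

```java
import java.util.List;

// Hypothetical helper; assumes whitespace-separated rows of the form
// "<query> <cold> <hot1> <hot2> [<best hot>]".
final class BenchTotals {
    // Sum of the first numeric column (assumed to be the cold run).
    static long coldTotal(List<String> rows) {
        long sum = 0;
        for (String row : rows) {
            String[] fields = row.trim().split("\\s+");
            sum += Long.parseLong(fields[1]);
        }
        return sum;
    }

    // Sum of the smaller of the two hot-run columns for each query.
    static long hotTotal(List<String> rows) {
        long sum = 0;
        for (String row : rows) {
            String[] fields = row.trim().split("\\s+");
            sum += Math.min(Long.parseLong(fields[2]), Long.parseLong(fields[3]));
        }
        return sum;
    }

    public static void main(String[] args) {
        // First three rows of the TPC-DS table above.
        List<String> rows = List.of(
                "query1\t1058\t435\t405\t405",
                "query2\t6603\t1767\t1704\t1704",
                "query3\t6759\t228\t233\t228");
        System.out.println("cold = " + coldTotal(rows) + " ms"); // cold = 14420 ms
        System.out.println("hot  = " + hotTotal(rows) + " ms");  // hot  = 2337 ms
    }
}
```

Applied to all 99 rows of this table, the per-query minimum column sums to the reported 190662 ms.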

@doris-robot

ClickBench: Total hot run time: 27.54 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 26a90d541bdb190cbc45eb2a7f7fe011341f3516, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.05	0.06
query3	0.26	0.09	0.08
query4	1.61	0.12	0.12
query5	0.28	0.27	0.25
query6	1.18	0.66	0.63
query7	0.03	0.02	0.02
query8	0.05	0.04	0.04
query9	0.63	0.52	0.51
query10	0.59	0.57	0.59
query11	0.17	0.11	0.11
query12	0.15	0.12	0.12
query13	0.62	0.61	0.61
query14	1.01	1.01	0.99
query15	0.84	0.82	0.85
query16	0.40	0.40	0.41
query17	0.99	1.01	1.02
query18	0.21	0.20	0.20
query19	1.91	1.85	1.87
query20	0.02	0.01	0.01
query21	15.42	0.17	0.14
query22	5.13	0.07	0.05
query23	15.66	0.26	0.10
query24	3.14	0.40	1.43
query25	0.08	0.06	0.06
query26	0.14	0.14	0.14
query27	0.07	0.05	0.06
query28	5.37	1.13	0.95
query29	12.59	3.94	3.27
query30	0.28	0.14	0.12
query31	2.82	0.59	0.38
query32	3.23	0.57	0.46
query33	3.07	2.97	3.09
query34	15.81	5.19	4.56
query35	4.54	4.57	4.57
query36	0.67	0.51	0.50
query37	0.09	0.07	0.07
query38	0.07	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.08	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 99.67 s
Total hot run time: 27.54 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 17.14% (12/70) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 78.57% (55/70) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 74.29% (52/70) 🎉
Increment coverage report
Complete coverage report

@morrySnow morrySnow force-pushed the compress_column_mem branch from 26a90d5 to d9b01da on October 30, 2025 06:27
@morrySnow
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 29.31 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d9b01dafee3f1b083783a1ad34230514fd1d057e, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.07	0.06	0.06
query2	0.11	0.06	0.05
query3	0.26	0.10	0.09
query4	1.61	0.12	0.12
query5	0.29	0.28	0.27
query6	1.20	0.69	0.68
query7	0.04	0.03	0.03
query8	0.07	0.05	0.06
query9	0.67	0.56	0.57
query10	0.62	0.62	0.63
query11	0.19	0.13	0.13
query12	0.18	0.14	0.14
query13	0.63	0.64	0.63
query14	1.06	1.04	1.02
query15	0.94	0.89	0.90
query16	0.42	0.46	0.44
query17	1.12	1.27	1.15
query18	0.23	0.22	0.22
query19	2.01	1.93	1.86
query20	0.02	0.01	0.02
query21	15.39	0.20	0.14
query22	5.08	0.09	0.05
query23	15.69	0.31	0.12
query24	2.51	0.63	0.93
query25	0.08	0.07	0.08
query26	0.17	0.15	0.15
query27	0.07	0.07	0.06
query28	4.85	1.21	0.97
query29	12.67	4.66	3.82
query30	0.30	0.15	0.12
query31	2.84	0.67	0.42
query32	3.24	0.58	0.48
query33	3.18	3.15	3.23
query34	15.91	5.22	4.61
query35	4.65	4.67	4.61
query36	0.69	0.54	0.51
query37	0.11	0.07	0.08
query38	0.07	0.04	0.04
query39	0.04	0.04	0.03
query40	0.18	0.15	0.14
query41	0.09	0.04	0.03
query42	0.05	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 99.65 s
Total hot run time: 29.31 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 64.29% (45/70) 🎉
Increment coverage report
Complete coverage report

@morrySnow morrySnow force-pushed the compress_column_mem branch from d9b01da to ab0b741 on November 4, 2025 04:06
@morrySnow
Contributor Author

run buildall

@morrySnow morrySnow marked this pull request as ready for review November 4, 2025 04:07
@doris-robot

TPC-DS: Total hot run time: 190019 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ab0b741e3d9068e12730f475e78f1e4023d62313, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1046	409	393	393
query2	6585	1709	1731	1709
query3	6759	228	226	226
query4	26404	23378	23377	23377
query5	4864	604	472	472
query6	328	253	225	225
query7	4676	494	293	293
query8	302	265	254	254
query9	8724	2566	2573	2566
query10	543	354	305	305
query11	15713	15039	14924	14924
query12	206	118	115	115
query13	1687	570	445	445
query14	12143	9551	9588	9551
query15	209	190	172	172
query16	7703	688	539	539
query17	1633	803	644	644
query18	2082	473	389	389
query19	257	226	191	191
query20	142	139	128	128
query21	214	138	129	129
query22	4433	4581	4424	4424
query23	35024	33811	33852	33811
query24	8525	2584	2556	2556
query25	710	563	474	474
query26	1293	281	157	157
query27	2923	525	375	375
query28	4402	2221	2211	2211
query29	805	638	509	509
query30	303	239	211	211
query31	979	868	784	784
query32	82	73	69	69
query33	595	443	341	341
query34	833	877	578	578
query35	850	851	812	812
query36	972	1001	894	894
query37	118	113	82	82
query38	3525	3607	3499	3499
query39	1454	1425	1604	1425
query40	219	130	116	116
query41	59	59	57	57
query42	128	110	107	107
query43	479	483	471	471
query44	1218	747	741	741
query45	180	179	179	179
query46	882	982	650	650
query47	1736	1793	1713	1713
query48	382	428	333	333
query49	780	499	436	436
query50	635	682	401	401
query51	3905	3982	3858	3858
query52	109	110	100	100
query53	236	272	196	196
query54	324	294	275	275
query55	84	87	81	81
query56	314	316	324	316
query57	1153	1177	1099	1099
query58	296	286	276	276
query59	2524	2720	2588	2588
query60	361	346	328	328
query61	163	164	162	162
query62	809	725	673	673
query63	228	197	194	194
query64	4451	1198	844	844
query65	4020	3976	3960	3960
query66	1094	454	339	339
query67	15313	15024	14796	14796
query68	8471	904	594	594
query69	488	310	293	293
query70	1350	1279	1237	1237
query71	497	346	334	334
query72	6022	5011	5034	5011
query73	716	623	363	363
query74	8899	9205	8735	8735
query75	3920	3380	2827	2827
query76	3704	1164	743	743
query77	810	446	317	317
query78	9645	9685	8892	8892
query79	3344	804	612	612
query80	730	580	504	504
query81	524	266	227	227
query82	487	164	139	139
query83	302	276	253	253
query84	309	122	89	89
query85	936	483	442	442
query86	394	301	293	293
query87	3803	3764	3675	3675
query88	3438	2250	2226	2226
query89	442	328	302	302
query90	1983	226	229	226
query91	188	170	140	140
query92	82	74	63	63
query93	2919	990	628	628
query94	687	459	339	339
query95	411	314	318	314
query96	486	587	277	277
query97	2969	2984	2864	2864
query98	238	221	213	213
query99	1486	1429	1299	1299
Total cold run time: 282808 ms
Total hot run time: 190019 ms

@doris-robot

ClickBench: Total hot run time: 27.51 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ab0b741e3d9068e12730f475e78f1e4023d62313, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.05	0.05
query2	0.09	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.12	0.12
query5	0.27	0.27	0.25
query6	1.18	0.64	0.64
query7	0.03	0.02	0.03
query8	0.06	0.04	0.04
query9	0.59	0.52	0.51
query10	0.57	0.57	0.58
query11	0.16	0.12	0.11
query12	0.15	0.12	0.12
query13	0.63	0.61	0.61
query14	1.00	1.00	0.99
query15	0.86	0.84	0.83
query16	0.40	0.40	0.39
query17	1.02	1.04	1.01
query18	0.22	0.20	0.20
query19	1.91	1.80	1.78
query20	0.01	0.02	0.01
query21	15.45	0.21	0.13
query22	4.93	0.07	0.04
query23	15.71	0.26	0.10
query24	3.52	0.46	0.75
query25	0.08	0.06	0.05
query26	0.15	0.13	0.14
query27	0.06	0.06	0.05
query28	5.21	1.11	0.92
query29	12.58	3.87	3.27
query30	0.27	0.13	0.11
query31	2.81	0.58	0.39
query32	3.23	0.55	0.46
query33	3.19	3.01	3.02
query34	15.86	5.20	4.59
query35	4.57	4.56	4.63
query36	0.67	0.51	0.50
query37	0.09	0.07	0.07
query38	0.06	0.04	0.03
query39	0.04	0.03	0.03
query40	0.18	0.14	0.14
query41	0.08	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 99.88 s
Total hot run time: 27.51 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 78.57% (55/70) 🎉
Increment coverage report
Complete coverage report

@github-actions
Contributor

github-actions bot commented Nov 4, 2025

PR approved by anyone and no changes requested.

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 78.57% (55/70) 🎉
Increment coverage report
Complete coverage report

  this.defaultValueExprDef = defaultValueExprDef;
- this.comment = comment;
- this.stats = new ColumnStats();
+ this.comment = StringUtils.isBlank(comment) ? null : comment;
Contributor

You can use Strings.emptyToNull here.
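One point worth checking before making that swap: Guava's `Strings.emptyToNull` returns null only for null or the empty string, while Commons Lang's `StringUtils.isBlank` is also true for whitespace-only input. The sketch below uses plain-Java stand-ins for both (the local methods `emptyToNull` and `blankToNull` are written for this example and are not the library implementations) to show where the two diverge. It assumes Java 11+ for `String.isBlank()`.

```java
// Plain-Java stand-ins that mirror the documented behavior of the two library calls.
final class BlankVsEmpty {
    // Mirrors Strings.emptyToNull: null or "" becomes null, anything else is kept.
    static String emptyToNull(String s) {
        return (s == null || s.isEmpty()) ? null : s;
    }

    // Mirrors `StringUtils.isBlank(s) ? null : s`: whitespace-only also becomes null.
    static String blankToNull(String s) {
        return (s == null || s.isBlank()) ? null : s;
    }

    public static void main(String[] args) {
        String whitespaceOnly = "  ";
        System.out.println(emptyToNull("") == null);             // true
        System.out.println(blankToNull("") == null);             // true
        System.out.println(emptyToNull(whitespaceOnly) == null); // false: the string is kept
        System.out.println(blankToNull(whitespaceOnly) == null); // true
    }
}
```

If comments can never be whitespace-only, the two are interchangeable here; otherwise the blank check drops slightly more input.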

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Nov 5, 2025
@github-actions
Contributor

github-actions bot commented Nov 5, 2025

PR approved by at least one committer and no changes requested.

@morrySnow morrySnow merged commit 7adc3be into apache:master Nov 6, 2025
29 checks passed
@morrySnow morrySnow deleted the compress_column_mem branch November 6, 2025 11:27
github-actions bot pushed a commit that referenced this pull request Nov 6, 2025
remove useless colstat and let empty attribute to be null to reduce the
memory footprint of Column
yiguolei pushed a commit that referenced this pull request Nov 8, 2025
… (#57766)

Cherry-picked from #57401

Co-authored-by: morrySnow <zhangwenxin@selectdb.com>
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Nov 18, 2025
remove useless colstat and let empty attribute to be null to reduce the
memory footprint of Column
@yiguolei yiguolei mentioned this pull request Dec 2, 2025

Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.2-merged, reviewed

Projects

None yet

6 participants