[feature](iceberg) Support Partition Evolution DDL for Iceberg Tables #57972

Merged
morningman merged 13 commits into apache:master from suxiaogang223:iceberg_partition_evolution_ddl
Nov 26, 2025

@suxiaogang223
Contributor

@suxiaogang223 suxiaogang223 commented Nov 13, 2025

What problem does this PR solve?

Summary

This PR implements support for partition evolution in Iceberg tables, allowing users to dynamically modify table partition strategies without rewriting data files. This is a metadata-only operation that maintains multiple partition spec versions.

Background

Apache Iceberg supports partition evolution, which enables changing partition strategies on existing tables without data migration. Doris, as a query engine for Iceberg, needs to support SQL syntax for partition evolution operations to provide users with flexible partition management.

Features

Core Functionality

  • Add Partition Field: Add a new partition field to the table's current partition spec
  • Drop Partition Field: Remove a partition field from the current partition spec
  • Replace Partition Field: Replace an existing partition field in the current partition spec with a new partition field

Design Principles

  1. Metadata-only operation: Partition evolution updates only table metadata; no data files are rewritten
  2. Backward compatibility: Existing data files keep the partition spec they were written with; newly written data uses the latest spec
  3. Syntax compatibility: Modeled on the Iceberg Spark SQL ALTER TABLE syntax for consistency (Doris uses PARTITION KEY where Spark uses PARTITION FIELD)
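
To illustrate the first two principles, here is a minimal Python sketch of how a metadata-only change can keep several partition spec versions side by side. The `Table` class and its method names are hypothetical and exist only for this illustration; they are not the Doris or Iceberg data structures, and real Iceberg spec-id assignment is more involved than "always allocate the next id".

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy model of table metadata (hypothetical, for illustration only)."""
    specs: dict = field(default_factory=lambda: {0: ()})  # spec id -> partition field names
    current_spec_id: int = 0
    files: list = field(default_factory=list)  # (path, spec id) pairs

    def _commit_spec(self, fields):
        # Each change registers a new spec version; older versions stay readable.
        new_id = max(self.specs) + 1
        self.specs[new_id] = tuple(fields)
        self.current_spec_id = new_id

    def add_partition_key(self, name):
        self._commit_spec(self.specs[self.current_spec_id] + (name,))

    def drop_partition_key(self, name):
        current = self.specs[self.current_spec_id]
        self._commit_spec(f for f in current if f != name)

    def write_file(self, path):
        # New files are tagged with whichever spec is current at write time.
        self.files.append((path, self.current_spec_id))


table = Table()
table.write_file("a.parquet")      # written while unpartitioned (spec 0)
table.add_partition_key("ts_day")  # metadata-only: no existing file is touched
table.write_file("b.parquet")      # picks up the new spec
```

In this toy model the first file stays tagged with spec 0 after the evolution, which is the property that lets historical data remain readable without a rewrite.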

Syntax

Add Partition Field

-- use optional AS keyword to specify a custom name for the partition field
ALTER TABLE table_name ADD PARTITION KEY partition_transform [AS key_name];

-- example
ALTER TABLE prod.db.sample ADD PARTITION KEY bucket(16, id);
ALTER TABLE prod.db.sample ADD PARTITION KEY truncate(4, data);
ALTER TABLE prod.db.sample ADD PARTITION KEY year(ts);
-- use optional AS keyword to specify a custom name for the partition field 
ALTER TABLE prod.db.sample ADD PARTITION KEY bucket(16, id) AS shard;

Drop Partition Field

ALTER TABLE table_name DROP PARTITION KEY partition_transform|key_name;

-- example
ALTER TABLE prod.db.sample DROP PARTITION KEY catalog;
ALTER TABLE prod.db.sample DROP PARTITION KEY bucket(16, id);
ALTER TABLE prod.db.sample DROP PARTITION KEY truncate(4, data);
ALTER TABLE prod.db.sample DROP PARTITION KEY year(ts);
ALTER TABLE prod.db.sample DROP PARTITION KEY shard;

Replace Partition Field

-- use optional AS keyword to specify a custom name for the new partition field
ALTER TABLE table_name REPLACE PARTITION KEY old_key_name WITH partition_transform [AS new_key_name];

-- example
ALTER TABLE prod.db.sample REPLACE PARTITION KEY ts_day WITH day(ts);
-- use optional AS keyword to specify a custom name for the new partition field 
ALTER TABLE prod.db.sample REPLACE PARTITION KEY ts_day WITH day(ts) AS day_of_ts;
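
All three statements share the same `partition_transform` placeholder. As a rough illustration of what a parser has to extract from it, the Python sketch below splits a transform expression into a (transform, parameter, column) triple. The function name `parse_transform` and the regular expressions are assumptions made for this example; the PR itself extends the ANTLR grammar in DorisParser.g4, which this sketch does not reproduce.

```python
import re

PARAM_TRANSFORMS = {"bucket", "truncate"}          # take a numeric first argument
TIME_TRANSFORMS = {"year", "month", "day", "hour"}  # take only a column

_CALL = re.compile(r"^\s*(\w+)\s*\(\s*(?:(\d+)\s*,\s*)?(\w+)\s*\)\s*$")
_IDENT = re.compile(r"^\s*(\w+)\s*$")


def parse_transform(text):
    """Return (transform, parameter, column) for one partition_transform."""
    m = _IDENT.match(text)
    if m:
        # A bare column name means the identity transform.
        return ("identity", None, m.group(1))
    m = _CALL.match(text)
    if not m:
        raise ValueError(f"not a partition transform: {text!r}")
    name, param, column = m.group(1).lower(), m.group(2), m.group(3)
    if name in PARAM_TRANSFORMS and param is not None:
        return (name, int(param), column)
    if name in TIME_TRANSFORMS and param is None:
        return (name, None, column)
    raise ValueError(f"unsupported partition transform: {text!r}")
```

For example, `parse_transform("bucket(16, id)")` yields `("bucket", 16, "id")`, while a malformed input such as `bucket(id)` raises `ValueError`.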

Supported Partition Transforms

| Transform | Syntax | Example |
|-----------|--------|---------|
| bucket | bucket(N, column) | bucket(16, id) |
| truncate | truncate(N, column) | truncate(10, name) |
| year | year(column) | year(ts) |
| month | month(column) | month(ts) |
| day | day(column) | day(ts) |
| hour | hour(column) | hour(ts) |
| identity | column | category |
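
For readers unfamiliar with what these transforms compute, the following Python sketch mirrors the value mappings described in the Iceberg table spec for the time, truncate, and identity transforms. It is an illustration only, not code from this PR: timestamps are assumed to be naive UTC `datetime` values, and `bucket` is left out because it depends on a 32-bit Murmur3 hash that is not reproduced here.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def year(ts):
    # Years since 1970.
    return ts.year - 1970


def month(ts):
    # Months since 1970-01.
    return (ts.year - 1970) * 12 + (ts.month - 1)


def day(ts):
    # Days since 1970-01-01 (timedelta.days floors for pre-epoch values).
    return (ts - EPOCH).days


def hour(ts):
    # Hours since 1970-01-01 00:00.
    return int((ts - EPOCH).total_seconds() // 3600)


def truncate(width, value):
    # Strings keep their first `width` characters; integers round down
    # to a multiple of `width`.
    if isinstance(value, str):
        return value[:width]
    return value - (value % width)


def identity(value):
    return value
```

Rows whose transformed values are equal land in the same partition, so `year(ts)` groups a whole year of data together while `hour(ts)` produces one partition value per hour.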

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Nov 13, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@suxiaogang223 suxiaogang223 force-pushed the iceberg_partition_evolution_ddl branch from 3c485cb to 7d998d1 Compare November 13, 2025 02:46
@suxiaogang223
Contributor Author

run buildall

Contributor

@zddr zddr left a comment

Please use regression-test/suites/mtmv_p0/test_iceberg_mtmv.groovy as a reference and add some partition evolution cases that validate MTMV behavior.
For example:

  1. Keeping the same partition column but switching the transform from day to year should be allowed. The materialized view should refresh normally and generate the correct partitions.
  2. Changing the partition column from c1 to c2 should cause the materialized view refresh to fail.
  3. Switching from year to identity should cause the materialized view refresh to fail.
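
The three expectations above can be summarized as a small predicate. The Python sketch below is only one reading of the reviewer's cases, not the validation logic Doris actually runs: the function name is hypothetical, and extending case 1 from "day to year" to any switch between time transforms is an assumption.

```python
TIME_TRANSFORMS = {"year", "month", "day", "hour"}


def mtmv_refresh_expected_ok(old, new):
    """old and new are (transform, column) pairs for the tracked partition field."""
    old_transform, old_column = old
    new_transform, new_column = new
    if old_column != new_column:
        return False  # case 2: the partition column itself changed
    if old_transform == new_transform:
        return True   # nothing relevant changed
    # Cases 1 and 3: only a switch between time transforms is expected to work
    # (generalizing beyond day -> year is an assumption of this sketch).
    return old_transform in TIME_TRANSFORMS and new_transform in TIME_TRANSFORMS
```

A regression suite could assert this predicate's outcome against the actual refresh result for each evolution step.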

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34247 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9c6acdd0dc72c1aeedc6b9c4b2b2aa32379c8fef, data reload: false

------ Round 1 ----------------------------------
q1	17666	5199	5277	5199
q2	2006	345	202	202
q3	10224	1290	722	722
q4	10238	924	368	368
q5	7613	2313	2373	2313
q6	187	172	132	132
q7	899	777	620	620
q8	9342	1327	1118	1118
q9	6775	5146	5167	5146
q10	6818	2250	1803	1803
q11	493	298	295	295
q12	328	371	226	226
q13	17780	3664	3026	3026
q14	226	228	211	211
q15	561	502	500	500
q16	1030	986	958	958
q17	584	873	373	373
q18	7460	7216	7007	7007
q19	1077	953	566	566
q20	352	343	223	223
q21	3796	3238	2271	2271
q22	1064	1004	968	968
Total cold run time: 106519 ms
Total hot run time: 34247 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5182	5129	5171	5129
q2	247	321	231	231
q3	2181	2668	2315	2315
q4	1381	1762	1313	1313
q5	4231	4429	4642	4429
q6	228	180	136	136
q7	2019	2056	1808	1808
q8	2643	2581	2599	2581
q9	7299	7342	7224	7224
q10	3052	3375	2867	2867
q11	583	524	509	509
q12	710	787	658	658
q13	3555	4049	3349	3349
q14	289	334	291	291
q15	560	514	496	496
q16	1084	1130	1097	1097
q17	1183	1588	1385	1385
q18	7982	7551	7548	7548
q19	794	746	810	746
q20	2025	2038	1996	1996
q21	5007	4521	4402	4402
q22	1072	1031	1013	1013
Total cold run time: 53307 ms
Total hot run time: 51523 ms

@doris-robot

TPC-DS: Total hot run time: 188470 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9c6acdd0dc72c1aeedc6b9c4b2b2aa32379c8fef, data reload: false

query1	1041	411	433	411
query2	6555	1710	1689	1689
query3	6750	234	230	230
query4	26584	23719	23553	23553
query5	4567	640	469	469
query6	333	235	235	235
query7	4647	499	294	294
query8	301	249	246	246
query9	8734	2606	2613	2606
query10	526	357	295	295
query11	15752	15093	15093	15093
query12	179	124	114	114
query13	1674	554	430	430
query14	11284	9382	9444	9382
query15	202	189	171	171
query16	7628	688	515	515
query17	1220	759	625	625
query18	2055	411	318	318
query19	214	203	173	173
query20	129	130	125	125
query21	233	131	112	112
query22	3987	3938	3998	3938
query23	33953	33089	32862	32862
query24	8476	2451	2402	2402
query25	601	514	450	450
query26	1238	270	160	160
query27	2757	496	343	343
query28	4371	2274	2211	2211
query29	776	605	484	484
query30	298	225	202	202
query31	911	809	727	727
query32	86	72	76	72
query33	607	379	327	327
query34	795	860	519	519
query35	828	821	749	749
query36	959	1006	938	938
query37	122	109	93	93
query38	3566	3600	3457	3457
query39	1486	1414	1440	1414
query40	224	127	132	127
query41	64	60	62	60
query42	131	116	113	113
query43	498	501	482	482
query44	1238	754	740	740
query45	192	186	175	175
query46	881	995	644	644
query47	1751	1795	1707	1707
query48	391	426	324	324
query49	811	519	427	427
query50	648	693	418	418
query51	3898	3943	3852	3852
query52	116	111	107	107
query53	247	273	208	208
query54	323	307	295	295
query55	95	88	90	88
query56	340	330	343	330
query57	1183	1207	1116	1116
query58	299	286	277	277
query59	2513	2719	2642	2642
query60	362	353	348	348
query61	209	189	184	184
query62	797	728	683	683
query63	241	199	200	199
query64	4603	1261	975	975
query65	4060	3951	4006	3951
query66	1127	469	356	356
query67	15325	15367	15101	15101
query68	8295	931	603	603
query69	497	346	290	290
query70	1419	1213	1276	1213
query71	460	346	325	325
query72	6227	4946	4917	4917
query73	642	601	364	364
query74	8831	9147	8652	8652
query75	3396	3326	2813	2813
query76	3381	1166	739	739
query77	588	406	324	324
query78	9458	9581	8801	8801
query79	2744	834	615	615
query80	700	574	505	505
query81	509	263	240	240
query82	485	166	139	139
query83	301	263	246	246
query84	298	119	101	101
query85	916	494	442	442
query86	388	312	301	301
query87	3838	3752	3620	3620
query88	3491	2235	2221	2221
query89	407	340	298	298
query90	1979	220	217	217
query91	165	170	137	137
query92	87	70	69	69
query93	2089	976	648	648
query94	739	447	336	336
query95	411	319	319	319
query96	487	575	279	279
query97	2944	2975	2840	2840
query98	246	224	219	219
query99	1457	1384	1274	1274
Total cold run time: 276867 ms
Total hot run time: 188470 ms

@doris-robot

ClickBench: Total hot run time: 27.43 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9c6acdd0dc72c1aeedc6b9c4b2b2aa32379c8fef, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.08	0.08
query4	1.61	0.11	0.11
query5	0.26	0.25	0.26
query6	1.19	0.66	0.62
query7	0.03	0.03	0.02
query8	0.05	0.04	0.04
query9	0.57	0.53	0.52
query10	0.57	0.58	0.57
query11	0.16	0.11	0.12
query12	0.15	0.11	0.11
query13	0.62	0.60	0.61
query14	0.99	1.00	1.00
query15	0.85	0.83	0.83
query16	0.39	0.39	0.39
query17	1.05	1.00	1.02
query18	0.21	0.20	0.20
query19	1.90	1.78	1.79
query20	0.02	0.01	0.02
query21	15.42	0.19	0.12
query22	5.00	0.08	0.05
query23	15.69	0.27	0.10
query24	2.70	1.15	0.40
query25	0.08	0.06	0.05
query26	0.15	0.14	0.14
query27	0.07	0.06	0.05
query28	4.97	1.14	0.92
query29	12.60	3.86	3.20
query30	0.29	0.13	0.12
query31	2.81	0.57	0.38
query32	3.23	0.56	0.48
query33	3.01	3.10	3.12
query34	15.88	5.14	4.53
query35	4.56	4.60	4.66
query36	0.68	0.51	0.49
query37	0.10	0.06	0.07
query38	0.08	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.08	0.03	0.03
query42	0.04	0.04	0.03
query43	0.04	0.03	0.03
Total cold run time: 98.74 s
Total hot run time: 27.43 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 26.05% (62/238) 🎉
Increment coverage report
Complete coverage report

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34489 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 942fe3532a0c11b75232c7e5d4ce0d5d094076bd, data reload: false

------ Round 1 ----------------------------------
q1	17598	5251	5198	5198
q2	2027	358	207	207
q3	10208	1400	747	747
q4	10227	889	372	372
q5	7475	2351	2370	2351
q6	186	170	134	134
q7	904	771	630	630
q8	9345	1367	1119	1119
q9	6862	5186	5095	5095
q10	6836	2221	1824	1824
q11	506	311	284	284
q12	333	375	244	244
q13	17780	3655	3074	3074
q14	226	227	212	212
q15	569	519	514	514
q16	1049	1014	933	933
q17	575	860	366	366
q18	7401	7237	6995	6995
q19	1143	949	597	597
q20	359	357	230	230
q21	3945	3239	2382	2382
q22	1072	1052	981	981
Total cold run time: 106626 ms
Total hot run time: 34489 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5203	5099	5161	5099
q2	253	335	237	237
q3	2170	2719	2343	2343
q4	1357	1799	1327	1327
q5	4191	4460	4659	4460
q6	214	179	135	135
q7	2078	2003	1822	1822
q8	2613	2583	2532	2532
q9	7343	7348	7345	7345
q10	3151	3246	2865	2865
q11	579	527	515	515
q12	703	771	651	651
q13	3575	3966	3348	3348
q14	286	295	282	282
q15	547	515	501	501
q16	1073	1154	1062	1062
q17	1244	1594	1394	1394
q18	7932	7736	7528	7528
q19	833	827	840	827
q20	2058	2057	1913	1913
q21	4893	4401	4313	4313
q22	1091	1025	1013	1013
Total cold run time: 53387 ms
Total hot run time: 51512 ms

@doris-robot

TPC-DS: Total hot run time: 187845 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 942fe3532a0c11b75232c7e5d4ce0d5d094076bd, data reload: false

query1	1026	399	407	399
query2	6555	1716	1714	1714
query3	6758	232	232	232
query4	26357	23666	23252	23252
query5	4436	617	477	477
query6	359	243	245	243
query7	4642	504	327	327
query8	316	261	255	255
query9	8720	2612	2621	2612
query10	489	351	291	291
query11	15223	15106	14951	14951
query12	197	118	112	112
query13	1680	574	454	454
query14	10843	9326	9267	9267
query15	204	188	175	175
query16	7759	677	452	452
query17	1229	748	613	613
query18	2028	416	313	313
query19	207	200	170	170
query20	127	125	122	122
query21	212	131	113	113
query22	4258	4075	3954	3954
query23	33907	32945	32875	32875
query24	8534	2413	2411	2411
query25	580	512	444	444
query26	1229	275	159	159
query27	2758	506	352	352
query28	4346	2213	2203	2203
query29	770	611	494	494
query30	301	226	200	200
query31	921	778	700	700
query32	86	75	74	74
query33	595	377	320	320
query34	796	867	523	523
query35	825	838	756	756
query36	970	1008	890	890
query37	117	108	87	87
query38	3541	3565	3471	3471
query39	1465	1407	1423	1407
query40	224	137	118	118
query41	64	61	61	61
query42	137	119	114	114
query43	490	519	463	463
query44	1230	743	745	743
query45	184	183	172	172
query46	873	987	630	630
query47	1752	1787	1736	1736
query48	383	416	335	335
query49	798	469	403	403
query50	656	689	422	422
query51	3931	3893	3965	3893
query52	111	120	102	102
query53	233	268	198	198
query54	303	303	275	275
query55	86	89	85	85
query56	326	305	318	305
query57	1192	1202	1140	1140
query58	292	273	268	268
query59	2625	2787	2615	2615
query60	348	340	333	333
query61	161	159	151	151
query62	805	719	695	695
query63	232	197	198	197
query64	4438	1158	864	864
query65	4100	3957	3926	3926
query66	1097	438	337	337
query67	15189	15030	14817	14817
query68	4647	961	617	617
query69	506	332	293	293
query70	1341	1277	1286	1277
query71	410	349	336	336
query72	6111	5217	5225	5217
query73	659	593	361	361
query74	9173	8789	8690	8690
query75	3324	3319	2818	2818
query76	3320	1143	746	746
query77	518	403	322	322
query78	9574	9763	8812	8812
query79	1875	827	594	594
query80	715	561	491	491
query81	522	262	236	236
query82	240	159	130	130
query83	278	261	240	240
query84	252	111	91	91
query85	875	484	433	433
query86	380	301	303	301
query87	3720	3657	3589	3589
query88	2845	2317	2286	2286
query89	387	325	289	289
query90	1867	217	213	213
query91	171	165	138	138
query92	78	68	68	68
query93	1319	990	645	645
query94	673	458	344	344
query95	405	326	310	310
query96	482	560	282	282
query97	2895	2950	2897	2897
query98	234	211	203	203
query99	1313	1406	1314	1314
Total cold run time: 268735 ms
Total hot run time: 187845 ms

@doris-robot

ClickBench: Total hot run time: 27.51 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 942fe3532a0c11b75232c7e5d4ce0d5d094076bd, data reload: false

query1	0.05	0.04	0.05
query2	0.10	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.11	0.11
query5	0.27	0.26	0.25
query6	1.18	0.64	0.65
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.57	0.53	0.52
query10	0.57	0.58	0.58
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.62	0.62	0.60
query14	1.00	1.01	1.00
query15	0.86	0.83	0.82
query16	0.40	0.39	0.38
query17	1.01	1.05	1.02
query18	0.21	0.20	0.20
query19	1.87	1.84	1.79
query20	0.02	0.01	0.02
query21	15.46	0.22	0.13
query22	4.99	0.07	0.05
query23	15.65	0.26	0.10
query24	3.15	0.88	0.48
query25	0.08	0.06	0.06
query26	0.14	0.14	0.13
query27	0.06	0.05	0.06
query28	4.97	1.14	0.94
query29	12.64	3.85	3.20
query30	0.28	0.13	0.11
query31	2.81	0.61	0.38
query32	3.24	0.55	0.46
query33	3.08	3.04	3.05
query34	15.87	5.16	4.54
query35	4.54	4.63	4.62
query36	0.66	0.50	0.48
query37	0.10	0.07	0.06
query38	0.06	0.04	0.04
query39	0.03	0.02	0.03
query40	0.18	0.14	0.14
query41	0.08	0.03	0.03
query42	0.03	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 99.13 s
Total hot run time: 27.51 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 26.02% (64/246) 🎉
Increment coverage report
Complete coverage report

@suxiaogang223 suxiaogang223 force-pushed the iceberg_partition_evolution_ddl branch 2 times, most recently from 57da735 to 26b5aef Compare November 19, 2025 03:28
@suxiaogang223 suxiaogang223 force-pushed the iceberg_partition_evolution_ddl branch from 26b5aef to f2cf736 Compare November 19, 2025 03:29
@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34708 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 972835f16c625985e97ceb9bdb1f5e28db0fcc40, data reload: false

------ Round 1 ----------------------------------
q1	17620	5042	4914	4914
q2	2075	320	203	203
q3	10216	1323	760	760
q4	10260	988	377	377
q5	7529	2415	2378	2378
q6	185	172	140	140
q7	916	768	632	632
q8	9359	1342	1188	1188
q9	7127	5421	5374	5374
q10	6823	2249	1820	1820
q11	502	308	284	284
q12	349	377	231	231
q13	17784	3709	3050	3050
q14	249	234	212	212
q15	580	520	510	510
q16	996	1000	950	950
q17	577	898	378	378
q18	7385	7175	7125	7125
q19	1230	975	578	578
q20	362	370	258	258
q21	4076	3267	2366	2366
q22	1068	1021	980	980
Total cold run time: 107268 ms
Total hot run time: 34708 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4985	4907	5211	4907
q2	338	402	328	328
q3	2190	2681	2337	2337
q4	1338	1752	1307	1307
q5	4242	4560	4563	4560
q6	213	174	138	138
q7	2099	1984	1789	1789
q8	2752	2614	2641	2614
q9	7600	7503	7450	7450
q10	3109	3320	2812	2812
q11	614	549	510	510
q12	722	779	610	610
q13	3604	4242	3218	3218
q14	295	306	283	283
q15	569	523	567	523
q16	1061	1125	1067	1067
q17	1197	1536	1413	1413
q18	7866	7856	7557	7557
q19	834	866	1013	866
q20	2042	2056	1983	1983
q21	4969	4508	4218	4218
q22	1056	1007	999	999
Total cold run time: 53695 ms
Total hot run time: 51489 ms

@doris-robot

TPC-DS: Total hot run time: 187048 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 972835f16c625985e97ceb9bdb1f5e28db0fcc40, data reload: false

query1	1067	454	398	398
query2	6556	1724	1668	1668
query3	6758	223	219	219
query4	26057	23252	22697	22697
query5	4435	649	471	471
query6	342	234	218	218
query7	4656	501	299	299
query8	322	247	273	247
query9	8706	2607	2589	2589
query10	485	327	289	289
query11	15744	15151	14853	14853
query12	172	113	109	109
query13	1680	576	437	437
query14	10244	9118	9110	9110
query15	196	181	173	173
query16	7148	681	530	530
query17	965	764	622	622
query18	1978	447	324	324
query19	213	198	174	174
query20	129	125	124	124
query21	225	133	112	112
query22	3957	3983	3994	3983
query23	34154	33180	32906	32906
query24	8522	2413	2392	2392
query25	622	531	457	457
query26	1232	279	164	164
query27	2752	486	354	354
query28	4433	2242	2212	2212
query29	872	608	500	500
query30	305	225	201	201
query31	905	795	689	689
query32	84	73	71	71
query33	587	368	338	338
query34	797	867	522	522
query35	796	848	737	737
query36	970	983	903	903
query37	125	112	87	87
query38	3554	3569	3442	3442
query39	1478	1417	1412	1412
query40	222	126	123	123
query41	66	62	66	62
query42	133	117	110	110
query43	490	509	464	464
query44	1252	780	768	768
query45	191	181	180	180
query46	874	989	642	642
query47	1764	1785	1722	1722
query48	402	430	330	330
query49	772	488	411	411
query50	666	687	413	413
query51	4063	3932	3828	3828
query52	113	108	107	107
query53	243	267	197	197
query54	323	312	307	307
query55	92	89	86	86
query56	345	338	338	338
query57	1170	1196	1125	1125
query58	309	295	286	286
query59	2587	2664	2476	2476
query60	381	376	355	355
query61	199	202	201	201
query62	820	732	672	672
query63	238	200	194	194
query64	4669	1294	1000	1000
query65	4031	3980	3985	3980
query66	1201	454	353	353
query67	15272	15014	14920	14920
query68	8242	950	637	637
query69	506	333	298	298
query70	1294	1297	1289	1289
query71	459	362	329	329
query72	6053	4997	4962	4962
query73	634	617	370	370
query74	8874	9025	8765	8765
query75	3568	3311	2843	2843
query76	3324	1138	754	754
query77	646	412	305	305
query78	9540	9730	8794	8794
query79	2243	843	610	610
query80	718	578	497	497
query81	511	274	233	233
query82	204	162	142	142
query83	272	270	256	256
query84	304	120	92	92
query85	879	482	457	457
query86	380	321	303	303
query87	3742	3683	3572	3572
query88	3285	2201	2205	2201
query89	374	322	293	293
query90	2008	216	204	204
query91	180	167	145	145
query92	88	71	66	66
query93	2242	1038	683	683
query94	706	462	349	349
query95	399	325	312	312
query96	486	568	279	279
query97	2887	2953	2851	2851
query98	274	216	206	206
query99	1269	1408	1268	1268
Total cold run time: 273702 ms
Total hot run time: 187048 ms

Copilot AI left a comment

Pull request overview

This PR implements support for partition evolution in Apache Iceberg tables, enabling users to dynamically modify table partition strategies through DDL operations (ADD/DROP/REPLACE PARTITION KEY) without rewriting data files. The implementation follows Spark SQL syntax for consistency and maintains backward compatibility with historical data.

Key Changes:

  • Added three new ALTER TABLE operations for partition field management (ADD/DROP/REPLACE)
  • Extended grammar with partition transform expressions supporting bucket, truncate, year, month, day, hour, and identity transforms
  • Implemented Iceberg-specific metadata operations through IcebergMetadataOps
  • Added comprehensive test coverage including DDL tests, query/write tests, and MTMV compatibility tests

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 6 comments.

| File | Description |
|------|-------------|
| DorisParser.g4 | Added grammar rules for ADD/DROP/REPLACE PARTITION KEY with partition transform expressions |
| LogicalPlanBuilder.java | Implemented parser logic to extract partition field information and create operation objects |
| Add/Drop/ReplacePartitionFieldOp.java | Nereids operation classes for partition evolution with validation and SQL generation |
| Add/Drop/ReplacePartitionFieldClause.java | Analysis clause classes for legacy planner compatibility |
| AlterTableCommand.java | Extended to support partition field operations for external tables |
| Alter.java | Added validation to reject partition field operations on internal tables and handle external table operations |
| IcebergMetadataOps.java | Core implementation using Iceberg's UpdatePartitionSpec API with transform support |
| IcebergExternalCatalog.java | Added public methods to invoke partition evolution operations |
| IcebergExternalTable.java | Removed caching mechanism from isValidRelatedTable() to handle partition evolution |
| AlterOpType.java | Added three new operation types for partition evolution |
| test_iceberg_partition_evolution_*.groovy | Comprehensive test suites covering DDL, query/write, and MTMV scenarios |

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34838 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ef1f5d2b10dd182d10bdaeb28b824376cb4f6c29, data reload: false

------ Round 1 ----------------------------------
q1	17587	5100	4930	4930
q2	2097	315	205	205
q3	10209	1260	694	694
q4	10237	893	359	359
q5	7533	2390	2327	2327
q6	185	171	137	137
q7	925	764	632	632
q8	9352	1313	1031	1031
q9	7080	5376	5323	5323
q10	6837	2209	1814	1814
q11	493	295	275	275
q12	327	364	233	233
q13	17786	3619	3002	3002
q14	226	239	211	211
q15	571	509	506	506
q16	1023	987	943	943
q17	589	890	359	359
q18	7535	7596	8087	7596
q19	1382	971	613	613
q20	371	355	246	246
q21	4077	3293	2375	2375
q22	1088	1098	1027	1027
Total cold run time: 107510 ms
Total hot run time: 34838 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5348	5139	5210	5139
q2	338	412	339	339
q3	2398	2837	2487	2487
q4	1438	1939	1449	1449
q5	4568	4494	4382	4382
q6	216	162	123	123
q7	2063	2020	1732	1732
q8	2697	2662	2497	2497
q9	7567	7671	7426	7426
q10	3081	3252	2817	2817
q11	564	506	474	474
q12	619	696	544	544
q13	3223	3636	3016	3016
q14	268	274	255	255
q15	533	479	476	476
q16	1008	1062	1004	1004
q17	1088	1512	1298	1298
q18	7301	7116	6986	6986
q19	776	756	880	756
q20	1914	1946	1806	1806
q21	4636	4332	4329	4329
q22	1065	1049	972	972
Total cold run time: 52709 ms
Total hot run time: 50307 ms

@doris-robot

TPC-DS: Total hot run time: 187181 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ef1f5d2b10dd182d10bdaeb28b824376cb4f6c29, data reload: false

query1	1055	433	386	386
query2	6565	1674	1669	1669
query3	6753	224	219	219
query4	26587	23097	22870	22870
query5	5039	629	462	462
query6	334	225	219	219
query7	4650	490	296	296
query8	318	256	238	238
query9	8732	2561	2562	2561
query10	513	354	286	286
query11	15374	15325	14979	14979
query12	196	117	112	112
query13	1685	588	463	463
query14	11560	9369	9223	9223
query15	240	187	177	177
query16	7764	673	509	509
query17	1633	804	633	633
query18	2045	434	335	335
query19	271	211	181	181
query20	132	127	120	120
query21	220	136	117	117
query22	4056	4117	4112	4112
query23	34009	33210	32925	32925
query24	8412	2440	2430	2430
query25	620	550	484	484
query26	1229	274	176	176
query27	2685	505	373	373
query28	4345	2220	2188	2188
query29	816	655	525	525
query30	303	227	198	198
query31	925	818	726	726
query32	88	78	76	76
query33	600	386	399	386
query34	785	836	518	518
query35	803	848	744	744
query36	919	974	909	909
query37	120	108	88	88
query38	3584	3528	3476	3476
query39	1480	1396	1486	1396
query40	225	128	117	117
query41	66	61	62	61
query42	129	113	109	109
query43	476	495	477	477
query44	1231	762	776	762
query45	198	172	166	166
query46	889	984	640	640
query47	1733	1764	1680	1680
query48	393	420	347	347
query49	771	487	407	407
query50	643	670	419	419
query51	4013	3986	3961	3961
query52	111	110	102	102
query53	242	269	192	192
query54	299	291	266	266
query55	86	85	81	81
query56	337	304	304	304
query57	1217	1210	1114	1114
query58	290	271	266	266
query59	2466	2561	2527	2527
query60	359	336	316	316
query61	204	159	159	159
query62	766	721	675	675
query63	226	202	197	197
query64	4419	1156	867	867
query65	4067	3920	3942	3920
query66	1086	450	324	324
query67	15288	15122	14906	14906
query68	8395	896	632	632
query69	489	322	279	279
query70	1358	1244	1249	1244
query71	510	340	309	309
query72	5974	4930	4847	4847
query73	717	592	362	362
query74	8851	8872	8627	8627
query75	3986	3301	2780	2780
query76	3851	1152	751	751
query77	813	387	330	330
query78	9503	9568	8872	8872
query79	2491	816	603	603
query80	662	565	496	496
query81	521	261	226	226
query82	447	155	128	128
query83	290	265	250	250
query84	301	116	100	100
query85	900	481	452	452
query86	390	317	288	288
query87	3727	3721	3547	3547
query88	3859	2234	2246	2234
query89	387	321	292	292
query90	1929	224	222	222
query91	164	166	135	135
query92	87	64	69	64
query93	2013	986	674	674
query94	697	459	347	347
query95	403	325	316	316
query96	472	576	277	277
query97	2951	2977	2850	2850
query98	248	213	212	212
query99	1411	1377	1288	1288
Total cold run time: 278472 ms
Total hot run time: 187181 ms

@doris-robot

ClickBench: Total hot run time: 27.48 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ef1f5d2b10dd182d10bdaeb28b824376cb4f6c29, data reload: false

query1	0.05	0.05	0.04
query2	0.10	0.05	0.05
query3	0.27	0.08	0.08
query4	1.60	0.11	0.12
query5	0.28	0.25	0.25
query6	1.15	0.64	0.64
query7	0.03	0.02	0.03
query8	0.05	0.05	0.04
query9	0.59	0.52	0.51
query10	0.57	0.57	0.58
query11	0.17	0.11	0.12
query12	0.15	0.12	0.13
query13	0.62	0.59	0.61
query14	1.02	1.00	1.01
query15	0.85	0.84	0.84
query16	0.40	0.39	0.39
query17	1.01	1.02	1.05
query18	0.22	0.20	0.20
query19	1.87	1.78	1.85
query20	0.02	0.01	0.01
query21	15.46	0.19	0.12
query22	5.05	0.08	0.05
query23	15.64	0.26	0.10
query24	2.79	0.81	0.40
query25	0.07	0.07	0.06
query26	0.14	0.15	0.13
query27	0.08	0.05	0.05
query28	4.19	1.17	0.93
query29	12.54	3.94	3.23
query30	0.29	0.14	0.12
query31	2.81	0.60	0.38
query32	3.22	0.56	0.49
query33	3.08	3.12	3.10
query34	15.87	5.18	4.53
query35	4.55	4.57	4.58
query36	0.66	0.50	0.49
query37	0.10	0.07	0.07
query38	0.06	0.05	0.04
query39	0.04	0.03	0.02
query40	0.18	0.14	0.14
query41	0.08	0.03	0.04
query42	0.05	0.03	0.03
query43	0.04	0.03	0.03
Total cold run time: 98.01 s
Total hot run time: 27.48 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 34.00% (137/403) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 37.22% (150/403) 🎉
Increment coverage report
Complete coverage report

Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 26, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit e3a79e4 into apache:master Nov 26, 2025
27 of 28 checks passed
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Nov 30, 2025
…apache#57972)

This PR implements support for partition evolution in Iceberg tables,
allowing users to dynamically modify table partition strategies without
rewriting data files. This is a metadata-only operation that maintains
multiple partition spec versions.

Apache Iceberg supports partition evolution, which enables changing
partition strategies on existing tables without data migration. Doris,
as a query engine for Iceberg, needs to support SQL syntax for partition
evolution operations to provide users with flexible partition
management.

- **Add Partition Field**: Add new partition field to existing partition
specifications
- **Drop Partition Field**: Remove partition field from existing
partition specifications
- **Replace Partition Field**:Replace partition field from existing
partition specifications with new partition field

1. **Metadata-only operation**: Partition evolution only updates table
metadata; no data files are rewritten
2. **Backward compatibility**: Existing data files keep the partition spec
they were written with; new data is written with the latest spec
3. **Syntax compatibility**: Modeled on the Spark SQL ALTER TABLE syntax
for Iceberg, for consistency
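
To make the first two principles concrete, here is a minimal Python sketch of a table whose metadata holds several partition spec versions. It is a toy model, not Doris or Iceberg code: `TableMetadata`, `PartitionField`, `evolve`, and `append_file` are hypothetical names chosen for illustration, and real Iceberg metadata tracks considerably more state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PartitionField:
    name: str       # partition field name, e.g. "ts_day"
    transform: str  # transform as written in the DDL, e.g. "day(ts)"


@dataclass
class TableMetadata:
    # spec_id -> tuple of PartitionField; old specs are never deleted
    specs: dict = field(default_factory=dict)
    default_spec_id: int = 0
    # each data file remembers the spec it was written with: (path, spec_id)
    data_files: list = field(default_factory=list)

    def evolve(self, new_fields):
        """Register a new spec version; touches metadata only."""
        new_id = max(self.specs, default=-1) + 1
        self.specs[new_id] = tuple(new_fields)
        self.default_spec_id = new_id
        return new_id

    def append_file(self, path):
        """New data is always written with the current default spec."""
        self.data_files.append((path, self.default_spec_id))


table = TableMetadata()
table.evolve([PartitionField("ts_year", "year(ts)")])  # spec 0
table.append_file("data/a.parquet")
table.evolve([PartitionField("ts_day", "day(ts)")])    # spec 1
table.append_file("data/b.parquet")

# a.parquet still points at spec 0; only b.parquet uses spec 1
print(table.data_files)
```

The point of the sketch is that `evolve` never touches `data_files`: older files keep their spec id, so a reader has to plan against every spec version that still has live files.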

```sql
-- use optional AS keyword to specify a custom name for the partition field
ALTER TABLE table_name ADD PARTITION KEY partition_transform [AS key_name];

-- example
ALTER TABLE prod.db.sample ADD PARTITION KEY bucket(16, id);
ALTER TABLE prod.db.sample ADD PARTITION KEY truncate(4, data);
ALTER TABLE prod.db.sample ADD PARTITION KEY year(ts);
-- use optional AS keyword to specify a custom name for the partition field
ALTER TABLE prod.db.sample ADD PARTITION KEY bucket(16, id) AS shard;
```

```sql
ALTER TABLE table_name DROP PARTITION KEY partition_transform|key_name;

-- example
ALTER TABLE prod.db.sample DROP PARTITION KEY catalog;
ALTER TABLE prod.db.sample DROP PARTITION KEY bucket(16, id);
ALTER TABLE prod.db.sample DROP PARTITION KEY truncate(4, data);
ALTER TABLE prod.db.sample DROP PARTITION KEY year(ts);
ALTER TABLE prod.db.sample DROP PARTITION KEY shard;
```

```sql
-- use optional AS keyword to specify a custom name for the new partition field
ALTER TABLE table_name REPLACE PARTITION KEY old_key_name WITH partition_transform [AS new_key_name];

-- example
ALTER TABLE prod.db.sample REPLACE PARTITION KEY ts_day WITH day(ts);
-- use optional AS keyword to specify a custom name for the new partition field
ALTER TABLE prod.db.sample REPLACE PARTITION KEY ts_day WITH day(ts) AS day_of_ts;
```
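
The three statements above can be read as small operations on the list of partition fields in the current spec. The following Python sketch shows one plausible reading, under stated assumptions: `add_key`, `drop_key`, and `replace_key` are hypothetical helpers, the error handling is a guess, the default name used when `AS` is omitted is a placeholder (the PR text does not say how default names are derived), and REPLACE is modeled as a drop followed by an add.

```python
def add_key(spec, transform, name=None):
    """ADD PARTITION KEY transform [AS name]."""
    # Placeholder default: reuse the transform text as the field name.
    name = name or transform
    if any(existing == name for existing, _ in spec):
        raise ValueError(f"partition key already exists: {name}")
    return spec + [(name, transform)]


def drop_key(spec, target):
    """DROP PARTITION KEY transform|key_name: match on either one."""
    remaining = [(n, t) for n, t in spec if target not in (n, t)]
    if len(remaining) == len(spec):
        raise ValueError(f"no such partition key: {target}")
    return remaining


def replace_key(spec, old_name, transform, name=None):
    """REPLACE PARTITION KEY old_name WITH transform [AS name]."""
    # Assumption: a replace is a drop plus an add applied as one change.
    return add_key(drop_key(spec, old_name), transform, name)


spec = []
spec = add_key(spec, "bucket(16, id)", "shard")
spec = add_key(spec, "day(ts)", "ts_day")
spec = replace_key(spec, "ts_day", "hour(ts)", "ts_hour")
spec = drop_key(spec, "bucket(16, id)")  # dropped by transform, not by name
print(spec)
```

Each helper returns a new list instead of mutating its input, which mirrors the idea that every change produces a new spec version rather than editing an old one.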

| Transform | Syntax | Example |
|-----------|--------|---------|
| bucket | `bucket(N, column)` | `bucket(16, id)` |
| truncate | `truncate(N, column)` | `truncate(10, name)` |
| year | `year(column)` | `year(ts)` |
| month | `month(column)` | `month(ts)` |
| day | `day(column)` | `day(ts)` |
| hour | `hour(column)` | `hour(ts)` |
| identity | `column` | `category` |
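
For readers unfamiliar with the transforms in the table, this short Python sketch shows the partition value each one produces, following my understanding of the Iceberg table spec (temporal transforms count whole units since the Unix epoch, `truncate` rounds down to a multiple of the width or keeps a string prefix). The function names simply mirror the transform names and are not Doris or Iceberg APIs. `bucket` is left out because it depends on a 32-bit Murmur3 hash that is too long to reproduce here.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def truncate(width, value):
    """Strings keep their first `width` characters; integers round down."""
    if isinstance(value, str):
        return value[:width]
    # Python's % is non-negative for a positive width, so negatives round down.
    return value - (value % width)


def year(ts):
    return ts.year - 1970


def month(ts):
    return (ts.year - 1970) * 12 + (ts.month - 1)


def day(ts):
    return (ts - EPOCH).days


def hour(ts):
    return int((ts - EPOCH).total_seconds() // 3600)


def identity(value):
    return value


ts = datetime(2025, 11, 26, 15, 30, tzinfo=timezone.utc)
print(year(ts), month(ts), day(ts), hour(ts))
print(truncate(4, "iceberg"), truncate(10, 37), truncate(10, -3))
```

Rows that map to the same transform output land in the same partition, which is why switching from `year(ts)` to `day(ts)` changes how new files are laid out without changing any stored values.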

yiguolei pushed a commit that referenced this pull request Dec 2, 2025
@yiguolei yiguolei mentioned this pull request Dec 2, 2025
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…apache#57972)

@suxiaogang223 suxiaogang223 deleted the iceberg_partition_evolution_ddl branch January 17, 2026 16:02

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.2-merged reviewed

8 participants