
branch-3.0 [fix](variant) fix top-array in variant #54393

Merged
dataroaring merged 5 commits into apache:branch-3.0 from amorynan:fix-top-array-in-variant
Aug 18, 2025

@amorynan
Contributor

@amorynan amorynan commented Aug 6, 2025

What problem does this PR solve?

Backport of #54396.
This PR fixes how top-level nested array data is handled when it is inserted into a variant column.
Issue Number:
For example:

mysql> insert into sv1 values (1, '[{"a": 1, "c": 1.1}, {"b": "1"}]');

For nested objects inside an array, we maintain the association between the sibling fields. In this case a, b, and c keep the same offsets:

mysql> select * from sv1;
+------+----------------------------------------------+
| k    | v                                            |
+------+----------------------------------------------+
|    1 | {"a":[1,null],"b":[null,"1"],"c":[1.1,null]} |
+------+----------------------------------------------+
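The alignment shown above can be illustrated with a minimal Python sketch. This is not the Doris implementation; `flatten_nested_array` is a hypothetical helper that only mimics the result in the table: every key gets one slot per array element, and a missing key is filled with null so that index i in each list refers to the same element.

```python
import json


def flatten_nested_array(elements):
    """Turn a list of JSON objects into per-key lists of equal length.

    Hypothetical illustration only; it is not taken from the Doris code base.
    """
    # Collect every key that appears in any element, in first-seen order.
    keys = []
    for element in elements:
        for key in element:
            if key not in keys:
                keys.append(key)
    # One slot per element for every key; absent keys become None (JSON null).
    return {key: [element.get(key) for element in elements] for key in keys}


row = json.loads('[{"a": 1, "c": 1.1}, {"b": "1"}]')
columns = flatten_nested_array(row)
print(json.dumps(columns, sort_keys=True, separators=(",", ":")))
# prints {"a":[1,null],"b":[null,"1"],"c":[1.1,null]}
```

Because all three lists have length 2, position 0 still describes the first object and position 1 the second.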

But for a top-level array we only flatten the array and do not fill in null values, so the association between the nested fields is not maintained.
For example:

mysql> insert into sv1 values (16, '[{"a": 1, "b": 1}, {"b": 2}, {"b": 3}]');
Query OK, 1 row affected (0.15 sec)
{'label':'label_4b3ede8b449a4d9a_b55fdeebe13b741c', 'status':'VISIBLE', 'txnId':'6031'}

mysql> select v from sv1 where k=16;
+-----------------------+
| v                     |
+-----------------------+
| {"a":[1],"b":[1,2,3]} |
+-----------------------+
1 row in set (0.19 sec)
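To make the difference concrete, here is a small Python sketch contrasting the two behaviours. It is an illustration under assumptions, not Doris code: `flatten_without_padding` and `flatten_with_padding` are hypothetical names, and the padded output is what null filling would produce for this row, inferred from the earlier example rather than quoted from the PR.

```python
import json


def flatten_without_padding(elements):
    """Append each value to its key's list; absent keys are simply skipped."""
    columns = {}
    for element in elements:
        for key, value in element.items():
            columns.setdefault(key, []).append(value)
    return columns


def flatten_with_padding(elements):
    """Emit one slot per element for every key, using None where a key is absent."""
    keys = sorted({key for element in elements for key in element})
    return {key: [element.get(key) for element in elements] for key in keys}


elements = json.loads('[{"a": 1, "b": 1}, {"b": 2}, {"b": 3}]')
unpadded = flatten_without_padding(elements)
padded = flatten_with_padding(elements)

print(json.dumps(unpadded, separators=(",", ":")))  # {"a":[1],"b":[1,2,3]}
print(json.dumps(padded, separators=(",", ":")))    # {"a":[1,null,null],"b":[1,2,3]}
```

In the unpadded result the lists have different lengths, so nothing records that `a = 1` belongs to the first element. With padding, `a` and `b` share offsets again.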

close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@amorynan amorynan requested a review from dataroaring as a code owner August 6, 2025 07:34
@Thearas
Contributor

Thearas commented Aug 6, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@eldenmoon
Member

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 35.71% (30/84) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 41.90% (11139/26586)
Line Coverage 32.41% (95392/294293)
Region Coverage 31.56% (49287/156163)
Branch Coverage 28.00% (25265/90224)

@amorynan
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39921 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7377637687b7344b0cec0c66d24e87c3f62218b1, data reload: false

------ Round 1 ----------------------------------
q1	17596	7534	6708	6708
q2	2045	194	177	177
q3	10570	1115	1144	1115
q4	10231	774	740	740
q5	7757	2827	2817	2817
q6	211	132	133	132
q7	966	619	600	600
q8	9342	1948	2011	1948
q9	6684	6452	6445	6445
q10	7019	2267	2312	2267
q11	470	308	272	272
q12	495	215	216	215
q13	17782	2970	2962	2962
q14	246	210	206	206
q15	514	463	467	463
q16	486	371	377	371
q17	973	583	600	583
q18	7217	6637	6826	6637
q19	1516	1093	1039	1039
q20	471	200	207	200
q21	3989	3053	3103	3053
q22	1098	971	982	971
Total cold run time: 107678 ms
Total hot run time: 39921 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6615	6557	6572	6557
q2	340	231	243	231
q3	2955	2937	2968	2937
q4	2084	1836	1749	1749
q5	5802	5693	5764	5693
q6	211	127	128	127
q7	2241	1888	1819	1819
q8	3457	3682	3545	3545
q9	9077	9056	8843	8843
q10	3572	3568	3481	3481
q11	595	495	503	495
q12	795	618	646	618
q13	7612	3150	3160	3150
q14	292	284	274	274
q15	512	464	453	453
q16	506	463	446	446
q17	1861	1637	1657	1637
q18	8305	7688	7591	7591
q19	1662	1550	1534	1534
q20	2143	1898	1875	1875
q21	5188	5093	5096	5093
q22	1164	1039	1019	1019
Total cold run time: 66989 ms
Total hot run time: 59167 ms

@doris-robot

BE UT Coverage Report

Increment line coverage 79.76% (67/84) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 42.07% (11237/26709)
Line Coverage 32.59% (96186/295101)
Region Coverage 30.53% (55224/180884)
Branch Coverage 26.85% (27332/101782)

@doris-robot

TPC-DS: Total hot run time: 192456 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7377637687b7344b0cec0c66d24e87c3f62218b1, data reload: false

query1	946	404	417	404
query2	6322	2015	1963	1963
query3	8689	203	202	202
query4	33431	23850	23576	23576
query5	3806	469	460	460
query6	296	188	188	188
query7	4213	316	334	316
query8	293	230	230	230
query9	9274	2601	2608	2601
query10	494	270	260	260
query11	18623	15276	15565	15276
query12	155	106	101	101
query13	1547	417	407	407
query14	9049	6449	6509	6449
query15	238	170	177	170
query16	7917	477	499	477
query17	1568	621	577	577
query18	2097	312	327	312
query19	220	160	151	151
query20	116	116	119	116
query21	199	108	103	103
query22	4714	4649	4688	4649
query23	35036	34005	34016	34005
query24	12833	2898	2980	2898
query25	693	398	403	398
query26	1803	166	162	162
query27	3379	342	344	342
query28	7635	2207	2186	2186
query29	1057	441	435	435
query30	272	163	156	156
query31	1065	829	841	829
query32	96	56	57	56
query33	817	300	314	300
query34	1327	497	527	497
query35	900	759	730	730
query36	1134	944	968	944
query37	196	72	68	68
query38	4096	3989	4035	3989
query39	1526	1495	1465	1465
query40	262	106	107	106
query41	58	52	50	50
query42	118	108	108	108
query43	546	504	499	499
query44	1307	823	815	815
query45	196	177	170	170
query46	1181	753	753	753
query47	2062	1953	1978	1953
query48	505	404	412	404
query49	1191	414	421	414
query50	837	436	429	429
query51	7486	7268	7249	7249
query52	104	94	92	92
query53	274	193	193	193
query54	1220	494	481	481
query55	82	84	79	79
query56	297	280	264	264
query57	1351	1199	1224	1199
query58	239	217	240	217
query59	3316	3074	2946	2946
query60	300	278	274	274
query61	147	130	128	128
query62	863	707	675	675
query63	217	189	187	187
query64	5107	739	625	625
query65	3276	3187	3224	3187
query66	1368	298	304	298
query67	15868	15554	15494	15494
query68	4738	591	595	591
query69	449	272	260	260
query70	1214	1109	1128	1109
query71	355	257	251	251
query72	6239	4178	4005	4005
query73	747	346	352	346
query74	10770	9209	9240	9209
query75	3401	2683	2634	2634
query76	2824	1045	1080	1045
query77	385	267	269	267
query78	10681	9565	9571	9565
query79	1173	580	600	580
query80	800	433	430	430
query81	515	219	215	215
query82	666	90	87	87
query83	242	147	141	141
query84	236	83	82	82
query85	1130	309	291	291
query86	331	311	295	295
query87	4462	4253	4248	4248
query88	3621	2417	2366	2366
query89	412	292	290	290
query90	2114	188	180	180
query91	178	149	172	149
query92	56	52	52	52
query93	1080	553	553	553
query94	781	295	308	295
query95	358	262	256	256
query96	604	279	288	279
query97	3315	3194	3143	3143
query98	231	203	203	203
query99	1535	1318	1302	1302
Total cold run time: 303493 ms
Total hot run time: 192456 ms

@doris-robot

ClickBench: Total hot run time: 30.1 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7377637687b7344b0cec0c66d24e87c3f62218b1, data reload: false

query1	0.03	0.03	0.04
query2	0.07	0.03	0.03
query3	0.24	0.07	0.07
query4	1.61	0.11	0.11
query5	0.50	0.50	0.50
query6	1.13	0.73	0.72
query7	0.04	0.02	0.02
query8	0.04	0.03	0.03
query9	0.55	0.50	0.52
query10	0.57	0.56	0.56
query11	0.15	0.11	0.11
query12	0.15	0.11	0.11
query13	0.61	0.60	0.60
query14	0.76	0.83	0.79
query15	0.85	0.83	0.84
query16	0.39	0.41	0.38
query17	0.99	1.06	1.00
query18	0.24	0.23	0.22
query19	1.95	1.82	1.80
query20	0.01	0.01	0.01
query21	15.39	0.58	0.58
query22	2.24	1.97	1.73
query23	16.84	0.94	0.87
query24	3.33	1.65	0.71
query25	0.16	0.18	0.08
query26	0.46	0.15	0.14
query27	0.04	0.04	0.03
query28	9.91	0.55	0.47
query29	12.60	3.25	3.25
query30	0.25	0.07	0.06
query31	2.85	0.38	0.38
query32	3.23	0.46	0.45
query33	2.99	3.01	3.10
query34	16.92	4.53	4.56
query35	4.57	4.59	4.59
query36	0.66	0.50	0.50
query37	0.08	0.06	0.06
query38	0.06	0.04	0.04
query39	0.04	0.02	0.02
query40	0.15	0.12	0.12
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.02	0.02
Total cold run time: 103.81 s
Total hot run time: 30.1 s

@amorynan
Contributor Author

run cloud_p0

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 82.14% (69/84) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.98% (19683/26250)
Line Coverage 68.21% (200749/294317)
Region Coverage 66.36% (120217/181152)
Branch Coverage 59.68% (60980/102172)

@amorynan amorynan changed the title from "[fix](variant)fix top-array in variant" to "branch-3.0[fix](variant)fix top-array in variant" Aug 18, 2025
Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 18, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@dataroaring dataroaring merged commit fc80989 into apache:branch-3.0 Aug 18, 2025
23 of 26 checks passed
@gavinchou gavinchou mentioned this pull request Sep 1, 2025

Labels

approved Indicates a PR has been approved by one committer. reviewed

6 participants
