
branch-3.1: [opt](inverted index) create non analyzer when parser is none for inverted index #54666 #54786

Merged
morrySnow merged 2 commits into apache:branch-3.1 from airborne12:pick_54666_to_origin_branch-3.1
Aug 15, 2025

Conversation

@airborne12
Member

cherry pick from #54666

…erted index (apache#54666)

Issue Number: close #xxx

Related PR: apache#54619

Problem Summary:
When no parser is specified, the inverted index writer currently creates
a default analyzer (the simple analyzer) anyway, which can add unnecessary
performance overhead. This PR avoids that overhead by setting the analyzer
to nullptr when the parser is none.
Note: This PR should be merged together with, or after, apache#54619.
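
The idea can be illustrated with a minimal, self-contained C++ sketch. It is not the code from this PR: `ParserType`, `Analyzer`, `create_analyzer`, and `index_terms` are hypothetical names introduced only for illustration, and the whitespace tokenizer is a stand-in for a real analyzer. The sketch assumes that a value indexed without a parser is stored as a single untokenized term.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the actual Doris or CLucene types.
enum class ParserType { kNone, kEnglish };

struct Analyzer {
    explicit Analyzer(std::string n) : name(std::move(n)) {}
    std::string name;

    // Toy tokenizer: splits on whitespace. A real analyzer does much more.
    std::vector<std::string> tokenize(const std::string& value) const {
        std::vector<std::string> tokens;
        std::istringstream in(value);
        std::string token;
        while (in >> token) {
            tokens.push_back(token);
        }
        return tokens;
    }
};

// Returns nullptr when no parser is configured, so no analyzer is constructed.
std::unique_ptr<Analyzer> create_analyzer(ParserType parser) {
    if (parser == ParserType::kNone) {
        return nullptr;
    }
    return std::make_unique<Analyzer>("english");
}

// Assumption: without an analyzer the whole value is indexed as one term.
std::vector<std::string> index_terms(const Analyzer* analyzer, const std::string& value) {
    if (analyzer == nullptr) {
        return {value};
    }
    return analyzer->tokenize(value);
}
```

Callers that receive a null analyzer must check for it before use, which is presumably why the change depends on the related PR being merged first.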
@airborne12
Member Author

run buildall

@airborne12 requested a review from morrySnow as a code owner August 14, 2025 10:15
@Thearas
Contributor

Thearas commented Aug 14, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33695 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e3bd016b8b714c698e2ea70dd33e22c9895f8e78, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17801	5716	5729	5716
q2	2037	409	284	284
q3	12316	1345	771	771
q4	10229	922	465	465
q5	8368	2537	2301	2301
q6	202	172	134	134
q7	965	757	636	636
q8	9342	1600	1218	1218
q9	5494	5259	5161	5161
q10	6856	2334	1843	1843
q11	488	286	262	262
q12	354	389	215	215
q13	17787	3693	3008	3008
q14	229	226	213	213
q15	538	481	467	467
q16	447	435	391	391
q17	604	911	381	381
q18	7160	6426	6502	6426
q19	1532	1082	569	569
q20	339	339	204	204
q21	3099	2356	2046	2046
q22	1098	1062	984	984
Total cold run time: 107285 ms
Total hot run time: 33695 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	6160	5902	5866	5866
q2	257	341	231	231
q3	2328	2715	2316	2316
q4	1444	1864	1411	1411
q5	4654	5155	5183	5155
q6	206	178	127	127
q7	2182	1974	1827	1827
q8	2883	3043	2990	2990
q9	7306	7266	7280	7266
q10	3029	3290	2835	2835
q11	613	528	501	501
q12	670	791	609	609
q13	3506	3920	3199	3199
q14	282	310	277	277
q15	522	471	485	471
q16	443	495	464	464
q17	1344	1911	1296	1296
q18	7745	7482	7313	7313
q19	858	1203	1314	1203
q20	2075	2054	1912	1912
q21	5595	5162	4644	4644
q22	1094	1114	1024	1024
Total cold run time: 55196 ms
Total hot run time: 52937 ms

@doris-robot

TPC-DS: Total hot run time: 192704 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e3bd016b8b714c698e2ea70dd33e22c9895f8e78, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	961	412	424	412
query2	6282	1971	1920	1920
query3	8686	198	194	194
query4	33589	23914	23872	23872
query5	3878	625	446	446
query6	286	187	174	174
query7	4191	526	320	320
query8	314	262	253	253
query9	9610	2632	2602	2602
query10	456	331	265	265
query11	18295	15408	15148	15148
query12	173	109	108	108
query13	1546	561	414	414
query14	10389	7557	6663	6663
query15	276	200	182	182
query16	8117	698	518	518
query17	1620	815	591	591
query18	2151	442	338	338
query19	242	194	172	172
query20	129	118	128	118
query21	210	135	107	107
query22	4485	4586	4352	4352
query23	35089	34083	34694	34083
query24	7662	2747	2752	2747
query25	559	522	439	439
query26	1357	310	172	172
query27	2304	520	381	381
query28	5516	2224	2181	2181
query29	788	645	475	475
query30	255	199	170	170
query31	1069	962	831	831
query32	93	65	61	61
query33	508	378	296	296
query34	777	906	533	533
query35	834	850	748	748
query36	1068	1095	954	954
query37	110	97	77	77
query38	4148	4087	3956	3956
query39	1522	1477	1468	1468
query40	211	125	123	123
query41	54	52	50	50
query42	122	106	100	100
query43	533	540	507	507
query44	1388	813	820	813
query45	190	180	183	180
query46	960	1125	689	689
query47	1970	1990	1954	1954
query48	397	421	349	349
query49	783	520	418	418
query50	720	751	434	434
query51	7246	7245	7173	7173
query52	104	105	98	98
query53	238	291	196	196
query54	571	566	474	474
query55	81	79	81	79
query56	271	273	269	269
query57	1310	1324	1223	1223
query58	242	227	220	220
query59	3090	3249	3048	3048
query60	311	292	275	275
query61	116	116	110	110
query62	827	813	669	669
query63	246	199	188	188
query64	4615	1072	627	627
query65	3467	3377	3328	3328
query66	933	445	315	315
query67	16501	15754	15535	15535
query68	7414	852	532	532
query69	502	333	265	265
query70	1275	1124	1148	1124
query71	384	302	268	268
query72	5802	3916	3941	3916
query73	648	787	351	351
query74	10435	9191	8964	8964
query75	3357	3307	2675	2675
query76	3151	1305	869	869
query77	557	376	276	276
query78	10496	10522	9710	9710
query79	3696	869	590	590
query80	711	541	425	425
query81	528	266	223	223
query82	597	126	94	94
query83	165	159	149	149
query84	250	105	86	86
query85	777	372	291	291
query86	386	326	302	302
query87	4388	4385	4276	4276
query88	5182	2490	2403	2403
query89	443	342	294	294
query90	1817	197	198	197
query91	142	149	111	111
query92	65	60	52	52
query93	2175	866	536	536
query94	684	446	271	271
query95	343	281	277	277
query96	494	684	281	281
query97	3281	3340	3179	3179
query98	247	206	209	206
query99	1677	1478	1326	1326
Total cold run time: 299111 ms
Total hot run time: 192704 ms

@doris-robot

ClickBench: Total hot run time: 29.04 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e3bd016b8b714c698e2ea70dd33e22c9895f8e78, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.03	0.03
query2	0.06	0.03	0.03
query3	0.23	0.07	0.07
query4	1.62	0.11	0.11
query5	0.54	0.52	0.50
query6	1.13	0.72	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.58	0.52	0.50
query10	0.55	0.55	0.56
query11	0.14	0.10	0.13
query12	0.14	0.10	0.11
query13	0.62	0.60	0.59
query14	0.80	0.81	0.81
query15	0.86	0.84	0.83
query16	0.39	0.41	0.38
query17	1.02	1.04	1.03
query18	0.25	0.23	0.22
query19	1.98	1.86	1.89
query20	0.01	0.01	0.01
query21	15.38	0.92	0.59
query22	0.73	0.75	0.70
query23	15.13	1.46	0.53
query24	3.04	0.90	2.12
query25	0.26	0.08	0.21
query26	0.29	0.15	0.13
query27	0.05	0.05	0.05
query28	13.90	1.03	0.45
query29	12.56	3.99	3.30
query30	0.25	0.09	0.07
query31	2.82	0.62	0.40
query32	3.23	0.55	0.47
query33	3.07	3.10	3.06
query34	16.65	5.23	4.50
query35	4.57	4.54	4.60
query36	0.66	0.48	0.48
query37	0.09	0.06	0.07
query38	0.05	0.04	0.04
query39	0.04	0.02	0.03
query40	0.18	0.13	0.12
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 104.11 s
Total hot run time: 29.04 s

@doris-robot

BE UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.46% (12713/27964)
Line Coverage 36.35% (113340/311783)
Region Coverage 33.98% (64845/190853)
Branch Coverage 31.00% (34017/109720)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 75.83% (20886/27542)
Line Coverage 69.11% (215121/311262)
Region Coverage 67.07% (128607/191752)
Branch Coverage 60.70% (66938/110284)

@morrySnow merged commit c343c9c into apache:branch-3.1 Aug 15, 2025
21 of 22 checks passed