[Enhancement](parquet)update runtime filter when read next parquet row group. #59053

hubgeter · 2025-12-15T14:27:25Z

What problem does this PR solve?

Problem Summary:
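This pull request improves filtering by fetching the latest join runtime filter each time a Parquet row group reader is created. Previously, the join runtime filter was fetched only once, at the Parquet file level, so a filter that became available while a file was being read was not applied to that file's remaining row groups.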
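
The difference between the two fetch points can be sketched in isolation. This is a simplified illustration under stated assumptions, not the Doris implementation: `RowGroup`, `FilterSource`, and `scan_file` are hypothetical names, and a join runtime filter is modeled as a plain set of allowed keys that only becomes available after a given number of polls.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Parquet row group: just a list of join keys.
using RowGroup = std::vector<int>;

// Hypothetical stand-in for a join runtime filter that arrives late.
// The first `ready_after` polls report "not ready yet".
class FilterSource {
public:
    FilterSource(std::unordered_set<int> keys, std::size_t ready_after)
            : _keys(std::move(keys)), _ready_after(ready_after) {}

    // Copies the filter into *out and returns true once it has arrived.
    bool poll(std::unordered_set<int>* out) {
        if (_polls++ < _ready_after) return false;
        *out = _keys;
        return true;
    }

private:
    std::unordered_set<int> _keys;
    std::size_t _ready_after;
    std::size_t _polls = 0;
};

// Scans every row group and returns the number of rows that survive.
// per_row_group == false: poll once, before the first row group (file level).
// per_row_group == true:  poll again before each later row group.
std::size_t scan_file(const std::vector<RowGroup>& file, FilterSource source,
                      bool per_row_group) {
    std::unordered_set<int> filter;
    bool have_filter = source.poll(&filter);
    std::size_t rows_out = 0;
    for (std::size_t i = 0; i < file.size(); ++i) {
        if (per_row_group && i > 0 && !have_filter) {
            have_filter = source.poll(&filter);
        }
        for (int key : file[i]) {
            // Without a filter every row passes; with one, only matching keys.
            if (!have_filter || filter.count(key) > 0) ++rows_out;
        }
    }
    return rows_out;
}
```

For a file of three row groups `{1, 2, 3}` and a filter `{1}` that is ready on the second poll, the file-level mode keeps all 9 rows, while the per-row-group mode keeps 3 rows from the first group and 1 from each of the other two, 5 in total.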

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

Thearas · 2025-12-15T14:27:36Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

hubgeter · 2025-12-15T14:27:36Z

run buildall

github-actions · 2025-12-16T07:20:19Z

Possible file(s) that should be tracked in LFS detected: 🚨

The following file(s) exceeds the file size limit: 1048576 bytes, as set in the .yml configuration files:

docker/thirdparties/docker-compose/hive/scripts/preinstalled_data/parquet_table/runtime_filter_fact_big/fact_big.parquet

Consider using git-lfs to manage large files.

hubgeter · 2025-12-16T07:42:53Z

run buildall

doris-robot · 2025-12-16T08:45:14Z

TPC-H: Total hot run time: 32192 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 71cefb9ff1c6addc831236aaa8eb0909f83e8b4f, data reload: false

------ Round 1 ----------------------------------
q1	17624	4275	4053	4053
q2	2029	361	242	242
q3	10162	1306	750	750
q4	10204	820	309	309
q5	7480	2141	1917	1917
q6	181	172	136	136
q7	1013	863	709	709
q8	9350	1430	1176	1176
q9	5055	4894	4704	4704
q10	6855	2399	1958	1958
q11	527	328	297	297
q12	652	727	597	597
q13	17775	3713	3032	3032
q14	285	301	270	270
q15	597	529	521	521
q16	684	668	631	631
q17	678	826	483	483
q18	6901	6515	6278	6278
q19	1106	946	592	592
q20	400	372	253	253
q21	3143	2511	2325	2325
q22	1062	1020	959	959
Total cold run time: 103763 ms
Total hot run time: 32192 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4095	3998	4012	3998
q2	323	411	314	314
q3	2160	2694	2272	2272
q4	1312	1745	1307	1307
q5	4217	4141	4257	4141
q6	210	168	125	125
q7	1903	2068	1941	1941
q8	2717	2578	2569	2569
q9	7667	7524	7529	7524
q10	3133	3231	2846	2846
q11	588	526	493	493
q12	702	746	632	632
q13	3554	3926	3374	3374
q14	295	311	268	268
q15	560	536	499	499
q16	652	720	630	630
q17	1272	1438	1436	1436
q18	7980	7664	7820	7664
q19	883	846	860	846
q20	2075	2018	2005	2005
q21	4937	4323	4118	4118
q22	1038	1039	960	960
Total cold run time: 52273 ms
Total hot run time: 49962 ms

doris-robot · 2025-12-16T08:56:17Z

TPC-DS: Total hot run time: 176509 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 71cefb9ff1c6addc831236aaa8eb0909f83e8b4f, data reload: false

query5	4373	595	435	435
query6	353	234	223	223
query7	4223	459	275	275
query8	304	258	248	248
query9	8787	2545	2522	2522
query10	505	369	342	342
query11	15575	14946	14659	14659
query12	164	112	112	112
query13	1248	491	393	393
query14	5443	2975	2760	2760
query14_1	2639	2655	2637	2637
query15	210	199	180	180
query16	894	460	435	435
query17	1038	728	616	616
query18	2431	444	353	353
query19	241	241	212	212
query20	121	119	112	112
query21	221	137	117	117
query22	3933	3993	3902	3902
query23	16627	16238	15815	15815
query23_1	16134	16128	16102	16102
query24	7352	1654	1249	1249
query24_1	1252	1244	1248	1244
query25	584	498	454	454
query26	1253	276	168	168
query27	2759	461	314	314
query28	4491	2165	2148	2148
query29	839	565	472	472
query30	319	244	223	223
query31	833	701	638	638
query32	83	73	72	72
query33	560	354	304	304
query34	932	920	564	564
query35	782	834	748	748
query36	876	904	826	826
query37	136	99	81	81
query38	2863	2854	2847	2847
query39	767	760	727	727
query39_1	796	715	714	714
query40	229	143	124	124
query41	71	68	67	67
query42	113	102	105	102
query43	433	443	404	404
query44	1366	753	753	753
query45	201	196	185	185
query46	945	979	618	618
query47	1660	1685	1596	1596
query48	331	338	254	254
query49	650	447	365	365
query50	676	315	229	229
query51	3856	3909	3796	3796
query52	110	118	103	103
query53	335	351	297	297
query54	296	276	303	276
query55	77	74	74	74
query56	287	299	287	287
query57	1148	1145	1087	1087
query58	269	246	256	246
query59	2425	2555	2386	2386
query60	303	300	278	278
query61	165	159	159	159
query62	709	685	622	622
query63	328	298	312	298
query64	5003	1278	973	973
query65	4011	3942	3944	3942
query66	1467	467	319	319
query67	15125	15067	14755	14755
query68	8355	1009	721	721
query69	505	342	311	311
query70	1129	967	1009	967
query71	398	311	275	275
query72	5766	3564	3546	3546
query73	765	713	315	315
query74	8825	8848	8639	8639
query75	3183	3089	2744	2744
query76	3930	1120	739	739
query77	653	400	289	289
query78	9442	9648	8799	8799
query79	1418	876	617	617
query80	683	660	561	561
query81	526	266	238	238
query82	207	139	104	104
query83	265	263	242	242
query84	293	118	104	104
query85	900	499	468	468
query86	388	306	270	270
query87	2998	3115	2973	2973
query88	3590	2281	2276	2276
query89	468	414	409	409
query90	2248	170	160	160
query91	179	164	143	143
query92	82	70	64	64
query93	1668	907	565	565
query94	478	307	266	266
query95	571	384	305	305
query96	599	490	206	206
query97	2269	2302	2220	2220
query98	216	198	194	194
query99	1294	1294	1236	1236
Total cold run time: 259884 ms
Total hot run time: 176509 ms

doris-robot · 2025-12-16T09:01:18Z

ClickBench: Total hot run time: 27.14 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 71cefb9ff1c6addc831236aaa8eb0909f83e8b4f, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.06	0.05
query3	0.25	0.09	0.09
query4	1.61	0.12	0.11
query5	0.26	0.25	0.26
query6	1.15	0.64	0.65
query7	0.02	0.02	0.02
query8	0.05	0.04	0.04
query9	0.56	0.50	0.51
query10	0.55	0.56	0.56
query11	0.16	0.11	0.11
query12	0.15	0.12	0.11
query13	0.62	0.60	0.60
query14	0.99	0.99	0.98
query15	0.81	0.80	0.81
query16	0.39	0.41	0.39
query17	1.00	1.00	1.03
query18	0.22	0.22	0.21
query19	1.88	1.85	1.81
query20	0.02	0.01	0.01
query21	15.43	0.28	0.14
query22	4.83	0.05	0.05
query23	16.05	0.28	0.10
query24	1.50	0.31	0.25
query25	0.05	0.05	0.06
query26	0.14	0.14	0.14
query27	0.06	0.08	0.06
query28	3.28	1.23	1.02
query29	12.61	4.09	3.22
query30	0.28	0.14	0.12
query31	2.81	0.64	0.40
query32	3.24	0.55	0.45
query33	3.00	2.97	3.00
query34	16.83	5.22	4.51
query35	4.58	4.55	4.51
query36	0.67	0.49	0.48
query37	0.11	0.06	0.07
query38	0.07	0.04	0.03
query39	0.04	0.03	0.03
query40	0.16	0.14	0.12
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 96.74 s
Total hot run time: 27.14 s

hello-stephen · 2025-12-16T10:15:59Z

BE UT Coverage Report

Increment line coverage 55.10% (27/49) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	53.42% (18837/35264)
Line Coverage	39.20% (174439/444979)
Region Coverage	33.78% (135048/399771)
Branch Coverage	34.68% (58149/167684)

hello-stephen · 2025-12-16T11:53:10Z

BE Regression && UT Coverage Report

Increment line coverage 100.00% (49/49) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	72.22% (24958/34557)
Line Coverage	58.99% (262127/444364)
Region Coverage	53.81% (217587/404398)
Branch Coverage	55.37% (93299/168495)

github-actions · 2025-12-16T14:42:08Z

PR approved by at least one committer and no changes requested.

github-actions · 2025-12-16T14:42:10Z

PR approved by anyone and no changes requested.

Copilot

Pull request overview

This PR enhances Parquet row group filtering by fetching the latest join runtime filters when creating each row group reader, rather than only at the file level. This allows for more efficient data filtering as runtime filters become available during query execution.

Key Changes:

- Added runtime filter update mechanism that is called when creating new Parquet row group readers
- Introduced test coverage to verify that runtime filters are applied across multiple row groups

Reviewed changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
regression-test/suites/external_table_p0/hive/test_parquet_join_runtime_filter.groovy	Adds comprehensive test suite for verifying runtime filter application across Parquet row groups
docker/thirdparties/docker-compose/hive/scripts/create_preinstalled_scripts/run84.hql	Creates test tables for runtime filter validation
be/src/vec/exec/scan/scanner.cpp	Initializes total runtime filter count during scanner construction
be/src/vec/exec/scan/file_scanner.h	Updates method signature to track runtime filter changes
be/src/vec/exec/scan/file_scanner.cpp	Implements runtime filter update callback for Parquet readers
be/src/vec/exec/format/parquet/vparquet_reader.h	Adds callback mechanism and lazy read context update method
be/src/vec/exec/format/parquet/vparquet_reader.cpp	Implements runtime filter updates when creating row group readers
be/src/vec/exec/format/parquet/vparquet_group_reader.h	Extends LazyReadContext to include partition and missing column information
be/src/runtime_filter/runtime_filter_consumer_helper.h	Exposes runtime filter count accessor

Copilot · 2025-12-16T14:50:33Z

be/src/vec/exec/scan/scanner.cpp

          _output_tuple_desc(_local_state->output_tuple_desc()),
          _output_row_descriptor(_local_state->_parent->output_row_descriptor()),
          _has_prepared(false) {
+    _total_rf_num = cast_set<int>(_local_state->_helper.runtime_filter_nums());


The function name cast_set is ambiguous and doesn't clearly convey its purpose. Consider renaming to something more descriptive like set_int_value or using direct casting syntax like static_cast<int>.

Suggested change

-    _total_rf_num = cast_set<int>(_local_state->_helper.runtime_filter_nums());
+    _total_rf_num = static_cast<int>(_local_state->_helper.runtime_filter_nums());
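
For context on the trade-off in this suggestion: a plain `static_cast<int>` converts without any range check, whereas a checked helper can reject values that do not fit. The sketch below assumes that `cast_set` is meant as a range-checked narrowing helper; its definition is not shown in this thread, so that is an assumption. `checked_narrow` is a hypothetical name used only for illustration.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper (not the Doris cast_set): converts an unsigned value
// to a signed integer type and throws if the value does not fit.
template <typename To, typename From>
To checked_narrow(From value) {
    static_assert(std::is_integral<To>::value && std::is_signed<To>::value,
                  "To must be a signed integer type");
    static_assert(std::is_integral<From>::value && std::is_unsigned<From>::value,
                  "From must be an unsigned integer type");
    using UnsignedTo = typename std::make_unsigned<To>::type;
    // Compare in the unsigned domain so the check itself cannot overflow.
    if (value > static_cast<UnsignedTo>(std::numeric_limits<To>::max())) {
        throw std::overflow_error("value does not fit in target type");
    }
    return static_cast<To>(value);
}
```

A call such as `checked_narrow<int>(std::size_t{42})` returns 42, while passing `std::numeric_limits<std::size_t>::max()` throws instead of producing a wrapped value.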

Copilot · 2025-12-16T14:50:33Z

be/src/vec/exec/format/parquet/vparquet_reader.cpp

+    new_lazy_read_ctx.fill_partition_columns = std::move(_lazy_read_ctx.fill_partition_columns);
+    new_lazy_read_ctx.fill_missing_columns = std::move(_lazy_read_ctx.fill_missing_columns);


Moving from _lazy_read_ctx and then assigning back to it on line 416 leaves the source in a moved-from state. Consider copying these values instead of moving, or restructure to avoid the circular dependency.

Suggested change

-    new_lazy_read_ctx.fill_partition_columns = std::move(_lazy_read_ctx.fill_partition_columns);
-    new_lazy_read_ctx.fill_missing_columns = std::move(_lazy_read_ctx.fill_missing_columns);
+    new_lazy_read_ctx.fill_partition_columns = _lazy_read_ctx.fill_partition_columns;
+    new_lazy_read_ctx.fill_missing_columns = _lazy_read_ctx.fill_missing_columns;
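
The pattern under discussion can be reproduced in isolation. In the sketch below, `Ctx`, its `std::vector` members, and `rebuild` are hypothetical stand-ins for `LazyReadContext` and the update method. It assumes, as the comment describes, that the new context is assigned back to the source object afterwards. After `std::move`, a standard container is left in a valid but unspecified state, so the moved-from members should not be read until that assignment has replaced them.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for LazyReadContext.
struct Ctx {
    std::vector<std::string> fill_partition_columns;
    std::vector<std::string> fill_missing_columns;
    std::vector<std::string> predicate_columns;
};

// Rebuilds `ctx` with new predicate columns while carrying over the two
// fill_* members from the old context.
void rebuild(Ctx& ctx, std::vector<std::string> new_predicates) {
    Ctx fresh;
    // Move: avoids a copy, but ctx.fill_* are now valid-but-unspecified.
    fresh.fill_partition_columns = std::move(ctx.fill_partition_columns);
    fresh.fill_missing_columns = std::move(ctx.fill_missing_columns);
    fresh.predicate_columns = std::move(new_predicates);
    // Nothing may read ctx.fill_* between the moves above and this line.
    ctx = std::move(fresh);
}
```

Once `rebuild` returns, `ctx` holds the carried-over members again, so the move is safe as long as no code reads the old context in between. Copying, as the suggestion proposes, removes that constraint at the cost of one copy per member.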

Copilot · 2025-12-16T14:50:34Z

be/src/vec/exec/format/parquet/vparquet_reader.h

+    // when create new row group reader, call this function to get lasted runtime filter conjuncts.
+    std::function<Status(bool*, VExprContextSPtrs&)> _call_late_rf_func = [](bool* changed,
+                                                                             VExprContextSPtrs&) {
+        *changed = false;
+        return Status::OK();
+    };


The default lambda implementation always sets changed to false and returns OK. Consider documenting why this is the default behavior or making it more explicit with a named static function.

Suggested change

-    // when create new row group reader, call this function to get lasted runtime filter conjuncts.
-    std::function<Status(bool*, VExprContextSPtrs&)> _call_late_rf_func = [](bool* changed,
-                                                                             VExprContextSPtrs&) {
-        *changed = false;
-        return Status::OK();
-    };
+    // when creating a new row group reader, call this function to get the latest runtime filter conjuncts.
+    // The default implementation does nothing, sets 'changed' to false, and returns OK.
+    // This is used when no late runtime filter is required.
+    static Status default_late_rf_func(bool* changed, VExprContextSPtrs&) {
+        *changed = false;
+        return Status::OK();
+    }
+    std::function<Status(bool*, VExprContextSPtrs&)> _call_late_rf_func = default_late_rf_func;
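
A self-contained sketch of the callback pattern in this diff follows. `Status`, `Conjuncts`, `Reader`, `set_late_rf_func`, and `next_row_group` are simplified, hypothetical stand-ins (the real code uses Doris's `Status` and `VExprContextSPtrs`); only the shape of the `std::function` default and its override is meant to carry over.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal Status type, for illustration only.
struct Status {
    bool is_ok;
    static Status OK() { return Status{true}; }
    bool ok() const { return is_ok; }
};

using Conjuncts = std::vector<std::string>;  // stand-in for VExprContextSPtrs
using LateRfFunc = std::function<Status(bool*, Conjuncts&)>;

class Reader {
public:
    // The scanner side can install a callback that returns fresh conjuncts.
    void set_late_rf_func(LateRfFunc f) { _call_late_rf_func = std::move(f); }

    // Called once per row group; returns how many conjuncts are in effect.
    std::size_t next_row_group() {
        bool changed = false;
        Conjuncts latest;
        Status st = _call_late_rf_func(&changed, latest);
        if (st.ok() && changed) _conjuncts = std::move(latest);
        return _conjuncts.size();
    }

private:
    // Default: report "nothing changed", so a reader without an installed
    // callback keeps whatever conjuncts it already has.
    static Status default_late_rf_func(bool* changed, Conjuncts&) {
        *changed = false;
        return Status::OK();
    }

    LateRfFunc _call_late_rf_func = default_late_rf_func;
    Conjuncts _conjuncts;
};
```

Keeping a callable default means the reader never has to test the `std::function` for emptiness before invoking it.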

morningman · 2025-12-17T06:18:52Z

run buildall

doris-robot · 2025-12-17T08:17:20Z

TPC-H: Total hot run time: 32772 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 44fa16dc2289d34007c8b5b4940ca65e134d70b2, data reload: false

------ Round 1 ----------------------------------
q1	17651	4284	4089	4089
q2	2020	359	245	245
q3	10163	1345	778	778
q4	10255	910	319	319
q5	8201	2173	1909	1909
q6	241	167	136	136
q7	1003	872	730	730
q8	9364	1513	1188	1188
q9	5193	4895	4865	4865
q10	6882	2399	1944	1944
q11	528	326	302	302
q12	679	719	600	600
q13	17825	3689	3052	3052
q14	288	309	269	269
q15	599	519	512	512
q16	719	684	643	643
q17	689	833	504	504
q18	7483	6460	6254	6254
q19	1151	980	620	620
q20	400	373	250	250
q21	3141	2593	2596	2593
q22	1024	983	970	970
Total cold run time: 105499 ms
Total hot run time: 32772 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4141	4292	4067	4067
q2	327	418	326	326
q3	2164	2673	2281	2281
q4	1336	1781	1264	1264
q5	4262	4141	4182	4141
q6	211	169	128	128
q7	2426	2047	1827	1827
q8	2733	2496	2609	2496
q9	7567	7479	7554	7479
q10	3153	3218	2908	2908
q11	631	546	503	503
q12	665	761	615	615
q13	3609	4068	3450	3450
q14	288	289	303	289
q15	562	512	508	508
q16	643	665	662	662
q17	1299	1479	1423	1423
q18	7987	7734	7535	7535
q19	948	902	911	902
q20	2031	2079	1930	1930
q21	4945	4294	4141	4141
q22	1055	1004	947	947
Total cold run time: 52983 ms
Total hot run time: 49822 ms

doris-robot · 2025-12-17T08:28:21Z

TPC-DS: Total hot run time: 176928 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 44fa16dc2289d34007c8b5b4940ca65e134d70b2, data reload: false

query5	4533	601	445	445
query6	336	234	220	220
query7	4221	465	281	281
query8	310	250	226	226
query9	8769	2543	2551	2543
query10	520	400	335	335
query11	15262	14814	14598	14598
query12	190	122	118	118
query13	1253	495	385	385
query14	5799	3019	2825	2825
query14_1	2720	2641	2666	2641
query15	231	208	182	182
query16	878	475	453	453
query17	1160	731	615	615
query18	2446	455	406	406
query19	243	237	213	213
query20	127	117	115	115
query21	220	148	118	118
query22	3891	4101	3978	3978
query23	16652	16064	15991	15991
query23_1	15995	16111	15941	15941
query24	7381	1661	1243	1243
query24_1	1251	1238	1270	1238
query25	610	540	453	453
query26	1261	276	188	188
query27	2726	474	322	322
query28	4462	2178	2118	2118
query29	803	530	446	446
query30	315	244	218	218
query31	818	677	604	604
query32	82	70	74	70
query33	545	362	298	298
query34	925	895	538	538
query35	788	819	735	735
query36	877	889	801	801
query37	135	96	75	75
query38	2860	2899	2855	2855
query39	773	731	719	719
query39_1	706	714	704	704
query40	233	144	124	124
query41	71	64	64	64
query42	112	115	107	107
query43	424	452	408	408
query44	1355	753	748	748
query45	200	186	188	186
query46	891	977	627	627
query47	1666	1695	1618	1618
query48	325	350	258	258
query49	626	447	366	366
query50	670	307	228	228
query51	3806	3901	3792	3792
query52	107	119	105	105
query53	325	353	292	292
query54	334	272	256	256
query55	80	77	71	71
query56	314	309	298	298
query57	1142	1133	1072	1072
query58	286	274	254	254
query59	2401	2482	2362	2362
query60	317	311	300	300
query61	170	158	162	158
query62	693	686	624	624
query63	330	300	302	300
query64	5016	1292	1033	1033
query65	4039	3983	3946	3946
query66	1470	470	329	329
query67	15243	15051	14848	14848
query68	5963	1015	733	733
query69	518	347	311	311
query70	1104	989	981	981
query71	363	303	285	285
query72	6221	3562	3371	3371
query73	769	733	321	321
query74	8785	8908	8719	8719
query75	3117	3113	2747	2747
query76	3803	1155	729	729
query77	536	398	293	293
query78	9432	9636	8792	8792
query79	1035	884	634	634
query80	1174	660	576	576
query81	548	273	237	237
query82	415	138	111	111
query83	396	263	244	244
query84	252	129	108	108
query85	945	526	480	480
query86	382	299	280	280
query87	3008	3045	2949	2949
query88	3283	2285	2305	2285
query89	470	433	396	396
query90	2026	177	161	161
query91	182	169	146	146
query92	77	67	65	65
query93	1019	936	564	564
query94	527	303	284	284
query95	581	350	312	312
query96	599	493	211	211
query97	2287	2348	2223	2223
query98	208	200	196	196
query99	1295	1298	1215	1215
Total cold run time: 257274 ms
Total hot run time: 176928 ms

doris-robot · 2025-12-17T08:33:25Z

ClickBench: Total hot run time: 27.49 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 44fa16dc2289d34007c8b5b4940ca65e134d70b2, data reload: false

query1	0.06	0.04	0.04
query2	0.09	0.05	0.04
query3	0.26	0.09	0.09
query4	1.61	0.12	0.11
query5	0.27	0.27	0.25
query6	1.18	0.64	0.64
query7	0.04	0.03	0.02
query8	0.06	0.04	0.04
query9	0.57	0.50	0.51
query10	0.57	0.56	0.57
query11	0.16	0.12	0.11
query12	0.15	0.12	0.11
query13	0.61	0.62	0.60
query14	0.98	0.99	1.00
query15	0.81	0.79	0.82
query16	0.40	0.39	0.41
query17	1.02	1.00	1.03
query18	0.24	0.21	0.22
query19	1.92	1.80	1.79
query20	0.02	0.01	0.01
query21	15.44	0.31	0.14
query22	4.73	0.05	0.06
query23	16.18	0.29	0.10
query24	1.26	0.65	0.46
query25	0.11	0.06	0.14
query26	0.14	0.15	0.13
query27	0.08	0.05	0.05
query28	5.44	1.23	1.03
query29	12.58	4.04	3.23
query30	0.28	0.15	0.11
query31	2.82	0.62	0.40
query32	3.24	0.57	0.47
query33	3.12	3.00	3.03
query34	16.94	5.15	4.59
query35	4.64	4.55	4.53
query36	0.65	0.52	0.49
query37	0.10	0.06	0.07
query38	0.07	0.04	0.04
query39	0.04	0.04	0.04
query40	0.16	0.14	0.13
query41	0.09	0.04	0.03
query42	0.05	0.03	0.03
query43	0.04	0.03	0.03
Total cold run time: 99.22 s
Total hot run time: 27.49 s

hello-stephen · 2025-12-17T09:43:55Z

BE UT Coverage Report

Increment line coverage 55.10% (27/49) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	53.43% (18843/35268)
Line Coverage	39.21% (174467/444968)
Region Coverage	33.77% (135012/399742)
Branch Coverage	34.68% (58152/167679)

hubgeter · 2025-12-17T15:06:50Z

run buildall

hello-stephen · 2025-12-23T06:53:22Z

BE Regression && UT Coverage Report

Increment line coverage 100.00% (48/48) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	72.20% (25010/34641)
Line Coverage	58.95% (262475/445257)
Region Coverage	53.81% (218131/405352)
Branch Coverage	55.29% (93425/168964)

morningman

LGTM

github-actions · 2025-12-28T14:54:11Z

PR approved by at least one committer and no changes requested.

hello-stephen · 2025-12-28T15:20:46Z

BE Regression && UT Coverage Report

Increment line coverage 100.00% (44/44) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	72.20% (25010/34641)
Line Coverage	58.95% (262475/445257)
Region Coverage	53.81% (218131/405352)
Branch Coverage	55.29% (93425/168964)

…t parquet row group. (#59053) (#59725)

bp #59053 bp #59557

### What problem does this PR solve?

Problem Summary: This pull request achieves better filtering by fetching the latest join runtime filter when creating the Parquet row group reader. Previously, the join runtime filter was fetched at the Parquet file level.

…ax filter (#60197)

### What problem does this PR solve?

Problem Summary:
1. Fixed an issue where the Parquet reader's pushdown predicate row group min-max filter was ineffective due to a refactoring of the column predicate in #59187.
2. Removed tablet_schema from RuntimePredicate when performing topn runtime filters, so that RuntimePredicate is no longer limited to the olap scan node.
3. Removed the feature from #59053 that let the Parquet reader obtain runtime filters in real time, in order to unify the implementation of the olap and file scan node logic in the future.
4. Temporarily disabled the topn runtime filter for the VARBINARY type, because the column predicate implementation for this type is incomplete, which affects PR #58721.

…60474)

### What problem does this PR solve?

Related PR: #60197

Problem Summary: This fixes the issue where Parquet reader lazy materialization became ineffective in PR #60197, caused by the removal of the feature from #59053.

hubgeter force-pushed the join_rf_parquet_2 branch from b0d62d6 to fcacd4c Compare December 16, 2025 07:19

github-actions bot added the lfs-detected! Warning Label for use when LFS is detected in the commits of a Pull Request label Dec 16, 2025

github-actions bot removed the lfs-detected! Warning Label for use when LFS is detected in the commits of a Pull Request label Dec 16, 2025

hubgeter marked this pull request as ready for review December 16, 2025 07:47

morningman added the dev/4.0.x label Dec 16, 2025

morningman previously approved these changes Dec 16, 2025

github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 16, 2025

github-actions bot added the reviewed label Dec 16, 2025

morningman requested a review from Copilot December 16, 2025 14:43

Copilot AI reviewed Dec 16, 2025

morningman force-pushed the join_rf_parquet_2 branch from 71cefb9 to 44fa16d Compare December 17, 2025 06:18

hubgeter dismissed morningman’s stale review via 2937fe0 December 17, 2025 15:06

hubgeter force-pushed the join_rf_parquet_2 branch from 44fa16d to 2937fe0 Compare December 17, 2025 15:06

morningman pushed a commit that referenced this pull request Dec 25, 2025

[Enhancement](parquet)update runtime filter when read next parquet ro…

f23247f

…w group.(#59053) (#59181) bp #59053

morningman pushed a commit that referenced this pull request Dec 25, 2025

[Enhancement](parquet)update runtime filter when read next parquet ro…

8147ffe

…w group.(#59053) (#59181) bp #59053

morningman pushed a commit that referenced this pull request Dec 26, 2025

[Enhancement](parquet)update runtime filter when read next parquet ro…

9b897fc

…w group.(#59053) (#59181) bp #59053

morningman approved these changes Dec 28, 2025

github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 28, 2025

CalvinKirs approved these changes Dec 31, 2025

morningman merged commit 9c36839 into apache:master Dec 31, 2025
26 of 28 checks passed

github-actions bot added the dev/4.0.x-conflict label Dec 31, 2025

hubgeter mentioned this pull request Jan 5, 2026

[fix](case)fix unstable parquet join runtime filter case. #59557

Merged

16 tasks

morningman pushed a commit that referenced this pull request Jan 7, 2026

[fix](case)fix unstable parquet join runtime filter case. (#59557)

49becf6

### What problem does this PR solve? Problem Summary: case from pr : #59053

hubgeter mentioned this pull request Jan 9, 2026

branch-4.0: [Enhancement](parquet)update runtime filter when read next parquet row group. (#59053) #59725

Merged

hubgeter added a commit to hubgeter/doris that referenced this pull request Jan 9, 2026

[fix](case)fix unstable parquet join runtime filter case. (apache#59557)

d544c7a

### What problem does this PR solve? Problem Summary: case from pr : apache#59053

yiguolei added dev/4.0.3-merged and removed dev/4.0.x dev/4.0.x-conflict labels Jan 12, 2026

zzzxl1993 pushed a commit to zzzxl1993/doris that referenced this pull request Jan 13, 2026

[fix](case)fix unstable parquet join runtime filter case. (apache#59557)

83525c2

### What problem does this PR solve? Problem Summary: case from pr : apache#59053

hubgeter mentioned this pull request Jan 24, 2026

[fix](parquet)fix parquet reader cannot push down conjuncts for min-max filter #60197

Merged

16 tasks

This was referenced Feb 3, 2026

[fix](parquet)fix parquet reader lazy materialization cannot filter. #60474

Merged

[fix](parquet)fix parquet reader lazy materialization cannot filter. #60486

Merged

zclllyybb pushed a commit to zclllyybb/doris that referenced this pull request Feb 9, 2026

[Enhancement](parquet)update runtime filter when read next parquet ro…

86b57d3

…w group.(apache#59053) (apache#59181) bp apache#59053
