[opt](hive) use binary search to prune hive partitions #58877
morningman merged 4 commits into apache:master
            });

            return new SortedPartitionRanges<>(sortedRanges, defaultPartitions);
        }

I think you should extract the code from `NereidsSortedPartitionsCacheManager.loadCache` and reuse the same utility function.
Followup #44586

Enable binary search partition pruning optimization for Hive external tables.

This PR adds binary search partition pruning support for Hive tables by:

- Adding `getSortedPartitionRanges()` method to `ExternalTable` base class
- Maintaining sorted partition ranges directly in `HivePartitionValues` for cache lifecycle consistency
- Overriding `getSortedPartitionRanges()` in `HMSExternalTable` to provide sorted ranges

**Performance improvement (20000 partitions, 1000 queries):**

- Binary search enabled: **4.548 seconds**
- Binary search disabled: **12.849 seconds**
- **~2.8x faster**
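
For readers unfamiliar with the technique, here is a minimal sketch of why keeping partition ranges sorted helps: a binary search locates the first candidate partition, so pruning no longer has to evaluate every partition per query. This is not the code in this PR. `PartitionPruneSketch`, `Range`, and `prune` are hypothetical names, and the sketch assumes single-column integer partitions with non-overlapping half-open ranges `[lower, upper)` and a non-empty query range.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionPruneSketch {
    /** Hypothetical partition covering the half-open interval [lower, upper). */
    static final class Range {
        final String name;
        final int lower;
        final int upper;

        Range(String name, int lower, int upper) {
            this.name = name;
            this.lower = lower;
            this.upper = upper;
        }
    }

    /**
     * Returns the names of partitions overlapping [queryLower, queryUpper).
     * Assumes `sorted` is ordered by lower bound and the ranges do not overlap,
     * so the upper bounds are ordered as well.
     */
    static List<String> prune(List<Range> sorted, int queryLower, int queryUpper) {
        // Binary search for the first partition whose upper bound exceeds queryLower.
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).upper <= queryLower) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Scan forward only while partitions still start before the query ends.
        List<String> selected = new ArrayList<>();
        for (int i = lo; i < sorted.size() && sorted.get(i).lower < queryUpper; i++) {
            selected.add(sorted.get(i).name);
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Range> sorted = List.of(
                new Range("p0", 0, 10),
                new Range("p1", 10, 20),
                new Range("p2", 20, 30),
                new Range("p3", 30, 40));
        System.out.println(prune(sorted, 15, 25)); // prints [p1, p2]
    }
}
```

With N partitions and k matches, this lookup costs O(log N + k) per query instead of O(N), which is consistent with the direction of the speedup reported above for 20000 partitions.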