[refactor](predicate) Refactor predicates on external tables#58905
Gabriel39 merged 5 commits into apache:master
Conversation
run buildall
TPC-H: Total hot run time: 35501 ms
TPC-DS: Total hot run time: 181446 ms
ClickBench: Total hot run time: 27.14 s
run buildall
TPC-H: Total hot run time: 35185 ms
TPC-DS: Total hot run time: 180781 ms
ClickBench: Total hot run time: 27.13 s
run buildall
run buildall
TPC-H: Total hot run time: 35194 ms
TPC-DS: Total hot run time: 178060 ms
ClickBench: Total hot run time: 27.24 s
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
run buildall
PR approved by at least one committer and no changes requested.
TPC-H: Total hot run time: 35290 ms
TPC-DS: Total hot run time: 178572 ms
ClickBench: Total hot run time: 27.28 s
### What problem does this PR solve?
Introduced by #58905
==2037076==ERROR: AddressSanitizer: heap-use-after-free on address 0x7baaae908730 at pc 0x561b769a1fd0 bp 0x7b3caf4ebdf0 sp 0x7b3caf4ebde8
22:30:08 READ of size 1 at 0x7baaae908730 thread T12303 (rs_normal [work)
22:30:08 #0 0x561b769a1fcf in doris::(anonymous namespace)::string_compare(char const*, long, char const*, long, long) /root/doris/be/src/vec/common/string_ref.h:170:29
22:30:08 #1 0x561b769a1fcf in doris::StringRef::compare(doris::StringRef const&) const /root/doris/be/src/vec/common/string_ref.h:259:30
22:30:08 #2 0x561b76f537cd in doris::StringRef::ge(doris::StringRef const&) const /root/doris/be/src/vec/common/string_ref.h:282:52
22:30:08 #3 0x561b76f537cd in doris::StringRef::operator>=(doris::StringRef const&) const /root/doris/be/src/vec/common/string_ref.h:292:60
22:30:08 #4 0x561b76f537cd in bool doris::Compare::greater_equal<doris::StringRef>(doris::StringRef const&, doris::StringRef const&) /root/doris/be/src/common/compare.h:42:18
22:30:08 #5 0x561b76f537cd in doris::ComparisonPredicateBase<(doris::PrimitiveType)23, (doris::PredicateType)6>::camp_field(doris::vectorized::Field const&, doris::vectorized::Field const&) const /root/doris/be/src/olap/comparison_predicate.h:192:20
22:30:08 #6 0x561b76f4baa4 in doris::ComparisonPredicateBase<(doris::PrimitiveType)23, (doris::PredicateType)6>::evaluate_and(doris::vectorized::ParquetPredicate::ColumnStat*) const /root/doris/be/src/olap/comparison_predicate.h:207:26
22:30:08 #7 0x561b76765284 in doris::AndBlockColumnPredicate::evaluate_and(doris::vectorized::ParquetPredicate::ColumnStat*) const /root/doris/be/src/olap/block_column_predicate.h:251:42
22:30:08 #8 0x561b89acd735 in doris::vectorized::ParquetReader::_process_column_stat_filter(tparquet::RowGroup const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, bool*, bool*, bool*) /root/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:1225:25
22:30:08 #9 0x561b89ac8dd7 in doris::vectorized::ParquetReader::_process_min_max_bloom_filter(doris::vectorized::RowGroupReader::RowGroupIndex const&, tparquet::RowGroup const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, doris::segment_v2::RowRanges*) /root/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:1108:9
22:30:08 #10 0x561b89ac3e73 in doris::vectorized::ParquetReader::_next_row_group_reader() /root/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:718:9
22:30:08 #11 0x561b89ac008f in doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) /root/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:607:21
22:30:08 #12 0x561b8a07c6f7 in doris::vectorized::HiveReader::get_next_block_inner(doris::vectorized::Block*, unsigned long*, bool*) /root/doris/be/src/vec/exec/format/table/hive_reader.cpp:32:5
22:30:08 #13 0x561b89fee256 in doris::vectorized::TableFormatReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) /root/doris/be/src/vec/exec/format/table/table_format_reader.h:81:16
22:30:08 #14 0x561b89f71b97 in doris::vectorized::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) /root/doris/be/src/vec/exec/scan/file_scanner.cpp:472:13
22:30:08 #15 0x561b89f7086f in doris::vectorized::FileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) /root/doris/be/src/vec/exec/scan/file_scanner.cpp:409:17
22:30:08 #16 0x561b8a19f86e in doris::vectorized::Scanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) /root/doris/be/src/vec/exec/scan/scanner.cpp:109:17
22:30:08 #17 0x561b8a19f0a6 in doris::vectorized::Scanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) /root/doris/be/src/vec/exec/scan/scanner.cpp:85:16
22:30:08 #18 0x561b8a1ccd0f in doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:173:5
22:30:08 #19 0x561b8a1d6875 in doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()::operator()() const::'lambda'()::operator()() const /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:76:17
22:30:08 #20 0x561b8a1d6875 in doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()::operator()() const /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:75:27
22:30:08 #21 0x561b8a1d6875 in bool std::__invoke_impl<bool, doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()&>(std::__invoke_other, doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()&) /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/invoke.h:63:14
22:30:08 #22 0x561b8a1d6875 in std::enable_if<is_invocable_r_v<bool, doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()&>, bool>::type std::__invoke_r<bool, doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()&>(doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()&) /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/invoke.h:116:9
22:30:08 #23 0x561b8a1d6875 in std::_Function_handler<bool (), doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::'lambda'()>::_M_invoke(std::_Any_data const&) /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:292:9
22:30:08 #24 0x561b8a1d5f07 in std::function<bool ()>::operator()() const /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:593:9
22:30:08 #25 0x561b8a1d5f07 in doris::vectorized::ScannerSplitRunner::process_for(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:407:25
22:30:08 #26 0x561b8a2c56d4 in doris::vectorized::PrioritizedSplitRunner::process() /root/doris/be/src/vec/exec/executor/time_sharing/prioritized_split_runner.cpp:103:35
22:30:08 #27 0x561b8a29045c in doris::vectorized::TimeSharingTaskExecutor::_dispatch_thread() /root/doris/be/src/vec/exec/executor/time_sharing/time_sharing_task_executor.cpp:570:77
22:30:08 #28 0x561b7b9fecb6 in std::function<void ()>::operator()() const /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:593:9
22:30:08 #29 0x561b7b9fecb6 in doris::Thread::supervise_thread(void*) /root/doris/be/src/util/thread.cpp:460:5
22:30:08 #30 0x561b76044d26 in asan_thread_start(void*) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P1/Cluster0/be/lib/doris_be+0x23962d26)
22:30:08 #31 0x7f4aaae68608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
22:30:08 #32 0x7f4aaad7b132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
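The trace shows only the symptom: a `StringRef` comparison inside Parquet min/max statistics filtering read memory that had already been freed. The root cause and the actual fix are not shown on this page. As a rough illustration of the general hazard, the sketch below models a non-owning string view and a statistics holder that copies the min/max bytes, so that a later comparison never depends on a decoding buffer that may have been released. All names here (`StrView`, `OwnedColumnStat`, `decode_stat`, `may_match_ge`) are hypothetical and are not Doris APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning (pointer, size) view. It is valid only while the
// buffer it points into is alive; it does not extend that lifetime.
struct StrView {
    const char* data;
    std::size_t size;

    // Lexicographic byte comparison: <0, 0, >0.
    int compare(const StrView& other) const {
        std::size_t n = std::min(size, other.size);
        int r = (n == 0) ? 0 : std::memcmp(data, other.data, n);
        if (r != 0) return r;
        return size < other.size ? -1 : (size > other.size ? 1 : 0);
    }
};

// Hypothetical column statistics that OWN their min/max bytes.
// Views handed out stay valid for as long as this object lives.
struct OwnedColumnStat {
    std::string min_value;
    std::string max_value;

    StrView min_view() const { return {min_value.data(), min_value.size()}; }
    StrView max_view() const { return {max_value.data(), max_value.size()}; }
};

// Copies the encoded bytes so the result outlives the caller's buffers.
// Returning views into `encoded_min` / `encoded_max` instead would dangle
// as soon as those buffers were destroyed.
OwnedColumnStat decode_stat(const std::string& encoded_min,
                            const std::string& encoded_max) {
    return OwnedColumnStat{encoded_min, encoded_max};
}

// A row group can contain rows with `col >= literal` only if max >= literal.
bool may_match_ge(const OwnedColumnStat& stat, const std::string& literal) {
    StrView lit{literal.data(), literal.size()};
    return stat.max_view().compare(lit) >= 0;
}
```

With this shape, a caller can let the temporary decoding buffers go out of scope before evaluating the predicate, because the comparison reads only from storage owned by the statistics object. The trade-off is an extra copy per statistic, which is usually small next to reading a row group.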
…e#59098)
### What problem does this PR solve?
Introduced by apache#58905
==2037076==ERROR: AddressSanitizer: heap-use-after-free on address 0x7baaae908730 at pc 0x561b769a1fd0 bp 0x7b3caf4ebdf0 sp 0x7b3caf4ebde8 (stack trace identical to the one above)
### What problem does this PR solve?
pick #57397 #58283 #58290 #58282 #58832 #58905 #58960 #59005 #59088 #59098 #59126 #59187 #59581 #59625 #59775
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)