[fix](filecache) Fix LRU persist queue thread-safe#53046
Merged
dataroaring merged 3 commits intoapache:masterfrom Jul 14, 2025
Merged
[fix](filecache) Fix LRU persist queue thread-safe#53046dataroaring merged 3 commits intoapache:masterfrom
dataroaring merged 3 commits intoapache:masterfrom
Conversation
1.fix LRU queue crash use after free 2.fix extra LRU queue info when 'need_to_move' flag unset 3.use concurrent queueu to record queueu change info for thread safety Signed-off-by: zhengyu <[email protected]>
Contributor
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
Contributor
Author
|
run buildall |
gavinchou
previously approved these changes
Jul 10, 2025
Contributor
|
PR approved by at least one committer and no changes requested. |
Contributor
|
PR approved by anyone and no changes requested. |
TPC-H: Total hot run time: 33250 ms |
TPC-DS: Total hot run time: 185686 ms |
ClickBench: Total hot run time: 29.53 s |
Signed-off-by: zhengyu <[email protected]>
a8d12d4
Contributor
Author
|
run buildall |
gavinchou
approved these changes
Jul 11, 2025
Contributor
|
PR approved by at least one committer and no changes requested. |
TPC-H: Total hot run time: 33410 ms |
TPC-DS: Total hot run time: 187053 ms |
ClickBench: Total hot run time: 29.82 s |
Contributor
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
Contributor
Author
|
run buildall |
TPC-H: Total hot run time: 33249 ms |
TPC-DS: Total hot run time: 187466 ms |
ClickBench: Total hot run time: 29.58 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
Contributor
Author
|
run external |
koarz
approved these changes
Jul 14, 2025
github-actions bot
pushed a commit
that referenced
this pull request
Jul 14, 2025
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
1.fix LRU queue crash use after free
2.fix extra LRU queue info when 'need_to_move' flag unset
3.use concurrent queueu to record queueu change info for thread safety
```
ERROR: AddressSanitizer: heap-use-after-free on address 0x603005548c40 at pc 0x55f28e8c4785 bp 0x7f603582e1f0 sp 0x7f603582e1e8
READ of size 8 at 0x603005548c40 thread T201
#0 0x55f28e8c4784 in std::_Head_base<0ul, doris::io::CacheLRULog*, false>::_Head_base<doris::io::CacheLRULog*>(doris::io::CacheLRULog*&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:190:17
#1 0x55f28e8c4784 in std::_Tuple_impl<0ul, doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>::_Tuple_impl(std::_Tuple_impl<0ul, doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:292:2
#2 0x55f28e8c4784 in std::tuple<doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>::tuple(std::tuple<doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:1079:17
#3 0x55f28e8c4784 in std::_uniq_ptr_impl<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>::uniq_ptr_impl(std::_uniq_ptr_impl<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:162:9
#4 0x55f28e8c4784 in std::_uniq_ptr_data<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>, true, true>::uniq_ptr_data(std::_uniq_ptr_data<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>, true, true>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:211:7
#5 0x55f28e8c4784 in std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>::unique_ptr(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:327:7
#6 0x55f28e8c4784 in doris::io::LRUQueueRecorder::replay_queue_event(doris::io::FileCacheType) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:40:20
#7 0x55f28e82d620 in doris::io::BlockFileCache::run_background_lru_log_replay() /root/doris/be/src/io/cache/block_file_cache.cpp:2242:24
#8 0x55f2cdc2720f in execute_native_thread_routine /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc+-v3/src/c11/../../../../../libstdc-v3/src/c+11/thread.cc:82:18
#9 0x7f61f1842608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
#10 0x7f61f1aef132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
0x603005548c40 is located 16 bytes inside of 24-byte region [0x603005548c30,0x603005548c48)
freed by thread T201 here:
#0 0x55f28e51680d in operator delete(void*) (/home/work/unlimit_teamcity/TeamCity/Agents/20250708205944agent_172.16.0.48_1/work/60183217f6ee2a9c/output/be/lib/doris_be+0x3975a80d) (BuildId: 8b6ba6101e736655)
#1 0x55f28e8c3ce0 in std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::pop_front() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1198:15
#2 0x55f28e8c3ce0 in doris::io::LRUQueueRecorder::replay_queue_event(doris::io::FileCacheType) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:41:19
#3 0x55f28e82d620 in doris::io::BlockFileCache::run_background_lru_log_replay() /root/doris/be/src/io/cache/block_file_cache.cpp:2242:24
#4 0x55f2cdc2720f in execute_native_thread_routine /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc+-v3/src/c11/../../../../../libstdc-v3/src/c+11/thread.cc:82:18
previously allocated by thread T607 (CumuCompactionT) here:
#0 0x55f28e515fad in operator new(unsigned long) (/home/work/unlimit_teamcity/TeamCity/Agents/20250708205944agent_172.16.0.48_1/work/60183217f6ee2a9c/output/be/lib/doris_be+0x39759fad) (BuildId: 8b6ba6101e736655)
#1 0x55f28e8c660d in __gnu_cxx::new_allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::allocate(unsigned long, void const*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:121:27
#2 0x55f28e8c660d in std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::allocate(unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:173:32
#3 0x55f28e8c660d in std::allocator_traits<std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>>::allocate(std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>&, unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:460:20
#4 0x55f28e8c660d in std::__cxx11::_List_base<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_get_node() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:442:16
#5 0x55f28e8c660d in std::List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>* std::_cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_create_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:634:21
#6 0x55f28e8c660d in void std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_insert<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>(std::_List_iterator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>, std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1911:18
#7 0x55f28e8c3522 in std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::push_back(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1217:15
#8 0x55f28e8c3522 in doris::io::LRUQueueRecorder::record_queue_event(doris::io::FileCacheType, doris::io::CacheLRULogType, doris::io::UInt128Wrapper, unsigned long, unsigned long) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:29:15
#9 0x55f28e82f09b in doris::io::BlockFileCache::use_cell(doris::io::BlockFileCache::FileBlockCell const&, std::__cxx11::list<std::shared_ptr<doris::io::FileBlock>, std::allocator<std::shared_ptr<doris::io::FileBlock>>>*, bool, std::lock_guard<std::mutex>&) /root/doris/be/src/io/cache/block_file_cache.cpp:380:20
#10 0x55f28e833d1b in doris::io::BlockFileCache::get_impl[abi:cxx11](doris::io::UInt128Wrapper const&, doris::io::CacheContext const&, doris::io::FileBlock::Range const&, std::lock_guard<std::mutex>&) /root/doris/be/src/io/cache/block_file_cache.cpp:572:13
#11 0x55f28e83b4ef in doris::io::BlockFileCache::get_or_set(doris::io::UInt128Wrapper const&, unsigned long, unsigned long, doris::io::CacheContext&) /root/doris/be/src/io/cache/block_file_cache.cpp:762:27
#12 0x55f28e7ffcee in doris::io::CachedRemoteFileReader::read_at_impl(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /root/doris/be/src/io/cache/cached_remote_file_reader.cpp:191:21
#13 0x55f28e7f8017 in doris::io::FileReader::read_at(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /root/doris/be/src/io/fs/file_reader.cpp:34:17
```
### Release note
None
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
- [ ] Regression test
- [ ] Unit Test
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
- [x] Previous test can cover this change.
- [ ] No code files have been changed.
- [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
- [x] No.
- [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
- [x] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
---------
Signed-off-by: zhengyu <[email protected]>
eldenmoon
pushed a commit
that referenced
this pull request
Jul 14, 2025
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
1.fix LRU queue crash use after free
2.fix extra LRU queue info when 'need_to_move' flag unset
3.use concurrent queueu to record queueu change info for thread safety
```
ERROR: AddressSanitizer: heap-use-after-free on address 0x603005548c40 at pc 0x55f28e8c4785 bp 0x7f603582e1f0 sp 0x7f603582e1e8
READ of size 8 at 0x603005548c40 thread T201
#0 0x55f28e8c4784 in std::_Head_base<0ul, doris::io::CacheLRULog*, false>::_Head_base<doris::io::CacheLRULog*>(doris::io::CacheLRULog*&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:190:17
#1 0x55f28e8c4784 in std::_Tuple_impl<0ul, doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>::_Tuple_impl(std::_Tuple_impl<0ul, doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:292:2
#2 0x55f28e8c4784 in std::tuple<doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>::tuple(std::tuple<doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:1079:17
#3 0x55f28e8c4784 in std::_uniq_ptr_impl<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>::uniq_ptr_impl(std::_uniq_ptr_impl<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:162:9
#4 0x55f28e8c4784 in std::_uniq_ptr_data<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>, true, true>::uniq_ptr_data(std::_uniq_ptr_data<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>, true, true>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:211:7
#5 0x55f28e8c4784 in std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>::unique_ptr(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:327:7
#6 0x55f28e8c4784 in doris::io::LRUQueueRecorder::replay_queue_event(doris::io::FileCacheType) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:40:20
#7 0x55f28e82d620 in doris::io::BlockFileCache::run_background_lru_log_replay() /root/doris/be/src/io/cache/block_file_cache.cpp:2242:24
#8 0x55f2cdc2720f in execute_native_thread_routine /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc+-v3/src/c11/../../../../../libstdc-v3/src/c+11/thread.cc:82:18
#9 0x7f61f1842608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
#10 0x7f61f1aef132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
0x603005548c40 is located 16 bytes inside of 24-byte region [0x603005548c30,0x603005548c48)
freed by thread T201 here:
#0 0x55f28e51680d in operator delete(void*) (/home/work/unlimit_teamcity/TeamCity/Agents/20250708205944agent_172.16.0.48_1/work/60183217f6ee2a9c/output/be/lib/doris_be+0x3975a80d) (BuildId: 8b6ba6101e736655)
#1 0x55f28e8c3ce0 in std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::pop_front() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1198:15
#2 0x55f28e8c3ce0 in doris::io::LRUQueueRecorder::replay_queue_event(doris::io::FileCacheType) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:41:19
#3 0x55f28e82d620 in doris::io::BlockFileCache::run_background_lru_log_replay() /root/doris/be/src/io/cache/block_file_cache.cpp:2242:24
#4 0x55f2cdc2720f in execute_native_thread_routine /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc+-v3/src/c11/../../../../../libstdc-v3/src/c+11/thread.cc:82:18
previously allocated by thread T607 (CumuCompactionT) here:
#0 0x55f28e515fad in operator new(unsigned long) (/home/work/unlimit_teamcity/TeamCity/Agents/20250708205944agent_172.16.0.48_1/work/60183217f6ee2a9c/output/be/lib/doris_be+0x39759fad) (BuildId: 8b6ba6101e736655)
#1 0x55f28e8c660d in __gnu_cxx::new_allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::allocate(unsigned long, void const*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:121:27
#2 0x55f28e8c660d in std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::allocate(unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:173:32
#3 0x55f28e8c660d in std::allocator_traits<std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>>::allocate(std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>&, unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:460:20
#4 0x55f28e8c660d in std::__cxx11::_List_base<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_get_node() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:442:16
#5 0x55f28e8c660d in std::List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>* std::_cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_create_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:634:21
#6 0x55f28e8c660d in void std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_insert<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>(std::_List_iterator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>, std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1911:18
#7 0x55f28e8c3522 in std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::push_back(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1217:15
#8 0x55f28e8c3522 in doris::io::LRUQueueRecorder::record_queue_event(doris::io::FileCacheType, doris::io::CacheLRULogType, doris::io::UInt128Wrapper, unsigned long, unsigned long) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:29:15
#9 0x55f28e82f09b in doris::io::BlockFileCache::use_cell(doris::io::BlockFileCache::FileBlockCell const&, std::__cxx11::list<std::shared_ptr<doris::io::FileBlock>, std::allocator<std::shared_ptr<doris::io::FileBlock>>>*, bool, std::lock_guard<std::mutex>&) /root/doris/be/src/io/cache/block_file_cache.cpp:380:20
#10 0x55f28e833d1b in doris::io::BlockFileCache::get_impl[abi:cxx11](doris::io::UInt128Wrapper const&, doris::io::CacheContext const&, doris::io::FileBlock::Range const&, std::lock_guard<std::mutex>&) /root/doris/be/src/io/cache/block_file_cache.cpp:572:13
#11 0x55f28e83b4ef in doris::io::BlockFileCache::get_or_set(doris::io::UInt128Wrapper const&, unsigned long, unsigned long, doris::io::CacheContext&) /root/doris/be/src/io/cache/block_file_cache.cpp:762:27
#12 0x55f28e7ffcee in doris::io::CachedRemoteFileReader::read_at_impl(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /root/doris/be/src/io/cache/cached_remote_file_reader.cpp:191:21
#13 0x55f28e7f8017 in doris::io::FileReader::read_at(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /root/doris/be/src/io/fs/file_reader.cpp:34:17
```
### Release note
None
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
- [ ] Regression test
- [ ] Unit Test
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
- [x] Previous test can cover this change.
- [ ] No code files have been changed.
- [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
- [x] No.
- [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
- [x] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
---------
Signed-off-by: zhengyu <[email protected]>
morrySnow
pushed a commit
that referenced
this pull request
Jul 15, 2025
…#53220) Cherry-picked from #53046 Signed-off-by: zhengyu <[email protected]> Co-authored-by: zhengyu <[email protected]>
Contributor
Author
|
this pr cannot be picked to branch-3.0 because its pre-PR #52807 is not going to merge some how |
freemandealer
added a commit
to freemandealer/doris
that referenced
this pull request
Jul 22, 2025
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
1.fix LRU queue crash use after free
2.fix extra LRU queue info when 'need_to_move' flag unset
3.use concurrent queueu to record queueu change info for thread safety
```
ERROR: AddressSanitizer: heap-use-after-free on address 0x603005548c40 at pc 0x55f28e8c4785 bp 0x7f603582e1f0 sp 0x7f603582e1e8
READ of size 8 at 0x603005548c40 thread T201
#0 0x55f28e8c4784 in std::_Head_base<0ul, doris::io::CacheLRULog*, false>::_Head_base<doris::io::CacheLRULog*>(doris::io::CacheLRULog*&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:190:17
#1 0x55f28e8c4784 in std::_Tuple_impl<0ul, doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>::_Tuple_impl(std::_Tuple_impl<0ul, doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:292:2
#2 0x55f28e8c4784 in std::tuple<doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>::tuple(std::tuple<doris::io::CacheLRULog*, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/tuple:1079:17
apache#3 0x55f28e8c4784 in std::_uniq_ptr_impl<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>::uniq_ptr_impl(std::_uniq_ptr_impl<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:162:9
apache#4 0x55f28e8c4784 in std::_uniq_ptr_data<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>, true, true>::uniq_ptr_data(std::_uniq_ptr_data<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>, true, true>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:211:7
apache#5 0x55f28e8c4784 in std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>::unique_ptr(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:327:7
apache#6 0x55f28e8c4784 in doris::io::LRUQueueRecorder::replay_queue_event(doris::io::FileCacheType) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:40:20
apache#7 0x55f28e82d620 in doris::io::BlockFileCache::run_background_lru_log_replay() /root/doris/be/src/io/cache/block_file_cache.cpp:2242:24
apache#8 0x55f2cdc2720f in execute_native_thread_routine /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc+-v3/src/c11/../../../../../libstdc-v3/src/c+11/thread.cc:82:18
apache#9 0x7f61f1842608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
apache#10 0x7f61f1aef132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
0x603005548c40 is located 16 bytes inside of 24-byte region [0x603005548c30,0x603005548c48)
freed by thread T201 here:
#0 0x55f28e51680d in operator delete(void*) (/home/work/unlimit_teamcity/TeamCity/Agents/20250708205944agent_172.16.0.48_1/work/60183217f6ee2a9c/output/be/lib/doris_be+0x3975a80d) (BuildId: 8b6ba6101e736655)
#1 0x55f28e8c3ce0 in std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::pop_front() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1198:15
#2 0x55f28e8c3ce0 in doris::io::LRUQueueRecorder::replay_queue_event(doris::io::FileCacheType) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:41:19
apache#3 0x55f28e82d620 in doris::io::BlockFileCache::run_background_lru_log_replay() /root/doris/be/src/io/cache/block_file_cache.cpp:2242:24
apache#4 0x55f2cdc2720f in execute_native_thread_routine /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc+-v3/src/c11/../../../../../libstdc-v3/src/c+11/thread.cc:82:18
previously allocated by thread T607 (CumuCompactionT) here:
#0 0x55f28e515fad in operator new(unsigned long) (/home/work/unlimit_teamcity/TeamCity/Agents/20250708205944agent_172.16.0.48_1/work/60183217f6ee2a9c/output/be/lib/doris_be+0x39759fad) (BuildId: 8b6ba6101e736655)
#1 0x55f28e8c660d in __gnu_cxx::new_allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::allocate(unsigned long, void const*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:121:27
#2 0x55f28e8c660d in std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::allocate(unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:173:32
apache#3 0x55f28e8c660d in std::allocator_traits<std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>>::allocate(std::allocator<std::_List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>&, unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:460:20
apache#4 0x55f28e8c660d in std::__cxx11::_List_base<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_get_node() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:442:16
apache#5 0x55f28e8c660d in std::List_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>* std::_cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_create_node<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:634:21
apache#6 0x55f28e8c660d in void std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::_M_insert<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>(std::_List_iterator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>, std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1911:18
apache#7 0x55f28e8c3522 in std::__cxx11::list<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>, std::allocator<std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>>>::push_back(std::unique_ptr<doris::io::CacheLRULog, std::default_delete<doris::io::CacheLRULog>>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_list.h:1217:15
apache#8 0x55f28e8c3522 in doris::io::LRUQueueRecorder::record_queue_event(doris::io::FileCacheType, doris::io::CacheLRULogType, doris::io::UInt128Wrapper, unsigned long, unsigned long) /root/doris/be/src/io/cache/lru_queue_recorder.cpp:29:15
apache#9 0x55f28e82f09b in doris::io::BlockFileCache::use_cell(doris::io::BlockFileCache::FileBlockCell const&, std::__cxx11::list<std::shared_ptr<doris::io::FileBlock>, std::allocator<std::shared_ptr<doris::io::FileBlock>>>*, bool, std::lock_guard<std::mutex>&) /root/doris/be/src/io/cache/block_file_cache.cpp:380:20
apache#10 0x55f28e833d1b in doris::io::BlockFileCache::get_impl[abi:cxx11](doris::io::UInt128Wrapper const&, doris::io::CacheContext const&, doris::io::FileBlock::Range const&, std::lock_guard<std::mutex>&) /root/doris/be/src/io/cache/block_file_cache.cpp:572:13
apache#11 0x55f28e83b4ef in doris::io::BlockFileCache::get_or_set(doris::io::UInt128Wrapper const&, unsigned long, unsigned long, doris::io::CacheContext&) /root/doris/be/src/io/cache/block_file_cache.cpp:762:27
apache#12 0x55f28e7ffcee in doris::io::CachedRemoteFileReader::read_at_impl(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /root/doris/be/src/io/cache/cached_remote_file_reader.cpp:191:21
apache#13 0x55f28e7f8017 in doris::io::FileReader::read_at(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /root/doris/be/src/io/fs/file_reader.cpp:34:17
```
### Release note
None
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
- [ ] Regression test
- [ ] Unit Test
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
- [x] Previous test can cover this change.
- [ ] No code files have been changed.
- [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
- [x] No.
- [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
- [x] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
---------
Signed-off-by: zhengyu <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
1.fix LRU queue crash use after free
2.fix extra LRU queue info when 'need_to_move' flag unset
3.use concurrent queueu to record queueu change info for thread safety
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)