Closed
Description
Search before asking
- I had searched in the issues and found no similar issues.
Version
branch 3, branch 3.1 and master
What's Wrong?
In the Paimon table scan case, if the BE JVM does not have enough memory, errors such as the following occur:
- OOM
- [warning][gc,alloc] Thread-272340: Retried waiting for GCLocker too often allocating 172082 words
The jni_connector then fails partway through its logic, and the BE may crash. The stack trace is shown below:
*** Query id: 7ac049e2ed6644aa-8ae4d8cbab5119e9 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1755692288 (unix time) try "date -d @1755692288" if you are using GNU date ***
*** Current BE git commitID: 39f9074cec ***
*** SIGSEGV address not mapped to object (@0x238) received by PID 267221 (TID 890774 OR 0x7a081328b700) from PID 568; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_release/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /home/doris/dorisdb_pub/java17/lib/server/libjvm.so
2# JVM_handle_linux_signal in /home/doris/dorisdb_pub/java17/lib/server/libjvm.so
3# 0x00007F0AD94D6770 in /lib64/libc.so.6
4# OopStorage::Block::release_entries(unsigned long, OopStorage*) in /home/doris/dorisdb_pub/java17/lib/server/libjvm.so
5# OopStorage::release(oopDesc* const*) in /home/doris/dorisdb_pub/java17/lib/server/libjvm.so
6# jni_DeleteGlobalRef in /home/doris/dorisdb_pub/java17/lib/server/libjvm.so
7# doris::vectorized::JniConnector::close() at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/jni_connector.cpp:182
8# doris::vectorized::JniReader::close() in /home/doris/dorisdb_pub/be/lib/doris_be
9# doris::vectorized::VFileScanner::close(doris::RuntimeState*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vfile_scanner.cpp:1175
10# doris::vectorized::ScannerDelegate::~ScannerDelegate() at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vscan_node.h:36
11# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:323
12# std::_Function_handler<void (), doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_1::operator()() const::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
13# doris::ThreadPool::dispatch_thread() in /home/doris/dorisdb_pub/be/lib/doris_be
Through code analysis, we can determine that the cause of the BE crash was repeated calls to the jni_connector close method, leading to a double free of the related JNI memory.
This is an intermittent issue. Failures in massively parallel scan tasks significantly raise the probability of hitting it.
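To illustrate the double-release pattern described above, here is a minimal sketch of an idempotent close guard. It is not the actual Doris code or the actual fix: `FakeJvm`, `GuardedConnector`, and `releases_after_double_close` are hypothetical names, and `FakeJvm` only counts releases in place of the real `DeleteGlobalRef` call so the example can run without a JVM.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the JVM side; it only counts release calls.
// In the real crash, the second release reached jni_DeleteGlobalRef.
struct FakeJvm {
    int next_ref = 1;
    int delete_calls = 0;

    int new_global_ref() { return next_ref++; }
    void delete_global_ref(int /*ref*/) { ++delete_calls; }
};

// Hypothetical connector whose close() is safe to call more than once.
class GuardedConnector {
public:
    explicit GuardedConnector(FakeJvm* jvm)
            : _jvm(jvm), _ref(jvm->new_global_ref()) {}

    // The destructor also calls close(), as an owning scanner might.
    ~GuardedConnector() { close(); }

    void close() {
        // exchange() returns the previous value, so only the first caller
        // proceeds, even if two threads call close() concurrently.
        if (_closed.exchange(true)) {
            return;
        }
        if (_ref != kNullRef) {
            _jvm->delete_global_ref(_ref);
            _ref = kNullRef;  // clear the handle so it cannot be reused
        }
    }

private:
    static constexpr int kNullRef = 0;

    FakeJvm* _jvm;
    int _ref;
    std::atomic<bool> _closed{false};
};

// Calls close() twice explicitly and once more through the destructor,
// then reports how many times the reference was actually released.
int releases_after_double_close() {
    FakeJvm jvm;
    {
        GuardedConnector connector(&jvm);
        connector.close();
        connector.close();
    }
    return jvm.delete_calls;
}
```

With the guard in place, the three `close()` calls in `releases_after_double_close()` release the reference exactly once. Without the `_closed` check and the handle reset, each call would release the same reference again, which is the pattern the stack trace points to.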
What You Expected?
The BE should not crash; the SQL query should just return an error.
How to Reproduce?
- Set a very small BE JVM memory size.
- Create a Paimon table with a large number of small files.
- Run a SELECT on this table.
The BE may crash.
Anything Else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct