[fix](deadlock) avoid deadlock on tabletInvertedIndex #54197
dataroaring merged 1 commit into apache:master
private TransactionState unprotectedGetTransactionState(Long transactionId) {
Review comment: unprotectedGetTransactionState is called from many places, and the callers all take the read/write lock. Please check whether those locks should be removed as well.
When does "CommitThread: hold DatabaseTransactionMgr.writeLock and try to acquire TabletInvertedIndex.readLock" happen? I only found that checkCommitStatus tries to acquire the TabletInvertedIndex lock, but checkCommitStatus does not seem to be called while DatabaseTransactionMgr.writeLock is held.
The StampedLock used by TabletInvertedIndex applies an anti-barging rule, so it behaves like a fair lock once a writer is queued.
Comment from the StampedLock source:
In particular, we use the phase-fair anti-barging rule: If an incoming reader arrives while read lock is held but there is a queued writer, this incoming reader is queued.
A deadlock happens as follows:
ReportThread: holds TabletInvertedIndex.readLock and tries to acquire DatabaseTransactionMgr.readLock.
CommitThread: holds DatabaseTransactionMgr.writeLock and tries to acquire TabletInvertedIndex.readLock.
TabletOperatorThread: tries to acquire TabletInvertedIndex.writeLock.
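The three-thread scenario above can be read as a wait-for graph. The sketch below is only an illustration, not code from Doris: the class and method names are hypothetical, and it assumes that TabletOperatorThread's write request is queued before CommitThread's read request, so that the anti-barging rule puts CommitThread behind the writer. Under that assumption each thread waits for the next one and the graph contains a cycle.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the scenario as a wait-for graph (not Doris code).
public class WaitForGraphDemo {

    // Each thread waits for at most one other thread in this scenario.
    static Map<String, String> deadlockEdges() {
        Map<String, String> waitsFor = new HashMap<>();
        // Needs DatabaseTransactionMgr.readLock, whose write lock CommitThread holds.
        waitsFor.put("ReportThread", "CommitThread");
        // Its read request queues behind the pending writer (assumed arrival order).
        waitsFor.put("CommitThread", "TabletOperatorThread");
        // Needs TabletInvertedIndex.writeLock while ReportThread holds the read lock.
        waitsFor.put("TabletOperatorThread", "ReportThread");
        return waitsFor;
    }

    // Follows the single outgoing edge of each thread until it repeats or ends.
    static boolean hasCycle(Map<String, String> waitsFor, String start) {
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null) {
            if (!seen.add(current)) {
                return true; // visited twice: the threads wait on each other
            }
            current = waitsFor.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("cycle: " + hasCycle(deadlockEdges(), "ReportThread"));
    }
}
```

Removing any one edge, for example by not taking DatabaseTransactionMgr.readLock while TabletInvertedIndex.readLock is held, breaks the cycle in this model.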
fe/fe-core/src/main/java/org/apache/doris/transaction/DatabaseTransactionMgr.java
StampLock used by tabletInvertedIndex is a fair lock.
Comments from StampLock:
In particular, we use the phase-fair anti-barging rule: If an incoming reader arrives while read lock is held but there is a queued writer, this incoming reader is queued.
A dead lock happens like below:
ReportThread: hold TabletInvertedIndex.readLock and try to acquire DatabaseTransactionMgr.readLock.
CommitThread: try to acquire TabletInvertedIndex.readLock.
TabletOperatorThread: try to acquire TabletInvertedIndex.writeLock.
CommitThread: hold DatabaseTransactionMgr.writeLock and waiting for editlog's lock.
The DatabaseTransactionMgr.writeLock is held when writing editlog, so sometimes it is time consuming and results in time consuming tablet report.
Co-authored-by: Yongqiang YANG <yangyogqiang@selectdb.com>
The StampedLock used by TabletInvertedIndex applies an anti-barging rule, so it behaves like a fair lock once a writer is queued.
Comment from the StampedLock source:
In particular, we use the phase-fair anti-barging rule: If an incoming reader arrives while read lock is held but there is a queued writer, this incoming reader is queued.
A deadlock happens as follows:
ReportThread: holds TabletInvertedIndex.readLock and tries to acquire DatabaseTransactionMgr.readLock.
CommitThread: tries to acquire TabletInvertedIndex.readLock.
TabletOperatorThread: tries to acquire TabletInvertedIndex.writeLock.
CommitThread: holds DatabaseTransactionMgr.writeLock and is waiting for the editlog's lock.
DatabaseTransactionMgr.writeLock is held while the editlog is written, so it can be held for a long time, which in turn makes tablet reports slow.
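The anti-barging rule quoted above can be observed with a small standalone program. This is a minimal sketch using only java.util.concurrent.locks.StampedLock; the class name, helper method, and timeouts are made up for illustration and none of it comes from Doris. The main thread plays the role of a reader that already holds the read lock, a second thread queues for the write lock, and a third thread then calls readLock(). The sketch detects blocking by polling Thread.getState(), which is a heuristic rather than a guarantee.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical demo: a new reader queues behind a waiting writer on a StampedLock.
public class AntiBargingDemo {

    // Polls until the thread is parked (WAITING) or the timeout elapses.
    static boolean waitUntilParked(Thread t, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (t.getState() == Thread.State.WAITING) {
                return true;
            }
            Thread.sleep(5);
        }
        return false;
    }

    static boolean readerQueuesBehindWriter() throws InterruptedException {
        StampedLock lock = new StampedLock();
        long held = lock.readLock(); // an existing reader holds the read lock

        Thread writer = new Thread(() -> {
            long stamp = lock.writeLock(); // must wait for the existing reader
            lock.unlockWrite(stamp);
        });
        writer.start();
        boolean writerQueued = waitUntilParked(writer, 5000);

        Thread reader = new Thread(() -> {
            long stamp = lock.readLock(); // arrives while a writer is queued
            lock.unlockRead(stamp);
        });
        reader.start();
        boolean readerQueued = waitUntilParked(reader, 5000);

        // Releasing the original read lock lets the writer, then the reader, proceed.
        lock.unlockRead(held);
        writer.join(5000);
        reader.join(5000);
        return writerQueued && readerQueued && !writer.isAlive() && !reader.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader queued behind writer: " + readerQueuesBehindWriter());
    }
}
```

Note that the untimed and timed try methods of StampedLock are documented as best-effort and need not follow this queuing policy, which is why the sketch uses the blocking readLock() call.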
Release note
None