Skip to content

Conversation

@vstinner
Copy link
Member

@vstinner vstinner commented Nov 2, 2025

faulthandler now detects if a frame or a code object is invalid or freed.

Add helper functions:

  • _PyCode_SafeAddr2Line()
  • _PyFrame_SafeGetCode()
  • _PyFrame_SafeGetLasti()

_PyMem_IsPtrFreed() now also considers the pointer 0x1 as freed.

faulthandler now detects if a frame or a code object is invalid or
freed.

Add helper functions:

* _PyCode_SafeAddr2Line()
* _PyFrame_SafeGetCode()
* _PyFrame_SafeGetBytecode()
* _PyFrame_SafeGetLasti()

_PyMem_IsPtrFreed() now also considers the pointer 0x1 as freed.
@sergey-miryanov
Copy link
Contributor

I believe this PR minimizes the chances of getting a segfault. However, I have a feeling that it does not eliminate the possibility entirely. Unfortunately, I don't have a better solution.

@sergey-miryanov
Copy link
Contributor

Perhaps we could add a new function that attempts to attach the thread state with a timeout. If successful, we will have a more reliable solution. WDYT?

@vstinner
Copy link
Member Author

vstinner commented Nov 2, 2025

Perhaps we could add a new function that attempts to attach the thread state with a timeout. If successful, we will have a more reliable solution. WDYT?

I don't think that it's worth it.

Another faulthandler feature is faulthandler.enable() and faulthandler.register(): register signal handlers. A Unix signal can occur anytime, so we cannot acquire the GIL or any other lock.

faulthandler is a debugger, it's not designed to be 100% accurate, but to "just work" in any case (GIL or not, in a signal handler or not, etc.) without taking any lock.

@vstinner
Copy link
Member Author

vstinner commented Nov 2, 2025

@sergey-miryanov @picnixz: I pushed a change to address your reviews.

@efimov-mikhail
Copy link
Member

LGTM, but I have very little expertise here, so I'll not push approve button :-).

@vstinner
Copy link
Member Author

vstinner commented Nov 2, 2025

I believe this PR minimizes the chances of getting a segfault. However, I have a feeling that it does not eliminate the possibility entirely. Unfortunately, I don't have a better solution.

faulthandler is implemented with multiple heuristics and best effort. It's ok if it does crash, since it should only be triggered when something already goes wrong.

Only faulthandler.dump_traceback() should be safe since it's called with an attached thread safe.

Copy link
Contributor

@sergey-miryanov sergey-miryanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From point of my little experience, it looks good to me.

@sergey-miryanov
Copy link
Contributor

Thanks!

Comment on lines +37 to +41
if (_PyMem_IsPtrFreed(ptr)) {
return NULL;
}
PyObject *executable = PyStackRef_AsPyObjectBorrow(f->f_executable);
if (_PyObject_IsFreed(executable)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this patch work on release builds? I thought _PyMem_IsPtrFreed and _PyObject_IsFreed only worked if debug allocators are enabled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I didn't try on a release build! _PyMem_IsPtrFreed() and _PyObject_IsFreed() work better with debug allocators, you're correct. But they detect freed (reused) memory in some cases in a release build as well. They are just less reliable in release mode.

I pushed more changes to make the heuristics stricter.

@vstinner
Copy link
Member Author

vstinner commented Nov 3, 2025

@picnixz @ZeroIntensity: Do these changes look reasonable to you? Even if they are not perfect, they allow dumping the traceback in more cases.

I wrote a script to run the reproducer in a loop (see below). I ran the stress test 3 times.

  • Debug build:
    • Without these changes: crash after 1 run, crash after 6 runs, crash after 1 run
    • With these changes: crash after 3970 runs, <stopped after 10,000 success without crash>, <stopped after 10,000 success without crash>.
  • Release build:
    • Without these changes: crash after 6 runs, crash after 59 runs, crash after 36 runs
    • With these changes: crash after 550 runs, crash after 3750 runs, crash after 7845 runs.

Stress test:

import subprocess, sys
cmd=[sys.executable, 'reproducer.py']
success = 0
while True:
    proc = subprocess.run(cmd)
    if proc.returncode != 1:
        print()
        print(f"Exitcode {proc.returncode}")
        break
    success += 1
    print(success)

reproducer.py:

import faulthandler
import sys
faulthandler.dump_traceback_later(10 * 1e-308, exit=True, file=sys.__stderr__)
assert False

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea anyway, yeah. We might want to document that faulthandler.dump_traceback_later can sometimes crash during interpreter finalization.

Comment on lines 27 to 28
// Similar to _PyFrame_GetCode(), but return NULL if the frame is invalid or
// freed. Used by dump_frame() in Python/traceback.c.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth adding a comment that this isn't completely reliable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I added a sentence for that in the 3 added functions.

@vstinner vstinner merged commit a84181c into python:main Nov 4, 2025
46 checks passed
@vstinner vstinner deleted the faulthandler_freed branch November 4, 2025 10:48
@vstinner
Copy link
Member Author

vstinner commented Nov 4, 2025

Merged, thanks for your reviews!

@vstinner vstinner added the needs backport to 3.14 bugs and security fixes label Nov 4, 2025
@miss-islington-app
Copy link

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 4, 2025
…40921)

faulthandler now detects if a frame or a code object is invalid or
freed.

Add helper functions:

* _PyCode_SafeAddr2Line()
* _PyFrame_SafeGetCode()
* _PyFrame_SafeGetLasti()

_PyMem_IsPtrFreed() now detects pointers in [-0xff, 0xff] range
as freed.
(cherry picked from commit a84181c)

Co-authored-by: Victor Stinner <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Nov 4, 2025

GH-140981 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label Nov 4, 2025
@bedevere-app
Copy link

bedevere-app bot commented Nov 4, 2025

GH-140985 is a backport of this pull request to the 3.13 branch.

vstinner added a commit to vstinner/cpython that referenced this pull request Nov 4, 2025
)

faulthandler now detects if a frame or a code object is invalid or
freed.

Add helper functions:

* _PyCode_SafeAddr2Line()
* _PyFrame_SafeGetCode()
* _PyFrame_SafeGetLasti()

_PyMem_IsPtrFreed() now detects pointers in [-0xff, 0xff] range
as freed.

(cherry picked from commit a84181c)
vstinner added a commit that referenced this pull request Nov 4, 2025
#140981)

gh-140815: Fix faulthandler for invalid/freed frame (GH-140921)

faulthandler now detects if a frame or a code object is invalid or
freed.

Add helper functions:

* _PyCode_SafeAddr2Line()
* _PyFrame_SafeGetCode()
* _PyFrame_SafeGetLasti()

_PyMem_IsPtrFreed() now detects pointers in [-0xff, 0xff] range
as freed.
(cherry picked from commit a84181c)

Co-authored-by: Victor Stinner <[email protected]>
vstinner added a commit that referenced this pull request Nov 5, 2025
…140985)

gh-140815: Fix faulthandler for invalid/freed frame (#140921)

faulthandler now detects if a frame or a code object is invalid or
freed.

Add helper functions:

* _PyCode_SafeAddr2Line()
* _PyFrame_SafeGetCode()
* _PyFrame_SafeGetLasti()

_PyMem_IsPtrFreed() now detects pointers in [-0xff, 0xff] range
as freed.

(cherry picked from commit a84181c)
StanFromIreland pushed a commit to StanFromIreland/cpython that referenced this pull request Dec 6, 2025
)

faulthandler now detects if a frame or a code object is invalid or
freed.

Add helper functions:

* _PyCode_SafeAddr2Line()
* _PyFrame_SafeGetCode()
* _PyFrame_SafeGetLasti()

_PyMem_IsPtrFreed() now detects pointers in [-0xff, 0xff] range
as freed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants