
Conversation

@brandtbucher (Member) commented Nov 6, 2025

Example flamegraph from one of the tests:

[flamegraph screenshot]

📚 Documentation preview 📚: https://cpython-previews--141108.org.readthedocs.build/

… in the sampling profiler

- Introduce a new field in the GC state to store the frame that initiated garbage collection.
- Update RemoteUnwinder to include options for including "<native>" and "<GC>" frames in the stack trace.
- Modify the sampling profiler to accept parameters for controlling the inclusion of native and GC frames.
- Enhance the stack collector to properly format and append these frames during profiling (a rough sketch of the idea follows this list).
- Add tests to verify the correct behavior of the profiler with respect to native and GC frames, including options to exclude them.
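
As a rough illustration only (the option names, tuple layout, and ordering here are assumptions, not the PR's actual API): the collector can splice synthetic pseudo-frames into each sampled stack, which is why they show up later in the stats output as `~:0(<native>)`.

```python
# Illustrative sketch, not the code from this PR: option names, the frame
# tuple layout, and innermost-first ordering are assumptions for the example.

NATIVE_FRAME = ("~", 0, "<native>")  # rendered as ~:0(<native>) in the stats
GC_FRAME = ("~", 0, "<GC>")

def build_sample_stack(python_frames, *, in_native, in_gc,
                       include_native=True, include_gc=True):
    """Return one sample's stack as (filename, lineno, funcname) tuples,
    innermost frame first."""
    stack = []
    if include_gc and in_gc:
        stack.append(GC_FRAME)        # the thread was inside the collector
    if include_native and in_native:
        stack.append(NATIVE_FRAME)    # the thread was executing C code
    stack.extend(python_frames)       # the ordinary Python frames
    return stack
```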
@brandtbucher brandtbucher self-assigned this Nov 6, 2025
@brandtbucher brandtbucher added the type-feature A feature request or enhancement label Nov 6, 2025
@brandtbucher brandtbucher requested a review from 1st1 as a code owner November 6, 2025 03:51
@brandtbucher brandtbucher added sprint interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Nov 6, 2025
@brandtbucher brandtbucher added the stdlib Standard Library Python modules in the Lib/ directory label Nov 6, 2025
@pablogsal (Member)

It seems that there is either some kind of race, or the Windows tests somehow don't trigger the GC:

Profile Stats:
       nsamples   sample%  tottime (ms)    cumul%   cumtime (s)  filename:lineno(function)
     1510/36080      63.0       151.000    1505.8         3.608  tmpiafjtexk:10(slow_fibonacci)
        0/26017       0.0         0.000    1085.9         2.602  ~:0(<native>)
         0/2395       0.0         0.000     100.0         0.240  _sync_coordinator.py:227(main)
         0/2395       0.0         0.000     100.0         0.240  _sync_coordinator.py:244(<module>)
         0/2395       0.0         0.000     100.0         0.240  runpy.py:88(_run_code)
         0/2395       0.0         0.000     100.0         0.240  runpy.py:198(_run_module_as_main)
         1/2374       0.0         0.100      99.1         0.237  _sync_coordinator.py:186(_execute_script)
         0/2370       0.0         0.000      98.9         0.237  tmpiafjtexk:50(<module>)
         0/2287       0.0         0.000      95.5         0.229  tmpiafjtexk:44(main_loop)
        658/658      27.5        65.800      27.5         0.066  tmpiafjtexk:5(slow_fibonacci)
          91/91       3.8         9.100       3.8         0.009  tmpiafjtexk:7(slow_fibonacci)
           0/83       0.0         0.000       3.5         0.008  tmpiafjtexk:43(main_loop)
          39/39       1.6         3.900       1.6         0.004  tmpiafjtexk:16(cpu_intensive_work)
          35/35       1.5         3.500       1.5         0.004  tmpiafjtexk:17(cpu_intensive_work)
          29/29       1.2         2.900       1.2         0.003  tmpiafjtexk:8(slow_fibonacci)

Legend:
  nsamples: Direct/Cumulative samples (direct executing / on call stack)
  sample%: Percentage of total samples this function was directly executing
  tottime: Estimated total time spent directly in this function
  cumul%: Percentage of total samples when this function was on the call stack
  cumtime: Estimated cumulative time (including time in called functions)
  filename:lineno(function): Function location and name

Summary of Interesting Functions:

Functions with Highest Direct/Cumulative Ratio (Hot Spots):
  1.000 direct/cumulative ratio, 3.1% direct samples: tmpiafjtexk:(cpu_intensive_work)
  0.062 direct/cumulative ratio, 95.5% direct samples: tmpiafjtexk:(slow_fibonacci)
  0.000 direct/cumulative ratio, 0.0% direct samples: _sync_coordinator.py:(_execute_script)

Functions with Highest Call Frequency (Indirect Calls):
  34570 indirect calls, 1538.3% total stack presence: tmpiafjtexk:(slow_fibonacci)
  26017 indirect calls, 1085.9% total stack presence: ~:(<native>)
  2395 indirect calls, 100.0% total stack presence: _sync_coordinator.py:(main)

Functions with Highest Call Magnification (Cumulative/Direct):
  2374.0x call magnification, 2373 indirect calls from 1 direct: _sync_coordinator.py:(_execute_script)
  16.1x call magnification, 34570 indirect calls from 2288 direct: tmpiafjtexk:(slow_fibonacci)
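
(As an aside, not part of this PR: the names below are purely illustrative, but one way a test could guarantee the collector actually runs while samples are taken is to churn cyclic garbage and force collections explicitly, rather than waiting on the allocation thresholds.)

```python
# Not from this PR -- just a sketch of a workload that cannot avoid the GC
# while being sampled, to sidestep the "GC never triggered" race above.
import gc

class Node:
    def __init__(self):
        self.ref = self  # reference cycle, only reclaimable by the cyclic GC

def churn(iterations=200_000):
    gc.set_threshold(5)      # make generation-0 collections very frequent
    for _ in range(iterations):
        Node()               # every iteration creates cyclic garbage
        gc.collect(0)        # and forces a collection, removing the timing race
```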

@pablogsal (Member)

Another possibility is that the machines are too slow and we never even get to run under the GC somehow?

@pablogsal (Member)

Maybe slow_fibonacci is too slow? 😆

@pablogsal (Member) commented Nov 8, 2025

I am thinking that <native> is useful, but perhaps it's a bit noisy if you are not hunting for it? Should we default it to False?

Another idea: when there is a C function on the stack, maybe in another PR we can fetch the C function name and use that as the code?

@pablogsal (Member)

I have pushed some new tests and fixes; hopefully this does the trick.

@brandtbucher (Member, Author)

The flakiness of these sorts of tests is... annoying. Quitting for the night.

@pablogsal (Member)

> The flakiness of these sorts of tests is... annoying. Quitting for the night.

I feel you. Unfortunately it's very hard to write correct code here, as it's fundamentally a race condition between the function being profiled and the profiler. Especially on slow machines it's a pain.

I recommend doing one thing and one thing only per test

@pablogsal (Member)

@brandtbucher a suggestion if you keep struggling with CI: just add the GC switch in this PR and figure out native mode later, as that is currently less useful and it's giving us trouble.

@brandtbucher (Member, Author)

I think it's an ASan-specific thing (I can reproduce locally). I'll figure out what's going on later.

@brandtbucher (Member, Author)

I thought I was being clever when I also added support for native frames at the very top of the stack in a recent commit, but that only works on debug builds (where we clear the stack pointer upon resuming a Python frame). 🤦🏼‍♂️

Reverting, this version only finds native frames in the middle of the stack now.
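
Roughly, the "middle of the stack" case can be spotted without a stack pointer, because re-entering Python from C goes through an entry/shim frame owned by the C stack. The sketch below uses invented attribute names and is not the unwinder's real code:

```python
# Sketch only: attribute names are invented.  The idea is that a frame owned
# by the C stack (the shim pushed when Python is re-entered from native code)
# marks a spot where C code sits between two Python frames, so a "<native>"
# pseudo-frame can be emitted there.  The frame at the very top of the stack
# has no such marker, which is why that case needed the (debug-only) cleared
# stack pointer.

def walk_remote_frames(frames, include_native=True):
    stack = []
    for frame in frames:                    # innermost to outermost
        if frame.owned_by_c_stack:          # entry/shim frame
            if include_native:
                stack.append(("~", 0, "<native>"))
            continue                        # shim frames carry no Python code
        stack.append((frame.filename, frame.lineno, frame.funcname))
    return stack
```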

@pablogsal (Member)

> I thought I was being clever when I also added support for native frames at the very top of the stack in a recent commit, but that only works on debug builds (where we clear the stack pointer upon resuming a Python frame). 🤦🏼‍♂️
>
> Reverting, this version only finds native frames in the middle of the stack now.

Haha nice!

I assume this means that you prefer to go with GC + native in this PR then, no?

@brandtbucher (Member, Author)

Yeah, I’m happy with the current state. We can beef up the native features later if they’re worth the performance hit.

@pablogsal (Member) left a comment

LGTM! Amazing work 💪

Left some small comments

@github-project-automation github-project-automation bot moved this from Todo to In Progress in Sprint 2024 Nov 13, 2025
@pablogsal (Member)

Fixed some merge conflicts

@pablogsal pablogsal enabled auto-merge (squash) November 17, 2025 13:32
@pablogsal pablogsal merged commit 336366f into python:main Nov 17, 2025
46 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Sprint 2024 Nov 17, 2025
@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 FreeBSD14 3.x (tier-3) has failed when building commit 336366f.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1232/builds/7240) and take a look at the build logs.
  4. Check if the failure is related to this commit (336366f) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1232/builds/7240

Failed tests:

  • test_profiling

Failed subtests:

  • test_process_pool_executor_pickle - test.test_profiling.test_sampling_profiler.TestProcessPoolExecutorSupport.test_process_pool_executor_pickle

Summary of the results of the build (if available):

==

Traceback logs:
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.opsec-fbsd14/build/Lib/test/test_profiling/test_sampling_profiler.py", line 3354, in test_process_pool_executor_pickle
    self.assertIn("Results: [2, 4, 6]", stdout)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'Results: [2, 4, 6]' not found in ''

@pablogsal (Member)

This is not from this PR; I will take a look.

@pablogsal (Member)

Fixing in #141688 (comment)

StanFromIreland pushed a commit to StanFromIreland/cpython that referenced this pull request Dec 6, 2025
…filer (python#141108)

- Introduce a new field in the GC state to store the frame that initiated garbage collection.
- Update RemoteUnwinder to include options for including "<native>" and "<GC>" frames in the stack trace.
- Modify the sampling profiler to accept parameters for controlling the inclusion of native and GC frames.
- Enhance the stack collector to properly format and append these frames during profiling.
- Add tests to verify the correct behavior of the profiler with respect to native and GC frames, including options to exclude them.

Co-authored-by: Pablo Galindo Salgado <[email protected]>
