gh-98896: Fix parsing issue in resource_tracker to allow shared memory names containing colons #138473

rani-pinchuk · 2025-09-03T19:41:40Z

Shared memory names containing colons were not parsed correctly, because the resource_tracker code assumed that names never contain a colon.

@encukou

Issue: Resource tracker fails to track filenames with colons on Linux #98896

encukou

Hello! Sorry for the delay in reviewing.
The fix looks good for colons, but a comment on the issue also mentioned newlines in filenames. Do you want to tackle that, or leave it to another PR?

Lib/multiprocessing/resource_tracker.py

rani-pinchuk · 2025-11-05T21:23:10Z

Hi Petr, thanks for your email (and also the one you sent later); it is very kind of you. I have now tried to address the newlines as well, by encoding and decoding the shared_memory name. This is a different approach than before. See the updated PR. Regards, Rani

On Tue, Nov 4, 2025 at 2:56 PM Petr Viktorin wrote:

> Hello! Sorry for the delay in reviewing.
> The fix looks good for colons, but a comment on the issue also mentioned *newlines* in filenames. Do you want to tackle that, or leave it to another PR?
>
> In Lib/multiprocessing/resource_tracker.py:
>
> +                    parts = line.strip().decode('ascii').split(':')
> +                    if len(parts) < 3:
> +                        raise ValueError("malformed resource_tracker message: %r" % (parts,))
> +                    cmd = parts[0]
> +                    rtype = parts[-1]
> +                    name = ':'.join(parts[1:-1])
>
> How about this? ⬇️
>
> Suggested change
>
> -                    parts = line.strip().decode('ascii').split(':')
> -                    if len(parts) < 3:
> -                        raise ValueError("malformed resource_tracker message: %r" % (parts,))
> -                    cmd = parts[0]
> -                    rtype = parts[-1]
> -                    name = ':'.join(parts[1:-1])
> +                    cmd, *name_parts, rtype = line.strip().decode('ascii').split(':')
> +                    name = ':'.join(name_parts)
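For readers following along, here is a minimal sketch of the star-unpacking idea from the suggestion above. It assumes the `CMD:NAME:RTYPE` line format implied by the original `split(':')` call; `parse_message` and the sample names are hypothetical and are not taken from the PR.

```python
def parse_message(line: bytes) -> tuple[str, str, str]:
    """Split b'CMD:NAME:RTYPE\\n', where NAME itself may contain colons."""
    # The first field is the command, the last is the resource type,
    # and everything in between is rejoined into the name.
    cmd, *name_parts, rtype = line.strip().decode('ascii').split(':')
    return cmd, ':'.join(name_parts), rtype


# The original three-way unpacking fails as soon as the name has a colon.
try:
    cmd, name, rtype = b'REGISTER:/psm:a:b:shared_memory\n'.strip().decode('ascii').split(':')
except ValueError as exc:
    print('three-way unpacking failed:', exc)

print(parse_message(b'REGISTER:/psm:a:b:shared_memory\n'))
```

This only handles colons. A newline inside the name would still split the message in a line-oriented stream, which is the concern raised in the review.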

encukou

Thank you!
This approach feels ad-hoc; did you consider a standard encoding like JSON lines or pickle?
(Python's JSON encoder won't output newlines unless you ask for indentation; pickle.load reads one object from a file and stops.)

But I can merge this if you don't want to explore those options.
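The two properties mentioned in this comment can be checked with a short sketch. The dictionary keys and the sample name are illustrative assumptions, not the message format used by the PR.

```python
import io
import json
import pickle

name = '/psm:with\nnewline'

# json.dumps escapes the newline as backslash + "n", so the encoded
# message contains no raw newline and fits on one line of a stream.
encoded = json.dumps({'cmd': 'REGISTER', 'name': name})
has_raw_newline = '\n' in encoded
round_tripped = json.loads(encoded)['name']

# pickle.load reads exactly one object and leaves the stream positioned
# at the start of the next one.
stream = io.BytesIO(pickle.dumps('first') + pickle.dumps('second'))
first = pickle.load(stream)
second = pickle.load(stream)

print(encoded, has_raw_newline, first, second)
```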

encukou · 2025-11-06T08:48:40Z

Lib/multiprocessing/resource_tracker.py

-                    cmd, name, rtype = line.strip().decode('ascii').split(':')
+                    cmd, enc_name, rtype = line.rstrip(b'\n').decode('ascii').split(':', 2)
+                    if rtype == "shared_memory":
+                        name = base64.urlsafe_b64decode(enc_name.encode('ascii')).decode('utf-8', 'surrogateescape')


Decoding is OK with strings.

Suggested change

-                        name = base64.urlsafe_b64decode(enc_name.encode('ascii')).decode('utf-8', 'surrogateescape')
+                        name = base64.urlsafe_b64decode(enc_name).decode('utf-8', 'surrogateescape')
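A small round-trip sketch of the point being made: `base64.urlsafe_b64decode` accepts an ASCII `str` directly. The helper names `encode_name` and `decode_name` are hypothetical, and the encoding side is an assumption, since only the decoding side appears in the diff.

```python
import base64


def encode_name(name: str) -> str:
    # surrogateescape maps lone surrogates back to the raw bytes they stand for.
    raw = name.encode('utf-8', 'surrogateescape')
    return base64.urlsafe_b64encode(raw).decode('ascii')


def decode_name(enc_name: str) -> str:
    # No .encode('ascii') needed: an ASCII str is accepted as input.
    return base64.urlsafe_b64decode(enc_name).decode('utf-8', 'surrogateescape')


name = '/psm:odd\nname\udc80'  # colon, newline, and one undecodable byte
enc = encode_name(name)
print(enc, decode_name(enc) == name)
```

The URL-safe alphabet contains neither `:` nor a newline, so the encoded name cannot collide with the field or line delimiters.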

rani-pinchuk · 2025-11-08

Thanks for your comment. I like your suggestion to consider using JSON to encode the message.

Also, the original implementation checked that the message is not longer than 512 bytes, since writes of at most PIPE_BUF bytes (which POSIX guarantees to be at least 512) are atomic. But there was no check on the name length.

Now I check that the name is at most 255 bytes long, which is the value of NAME_MAX on Linux (including the leading slash that POSIX requires in shared memory and semaphore names).

I still encode the name using Base64, because json.dumps(..., ensure_ascii=True) would otherwise expand each non-ASCII byte into a 6-character escape like \uDC80. Using Base64 ensures that a 255-byte name becomes at most 340 bytes long, so the total JSON message always remains well below 512 bytes.

As a result, the previous runtime check for the message length is now replaced by an assert.

What do you think?
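The size argument above can be checked with a worst-case sketch. The key names come from the review diff; the exact message layout, the `UNREGISTER` command, and the constant names are assumptions made for illustration.

```python
import base64
import json

NAME_MAX = 255      # name length limit described above
ATOMIC_LIMIT = 512  # message size bound described above

# Worst case: 255 bytes, none of which is valid UTF-8.
raw = b'\xff' * NAME_MAX
name = raw.decode('utf-8', 'surrogateescape')

# Base64 turns every 3 input bytes into 4 output characters.
b64 = base64.urlsafe_b64encode(raw).decode('ascii')

# Plain JSON (ensure_ascii=True is the default) spends 6 characters per byte.
escaped = json.dumps(name)

b64_message = json.dumps(
    {'cmd': 'UNREGISTER', 'rtype': 'shared_memory', 'base64_name': b64}
) + '\n'

print(len(b64), len(escaped), len(b64_message))
```

With plain JSON escaping the name alone exceeds the bound, while the Base64 form leaves the whole message below it.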

encukou

Hm, good catch about still needing base64 for atomicity.
Hopefully, humans won't need to inspect the stream.

encukou · 2025-11-10T12:58:21Z

Lib/multiprocessing/resource_tracker.py

+                    line = raw.rstrip(b'\n')
+                    try:
+                        obj = json.loads(line.decode('ascii'))


Stripping and decoding shouldn't be needed; json.loads can handle bytes and trailing newlines.
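A quick sketch of this point: `json.loads` accepts `bytes` and ignores trailing whitespace, so both forms give the same result. The sample message, including the `noop` resource type, is an illustrative assumption.

```python
import json

raw = b'{"cmd": "PROBE", "rtype": "noop"}\n'

# Explicit stripping and decoding, as in the reviewed diff.
explicit = json.loads(raw.rstrip(b'\n').decode('ascii'))

# Passing the raw bytes straight through gives the same object.
direct = json.loads(raw)

print(direct, explicit == direct)
```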

encukou · 2025-11-10T12:59:29Z

Lib/multiprocessing/resource_tracker.py

+                    if not isinstance(cmd, str) or not isinstance(rtype, str) or not isinstance(b64, str):
+                        raise ValueError("malformed resource_tracker fields: %r" % (obj,))
+
+                    enc = b64.encode('ascii')


urlsafe_b64decode can handle strings as well as bytes, so the encode step isn't needed.

encukou · 2025-11-10T13:04:24Z

Lib/multiprocessing/resource_tracker.py

+                    cmd = obj.get("cmd")
+                    rtype = obj.get("rtype")
+                    b64  = obj.get("base64_name")


You don't need get with constant arguments:

Suggested change

-                    cmd = obj.get("cmd")
-                    rtype = obj.get("rtype")
-                    b64  = obj.get("base64_name")
+                    cmd = obj["cmd"]
+                    rtype = obj["rtype"]
+                    b64 = obj["base64_name"]

Alternately, make some fields optional -- then the PROBE message can be shorter:

Suggested change

-                    cmd = obj.get("cmd")
-                    rtype = obj.get("rtype")
-                    b64  = obj.get("base64_name")
+                    cmd = obj["cmd"]
+                    rtype = obj["rtype"]
+                    b64 = obj.get("base64_name", "")
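Putting the review comments together, a decoder along these lines would treat `cmd` and `rtype` as required and the name as optional. This is a sketch under stated assumptions, not the merged code: `decode_message` is a hypothetical helper and the sample messages are invented for illustration.

```python
import base64
import json


def decode_message(raw: bytes) -> tuple[str, str, str]:
    obj = json.loads(raw)             # bytes and the trailing newline are fine
    cmd = obj["cmd"]                  # required: KeyError if missing
    rtype = obj["rtype"]              # required: KeyError if missing
    b64 = obj.get("base64_name", "")  # optional, so a PROBE can omit it
    if not all(isinstance(v, str) for v in (cmd, rtype, b64)):
        raise ValueError("malformed resource_tracker fields: %r" % (obj,))
    # An empty string decodes to empty bytes, hence an empty name.
    name = base64.urlsafe_b64decode(b64).decode('utf-8', 'surrogateescape')
    return cmd, name, rtype


print(decode_message(b'{"cmd": "PROBE", "rtype": "noop"}\n'))
```

Indexing with `obj["cmd"]` fails loudly on a missing key, whereas `obj.get("cmd")` would return `None` and defer the error to the type check.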

bedevere-bot · 2025-11-11T12:45:15Z

🤖 New build scheduled with the buildbot fleet by @encukou for commit ea6416e 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F138473%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

encukou · 2025-11-12T12:41:01Z

Looks good! Thank you for the fix!

miss-islington-app · 2025-11-13T11:34:40Z

Thanks @rani-pinchuk for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington-app · 2025-11-13T11:34:40Z

Thanks @rani-pinchuk for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

StanFromIreland · 2025-11-24T21:26:29Z

Lib/test/_test_multiprocessing.py

        res = assert_python_failure("-c", code, PYTHONWARNINGS='error')
        self.assertIn(b'DeprecationWarning', res.err)
        self.assertIn(b'is multi-threaded, use of forkpty() may lead to deadlocks in the child', res.err)
+


Suggested change

This should have been a double line gap. I'll fix it in the backports.

…itrary shared memory names (pythonGH-138473) (cherry picked from commit c6f3dd6) Co-authored-by: Rani Pinchuk <[email protected]>

bedevere-app · 2025-11-24T21:28:13Z

GH-141922 is a backport of this pull request to the 3.14 branch.

…itrary shared memory names (pythonGH-138473) (pythonGH-141922) (cherry picked from commit 64d6bde) Co-authored-by: Stan Ulbrych <[email protected]> Co-authored-by: Rani Pinchuk <[email protected]>

bedevere-app · 2025-11-27T12:22:13Z

GH-142014 is a backport of this pull request to the 3.13 branch.

… shared memory names (GH-138473) (GH-142014) (cherry picked from commit 64d6bde) Co-authored-by: Stan Ulbrych <[email protected]> Co-authored-by: Rani Pinchuk <[email protected]>

gpshead · 2025-12-03T04:23:32Z

This wound up causing some disruption in the stable branch backports. It is fine to keep this as is in main, but we may need to rework a protocol-compatible solution in 3.14 and 3.13: #142206

rani-pinchuk requested a review from gpshead as a code owner September 3, 2025 19:41

bedevere-app bot added the awaiting review label Sep 3, 2025

bedevere-app bot mentioned this pull request Sep 3, 2025

Resource tracker fails to track filenames with colons on Linux #98896

Closed

rani-pinchuk added 3 commits September 3, 2025 20:05

pythongh-98896: Fix parsing of registered or unregistered resources

ac9555d

Add test for shared memory names that contain colons

f3c8138

Fix the test

d973cf0

rani-pinchuk force-pushed the fix-shm-colon branch from 13ca553 to d973cf0 Compare September 3, 2025 20:13

📜🤖 Added by blurb_it.

e5f9549

encukou reviewed Nov 4, 2025

Address also newlines in shared_memory names by encoding and decoding…

9ac85d8

… the names

encukou reviewed Nov 6, 2025

Sending the message as JSON and encode the name using base64 to keep …

84813da

…the sent length below 512

encukou reviewed Nov 10, 2025

Fixes according to review comments

ea6416e

encukou added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 11, 2025

bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 11, 2025

encukou merged commit c6f3dd6 into python:main Nov 12, 2025
120 of 121 checks passed

bedevere-app bot removed the awaiting review label Nov 12, 2025

encukou added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels Nov 13, 2025

miss-islington-app bot assigned encukou Nov 13, 2025

StanFromIreland removed needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels Nov 24, 2025

StanFromIreland added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels Nov 24, 2025

StanFromIreland reviewed Nov 24, 2025

bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label Nov 24, 2025

encukou pushed a commit that referenced this pull request Nov 27, 2025

[3.14] gh-98896: resource_tracker: use json&base64 to allow arbitrary…

64d6bde

… shared memory names (GH-138473) (GH-141922) Co-authored-by: Rani Pinchuk <[email protected]>

bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Nov 27, 2025

mgorny mentioned this pull request Dec 3, 2025

#138473 backports to 3.13 & 3.14 break running programs on Python version upgrade (multiprocessing) #142206

Closed

StanFromIreland pushed a commit to StanFromIreland/cpython that referenced this pull request Dec 6, 2025

pythongh-98896: resource_tracker: use json&base64 to allow arbitrary …

81b6f83

…shared memory names (pythonGH-138473)

ndawe mentioned this pull request Dec 10, 2025

Python 3.13.10 or Python 3.14.1 only: ValueError: Cannot register "UNREGISTER","rtype":"semlock","base64_name" for automatic cleanup: unknown resource type joblib/loky#475

Closed

rashworld-max approved these changes Dec 20, 2025

bedevere-app bot added the awaiting core review label Dec 20, 2025
