-
-
Notifications
You must be signed in to change notification settings - Fork 33.7k
gh-83714: Implement os.statx #139178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-83714: Implement os.statx #139178
Conversation
vstinner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Here is a first review.
Co-authored-by: Victor Stinner <[email protected]>
Co-authored-by: Victor Stinner <[email protected]>
|
In addition to making the changes you directed, when modifying the
and |
|
I'm not fully comfortable with having some os.statx() tests in test_posix and some others in test_os. But it seems like you reused existing code in test_posix and test_os, so I would say that I'm fine with it. We might merge test_posix and test_os into a single test, but that's a different topic. |
I created #139322 for that :-) |
|
Do you still plan to merge |
|
The plan to split test_os is stale for now. You can rebase your PR. |
Co-authored-by: Victor Stinner <[email protected]>
vstinner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, the change LGTM. There is just one last open question of adding a default value to the mask parameter or not.
cmaloney
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall really like this. I don't have a strong preference around the defaults. I think it would be good to have a documentation expert check the new docs over.
|
This API is uneasy to use. It took me a while to understand To make the implementation simpler, I suppose that we should leave |
vstinner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but @gpshead asked to convert the flags parameter to a keyword-only parameter.
Modules/posixmodule.c
Outdated
| } | ||
| } | ||
| else { | ||
| pystatx_result_spec.name = "os.statx_result"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using os.statx_result name directly in pystatx_result_spec? (line 3505)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I was just following the pattern of other types created in posixmodule_exec, most of which assign their name here. The tests pass using os.statx_result directly in the spec, so I'll make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. That's a strange pattern 😄 I don't know why existing code does that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote #140168 to set type names earlier in posixmodule.c.
vstinner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh you should also document the new os.statx() function by adding a new "os" section near: https://docs.python.org/dev/whatsnew/3.15.html#os-path (Doc/whatsnew/3.15.rst).
Yes, I did this.
Done. |
|
👍 looking good to me with the kwarg doc change added in. |
Co-authored-by: Cody Maloney <[email protected]>
|
I enabled auto-merge, don't touch anything 😁 |
|
|
|
|
Oh no! AMD64 CentOS9 NoGIL 3.x buildbot build fails with: test.pythoninfo: ARM Raspbian 3.x and ARM64 Raspbian Debug 3.x also fail to build with: test.pythoninfo: |
|
@vstinner I know what's wrong and how to fix it. I'll have a PR up later today. |
This PR implements
os.statx, an interface to the Linuxstatx(2)system call introduced in kernel version 4.11 (April 2017) and first available in glibc version 2.28 (August 2018). This is derived from my earlier PR #136334 with the changes toos.statremoved, plus other changes (see below).This PR implements the return value of
os.statxwith a custom C typestatx_result, which contains astruct statxand uses member and getset descriptors to lazily create Python objects on attribute access. In a comment on the previous PR, @vstinner suggested usingtypes.SimpleNamespaceas the return value. I implemented that on the statx-simplenamespace branch in my fork. Using the benchmarks from this script (note they are not all fair comparisons):It's slower than this PR when only size and mtime are requested, because it creates objects for the unconditionally valid members and both the float-seconds and int-nanoseconds timestamps. We could get some of this back by defining our own mask bits (bit 32 and above). As more bits are set in the mask and attributes are accessed, the gap widens. I'm not sure if that's due to dict resizing or slower attribute access or both, and I don't see how to improve either. The advantage of the namespace implementation is its simplicity, and any speed hacks would dilute that.
(Also, because it doesn't create unrequested members, it's not a perfect wrapper around the system call. For "real" use it doesn't matter, but for testing the syscall, you'll miss some quirks. For example, btrfs seems to always return atime and btime, but returns mtime and ctime (always together) only if mtime and/or ctime were requested.)
In this PR, most attributes are implemented with member descriptors pointing into the
struct statxor a member instatx_result, so each attribute access onstatx_resultcreates a new int or float object. My previous PR cached the created objects in thestatx_resultfor the commonly-used attributes to avoid creating them more than once, which requires getset descriptors. Obviously, creating objects has a cost, but getset descriptors are slower than member descriptors. The crossover point turns out to be about two accesses (per attribute). If an attribute is only used once, this PR is faster; twice, slightly faster or even; three or more times, the previous PR is faster. I decided the complexity of the cache wasn't worth it. There's a cleaned-up version of the caching implementation on my fork if you want to take your own measurements.(I also didn't test the cache in the free-threaded build; I think the atomics/locking is correct, but I don't really understand how critical sections that can be "suspended" can possibly be safe.)
📚 Documentation preview 📚: https://cpython-previews--139178.org.readthedocs.build/