os: implement readlink on linux+mac.#26405
Merged
Merged
Conversation
09ff90a to
c09bc90
Compare
JalonSolov
reviewed
Jan 21, 2026
JalonSolov
reviewed
Jan 21, 2026
gechandesu
reviewed
Jan 21, 2026
Delta456
approved these changes
Jan 25, 2026
Contributor
|
@warpfork thank you 🙇🏻 . I've investigated the msvc failure, but could not reach the root cause (I could only use the CI for feedback, since I do not have access to a windows machine currently and for the next week or so). => For now, I've skipped the second test too for msvc, since it is not directly relevant to the new wrapper. I think that your implementation is good to merge right after the current tests pass. |
spytheman
approved these changes
Jan 25, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Implement readlink on linux (and mac, I think).
The guts:
Like many other functions in the os module, this uses libc.
However, it may also look a little more intricate than expected:
readlink has a rather unusual interface in that it writes to memory you give it,
but also doesn't actually tell you how much memory that it wants.
Using it correctly therefore involves guessing and checking, and then potentially retrying with a larger buffer.
I tried to get the fastest path in the most common cases.
The first shot is using an array on the stack to gather data; if it fits, we make a new string on the heap, copy the data over there, and return that: one syscall, one alloc, and the returned memory is nicely fitted. About as good as it can get.
If things don't fit on the first shot: the first try with a bigger buffer will still be one alloc in the end, but a second syscall.
After that, more attempts increase both cost by one.
The memory that's returned may also be slightly excessively sized (because that seems preferable to doing another alloc round to trim back down).
I don't think the code path with the growing buffer will be used often, if at all, but if I understand correctly, it is necessary to be fully generally correct.
In practice, I think the limit of the size of symlink content is going to be PATH_MAX on any reasonably normal kernel,
and thus fit into the buffer we use on the first shot (assuming that the program was compiled in an environment with the same PATH_MAX as is prevailing on the kernel the program is running on).
In terms of spec, though, I couldn't actually find anything that commits to that explicitly.
Man pages regarding readlink only talk about the usage pattern of retries with larger buffers, and don't talk about a hard max.
Therefore, some code to do this probing for size does seem to be called for.
(I've also checked around at what some other languages do for this, and seen a similar pattern elsewhere.)
(I manually tested that all the conditions work correctly if one replaces
max_path_buffer_sizethroughout the function with, say8. But I don't see a good way to make an automated test of it!)I believe I've written this such that it's safe and does not leak under either the GC mode or the
-autofreemode.Review on this is particularly welcome, though!
Platforms:
I've written this for linux/*nix (I believe it should work identically on mac) and put it in the file for that.
I'm not at all familiar with how this translates to windows, or if the concept actually has a direct equal at all.
I'm happy to take guidance on how to do conditionals for the platforms, though.
I also propose some documentation and naming changes to the symlink function, because I bumped up against that along my way and was briefly very confused.
The man pages for the
lncommand use the terminology "target" for the content of a symlink,and "link_name" for the path where the symlink will be created; I changed the vlib function definiton to use the same.
(Alternatively, the man pages for
symlinkin libc use "target" and "linkpath"; "linkpath" vs "link_name" is potato potato imo, but they do both agree that target is the thing pointed at.)Previously, vlib used "target" as the path where the link would occur, which is the opposite of how the man pages for both the
lncommand and libcsymlinkspeak.(I'm guessing that vlib ended up with the wording that it did because if someone wrote the (hard)
linkfunction first, then there, it makes a bit more sense... but I think using the same words as the man pages use is a better choice.)There's no functional change here; just naming.
I'm happy to rebase this back out if it's contentious, but I think it's probably not(?).
(I didn't propose any change to the names used by the (hard)
linkfunction, but fwiw,https://www.man7.org/linux/man-pages/man2/link.2.html would call those parameters "oldpath" and "newpath", and I think copying that naming may be a worthy confusion-minimizer too.)
Lastly, I added some documentation to the
os.existsfunction clarifying its behavior around symlinks (and a test for it).I had a few head-desk moments while writing my own tests before noticing this (imo odd, even if apparently libc-normal) behavior detail, so it seemed worth a sentence or two.
(EDIT: this is getting weirder and weirder as I look at what windows does via the CI runs, so it's turning into more than a sentence or two... )