Skip to content

Conversation

@izbyshev
Copy link
Contributor

@izbyshev izbyshev commented Mar 6, 2018

Retry GetFinalPathNameByHandleW in a loop instead of assuming that the resulting path won't change between the calls.

https://bugs.python.org/issue33016

Alexey Izbyshev added 2 commits March 7, 2018 01:34
…athname

Retry GetFinalPathNameByHandleW in a loop instead of assuming that
the resulting path won't change between the calls.
Copy link
Member

@zooba zooba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let me know what you think about the GetLastError capture - I don't remember 100% if what I said is correct.

if (hFile == INVALID_HANDLE_VALUE) {
err = "CreateFileW";
goto done1;
return win32_error_object("CreateFileW", path->object);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misremembering this, but I'm fairly sure we need to capture GetLastError within Py_END_ALLOW_THREADS to avoid losing the information if the lock acquisition overwrites it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I questioned that a while ago and searched Py_END_ALLOW_THREADS in Modules/posixmodule.c. Every piece of code I looked at didn't bother to preserve the Windows last error across Py_END_ALLOW_THREADS, so I assumed it's OK.

Now that I've actually checked, Py_END_ALLOW_THREADS does make an effort to preserve POSIX errno, but it does nothing special for Windows. And the Windows last error may change in that macro, for example, because take_gil calls SleepConditionVariableSRW which may timeout and is documented to set the last error to ERROR_TIMEOUT in this case. The only code that cares about the last error is TLS-related (in Python/thread_nt.h), but despite the comments there (e.g. PyThread_tss_get) suggest that the code preserves the last error specifically for Py_END_ALLOW_THREADS, it's clearly not enough.

Should I change this PR to avoid losing the last error in nt._getfinalpathname only, or is it better to address the issue globally in a separate PR by changing Py_END_ALLOW_THREADS macro internals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: I've rechecked, and it turns out that Python unconditionally uses older XP+ synchronization primitives instead of Vista+ primitives (introduced in e75ff35), so SleepConditionVariableSRW is not called. The set of WinAPIs called from Py_END_ALLOW_THREADS and not wrapped with last-error-preservation code consists of EnterCriticalSection, LeaveCriticalSection, WaitForSingleObjectEx and ReleaseSemaphore. None of them are documented to preserve the last error on success, but at least WaitForSingleObjectEx reports timeout via its return value instead of the last error. So, while there might be a chance that Py_END_ALLOW_THREADS doesn't clobber the last error in practice, I'd still not rely on this and fix the macro.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for the thoroughness! Nothing blocking this patch, so I'll merge it and get the backports started and file a new issue (or resurrect an old one) about preserving the error.

(FWIW, all Win32 APIs should preserve the error value on success, or they have a bug, and all are allowed and should overwrite the last error value on failure. And we currently aren't using the new primitives because of issues whenever we enable them - there are issues on bpo about this if you wanted to look them up.)

while (1) {
Py_BEGIN_ALLOW_THREADS
result_length = GetFinalPathNameByHandleW(hFile, target_path,
buf_size, VOLUME_NAME_DOS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tangential discussion:

Because we only try VOLUME_NAME_DOS, this function always fails for a volume that's not mounted as either a drive letter or a junction mount point. The error in this case is ERROR_PATH_NOT_FOUND. We know that the path is valid because we have a handle (in this case the file system ensures that no parent directory in the path can be unlinked or renamed). To address this, if the flags parameter isn't already VOLUME_NAME_GUID, it could switch to it and continue the while loop (no need to realloc). Otherwise if it's already VOLUME_NAME_GUID or for any other error, the call should fail.

For DOS devices in paths (e.g. "C:\Temp\NUL"), CreateFile commonly succeeds, but GetFinalPathNameByHandle commonly fails with either ERROR_INVALID_FUNCTION or ERROR_INVALID_PARAMETER. What do you think about handling this failure by calling GetFullPathName instead (e.g. "C:\Temp\NUL" => "\\.\NUL")? That way most DOS device paths won't cause this function to fail (e.g. when called by pathlib's resolve method). We could do the same if CreateFile fails with ERROR_INVALID_PARAMETER, which should only occur for CON (e.g. "C:\Temp\CON") because it needs to be opened with either generic read or generic write access.

CreatFile also commonly fails here with either ERROR_SHARING_VIOLATION (from a paging file) or ERROR_ACCESS_DENIED. But I think those are best handled by the caller, with a PermissionError exception handler. Currently pathlib's resolve method doesn't handle PermissionError like I think it should in non-strict mode. It only handles FileNotFoundError.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good points. Tangential discussions are better kept on the issue tracker though, as nobody will come look at a merged PR "just in case" something interesting was suggested :)

@zooba zooba merged commit 3b20d34 into python:master Mar 8, 2018
@miss-islington
Copy link
Contributor

Thanks @izbyshev for the PR, and @zooba for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7.
🐍🍒⛏🤖

@bedevere-bot
Copy link

@zooba: Please replace # with GH- in the commit message next time. Thanks!

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Mar 8, 2018
…athname (pythonGH-6010)

(cherry picked from commit 3b20d34)

Co-authored-by: Alexey Izbyshev <[email protected]>
@bedevere-bot
Copy link

GH-6031 is a backport of this pull request to the 3.7 branch.

@miss-islington
Copy link
Contributor

Sorry, @izbyshev and @zooba, I could not cleanly backport this to 3.6 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 3b20d3454e8082e07dba93617793de5dc9237828 3.6

@bedevere-bot
Copy link

GH-6032 is a backport of this pull request to the 3.6 branch.

miss-islington added a commit that referenced this pull request Mar 8, 2018
…athname (GH-6010)

(cherry picked from commit 3b20d34)

Co-authored-by: Alexey Izbyshev <[email protected]>
@izbyshev izbyshev deleted the bpo-33016 branch March 8, 2018 17:11
jo2y pushed a commit to jo2y/cpython that referenced this pull request Mar 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants