-
-
Notifications
You must be signed in to change notification settings - Fork 33.7k
bpo-29623: Make pathlib objects work with ConfigParser.read. #242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow these steps to rectify the issue:
Thanks again to your contribution and we look forward to looking at it! |
berkerpeksag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update Doc/library/configparser.rst and add a note to Misc/NEWS. You can take a look at https://github.com/python/cpython/pull/157/files to see how to document this feature. Thanks!
Lib/configparser.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on why we need bytes here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong but my understanding is that bytes are a valid type for paths. The docs for Pathlib provide an operator to convert to bytes - https://docs.python.org/3/library/pathlib.html#operators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
configparser module doesn't support passing bytes objects at the moment. Accepting bytes(some_pathlib_object) would be a new feature and I think it's out of scope for this issue.
For example:
>>> import configparser as cp
>>> from test import support
>>> f = support.findfile("cfgparser.1")
>>> c = cp.ConfigParser()
>>> c.read(f.encode())
[]
>>> # If we pass ``[f.encode()]`` it will work.
>>> # However this doesn't documented anywhere in the docs.
>>> c.read([f.encode()])
[b'/home/berker/projects/cpython/master/Lib/test/cfgparser.1']There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was mainly using the pathlib support as an example of support for bytestrings as paths. You can find similar checks in the compression libraries (lzma.py, gzip.py and bz2.py).
Having it work when in a list and not work when outside of a list is the same behaviour path objects currently have except OSError is already being caught while TypeError is not so it fails silently instead of loudly. It also attempts to open all the file descriptors with the same value as the individual characters which could in theory lead to some unusual behaviour.
A somewhat contrived example
>>> import os
>>> import configparser
>>> k = os.open('/home/david/develop/cpython/Lib/test/cfgparser.1', os.O_RDONLY)
>>> k
3
>>> c = configparser.ConfigParser()
>>> c.read(b'\x03')
[3]
>>> list(c)
['DEFAULT', 'Foo Bar']
>>> os.close(k) # by hitting the file descriptor for k it has actually closed the file already
OSError: [Errno 9] Bad file descriptor
Attempting to open(b'\x03') instead of putting it through the read function gives the expected FileNotFoundError.
I'm not sure how common it is to use bytes for paths but I think it makes most sense to add both and document both (looking at it there should probably be another test if so). Doing so makes the behaviour more consistent.
Either way I think the behaviour for bytes needs to be documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but those checks were already there when we added os.PathLike support to lzma, gzip and other modules. See 5f59ddd#diff-0949b7e19fdea84affd443e4f7a0c6bfR117 for example. I'd suggest opening another issue to discuss this on bugs.python.org.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see what you mean, I do think both things need to be addressed but it does make sense to split them out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a new bug report here:
http://bugs.python.org/issue29627
I think it may be that this fix should come after that fix as I'm guessing it may affect python versions without pathlib? (If support for bytes is added).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I've added the module maintainer to that issue.
Actually I want to get this PR in for 3.6.1 which is due this weekend :)
With or without pathlib support, accepting bytes objects is a new feature and it cannot be added to bugfix releases.
|
I've removed the check for bytes. I'll have a chance to edit the Docs either this evening or on Friday. I hadn't realised the 3.6.1 deadline was this weekend! |
|
I've added some documentation in the places you mentioned. I hope I've done so correctly. I've also added 2 paragraph breaks that I think make it a bit more readable as it was originally one fairly large paragraph (which also made it difficult to edit while keeping under 80 characters - unless there's some tool that's normally used to handle this). I have signed the CLA but I only did so shortly before submitting the original PR. |
|
LGTM. Thank you for your work on this PR, @DavidCEllis. I made some tweaks to |
|
Ah, yeah I see I put that in the wrong place now. Thanks. |
Lib/test/test_configparser.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think cf.read() should return a list of str here. For example, in bpo-28231 we decided that the zf.filename attribute should return str even if a Path() object is passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment the behaviour is consistent with the current behaviour if the Path object was passed as part of a list. Making it return a string would involve changing this behaviour too if we're being consistent. We would also have to be careful not to break things if anybody's passing bytes paths in as part of a list.
I'm not certain it makes as much sense here as it does in zipfiles as the config file doesn't contain the same kind of internal structure, are there other examples where it's being converted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not certain it makes as much sense here as it does in zipfiles as the config file doesn't contain the same kind of internal structure, are there other examples where it's being converted?
The only other example where the file path is part of the public API is gzip module and it's converted to str by using os.fspath(): https://hg.python.org/cpython/rev/b244bf74b638
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those methods also correctly handle bytestring paths. If you change line 700 to use os.fspath() and attempt to use a bytestring here, it will fail if one of the byte values for a character matches a file handle:
ConfigParser().read(b'\x03')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/david/develop/cpython/Lib/configparser.py", line 700, in read
read_ok.append(os.fspath(filename))
TypeError: expected str, bytes or os.PathLike object, not int
You could add a check to prevent this but at that point you're catching an error when you already know Python is doing the wrong thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO changing line 692 from
filenames = [filenames]to
filenames = [os.fspath(filenames)]would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes ConfigParser().read(path) inconsistent with ConfigParser().read([path]) though.
>>> from pathlib import Path
>>> from configparser import ConfigParser
>>> ConfigParser().read(Path('Lib/test/cfgparser.1'))
['Lib/test/cfgparser.1']
>>> ConfigParser().read([Path('Lib/test/cfgparser.1')])
[PosixPath('Lib/test/cfgparser.1')]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Anyway I'm not too worried about bytes paths at the moment. We can fix all corner cases if we decide to support bytes in 3.7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only noticed bytes paths did weird things when checking to see why Path objects were failing. It's not something I actually make use of personally but the behaviour was unexpected. My only real argument for supporting bytes paths is that you already do so in lists.
I'd err on the side of not changing current (working) behaviour and leave the path objects as is.
Formatting consistency in test.
…dded paragraph breaks to aid readability.
(cherry picked from commit 9380f00)
Fix whitespace and markup in the Stackless manual (cherry picked from commit ca6a4b3)
Fix whitespace, mostly replace tab by spaces. (cherry picked from commit 2e5f7b5)
Fix whitespace and markup in the Stackless manual
Fix whitespace, mostly replace tab by spaces.
Patchcheck added to many spaces.
_legacy: fix Resource type hint
Checks for os.PathLike and bytes along with str before attempting to iterate.