bpo-30228: TextIOWrapper uses abs_pos, not tell() #1385
@Haypo, thanks for your PR! By analyzing the history of the files in this pull request, we identified @benjaminp, @loewis and @serhiy-storchaka to be potential reviewers.
This change means that TextIOWrapper becomes inconsistent if the file descriptor is moved directly using os.lseek()... but BufferedReader/BufferedWriter don't detect when the file descriptor is moved directly either, no? I mean, the cached abs_pos attribute already has this bug, no?
@pitrou: Would you mind reviewing this one? Does it look like an acceptable optimization?
@serhiy-storchaka: Same questions. Would you mind reviewing this one? Does it look like an acceptable optimization?
@Haypo: this looks like an acceptable optimization, but the question is whether it brings any significant speedup.
Modules/_io/textio.c
What if it's not seekable?
we only go into this path if self->seekable is set.
Modules/_io/textio.c
What about BufferedReader?
we only go into this path if self->encoder is set. self->encoder is only set if the buffer is writable.
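The condition described in these replies can be observed from Python. The following is a minimal sketch, assuming the constructor looks up `tell()` as an ordinary method on the buffer (true for the CPython versions I am aware of, but not a documented guarantee). `CountingMixin`, `CountingReader` and `CountingWriter` are hypothetical helper names, not part of the `io` module.

```python
import io


class CountingMixin:
    """Hypothetical helper: counts how often tell() is called on the buffer."""

    tell_calls = 0

    def tell(self):
        self.tell_calls += 1
        return super().tell()


class CountingReader(CountingMixin, io.BufferedReader):
    pass


class CountingWriter(CountingMixin, io.BufferedWriter):
    pass


# A read-only buffer gets a decoder but no encoder.
reader = CountingReader(io.BytesIO(b"data"))
# A writable, seekable buffer gets an encoder.
writer = CountingWriter(io.BytesIO())

text_reader = io.TextIOWrapper(reader, encoding="utf-8")
text_writer = io.TextIOWrapper(writer, encoding="utf-8")

print("tell() calls during construction, read-only buffer:", reader.tell_calls)
print("tell() calls during construction, writable buffer: ", writer.tell_calls)
```

If the assumption holds, only the writable buffer is asked for its position while the wrapper is being constructed, which is the code path this PR touches.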
@pitrou: I replied to your comments. So, what do you think? Is my PR safe?
As I said above: this looks like an acceptable optimization, but the question is whether it brings any significant speedup :-) Of course I'm not against making optimizations in buffered I/O. I also know that buffering can be tricky (being responsible for the data corruption issue in Python 3.2 made me cautious about this!). But if this can increase performance significantly then ok (and you'll bear the responsibility for any regression ;-)).
The TextIOWrapper constructor now reads the private abs_pos attribute of BufferedWriter and BufferedRandom directly instead of calling the tell() method, which avoids one lseek() syscall on open(fname, "w") and open(fname, "w+"). Move the buffered structure to _iomodule.h and rename it to _PyIO_buffered. Also add "pythread.h" to _iomodule.h, needed by the _PyIO_buffered lock.
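To make the extra position query visible without tracing syscalls, here is a rough sketch that counts `tell()` calls on a raw stream. `CountingRaw` is a hypothetical stand-in for a real file: it is backed by an in-memory `io.BytesIO`, so each counted call corresponds to what would be an `lseek()` on an actual file descriptor. The exact counts are an implementation detail of CPython and may differ between versions.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that counts position queries (stand-in for lseek)."""

    def __init__(self):
        super().__init__()
        self._inner = io.BytesIO()
        self.tell_calls = 0

    def writable(self):
        return True

    def seekable(self):
        return True

    def write(self, data):
        return self._inner.write(data)

    def seek(self, offset, whence=io.SEEK_SET):
        return self._inner.seek(offset, whence)

    def tell(self):
        self.tell_calls += 1
        return self._inner.tell()


raw = CountingRaw()

buffered = io.BufferedWriter(raw)
calls_after_buffered = raw.tell_calls

# Wrapping the buffer in a text layer asks for the position again,
# assuming the constructor calls buffered.tell() as described above.
text = io.TextIOWrapper(buffered, encoding="utf-8")
calls_after_wrapper = raw.tell_calls

print("raw tell() calls after BufferedWriter():", calls_after_buffered)
print("raw tell() calls after TextIOWrapper(): ", calls_after_wrapper)
```

The difference between the two counts is the query that the change aimed to remove by reusing the position already cached by the buffered object.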
It's hard to see any significant speedup, so I am abandoning my change. The risk is not worth it.
https://bugs.python.org/issue30228