-
Notifications
You must be signed in to change notification settings - Fork 3.8k
unix: allow nbufs>IOV_MAX in uv_fs_read and uv_fs_write #269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
uv_fs_read and uv_fs_write are looped until all buffers are written. Fixes: #260
src/unix/core.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to pry IOV_MAX out of glibc, right? I'm not sure if that's a good idea because it used to be lower than the current limit of 1024. IIRC, it was something like 64.
That might become a problem if someone compiles libuv against headers that are newer than the kernel it runs on. I think that 2.6.9 (the oldest kernel we support) is unaffected by that but I'd rather err on the side of caution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot about _SC_IOV_MAX, I will remove those lines.
However I still think it's better to use libc's IOV_MAX than an internal hardcoded value.
Several textual adjustments in order to adhere to the libuv styleguide.
Removed __need_IOV_MAX in favor of _SC_IOV_MAX
src/unix/fs.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After mulling it over, I think it's best to ignore the error and indicate a partial write if total > 0. That way data doesn't get lost and if the error isn't temporal, subsequent reads or writes will signal it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignoring the errors may lead to partially written files and partially filled buffers, but since no error has been reported a user could think the data is actually valid.
If we notify the users about the error, they can at least document the error and retry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Callers should already be prepared to deal with partial reads and writes so that should not be an issue.
The problem with retrying on error is that you don't know how much data has been sent. With reads, it's easy to miss that the contents of your receive buffers may have been clobbered and that's a potential security liability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get it, I have changed it...
However I read that these functions need to be atomic, maybe a loop isn't the best solution?
* Fixed several style issues * When an error occurs, a partial error is indicated (a followup read, will indicate a persistent error) * Included offset == 0 in the iteration * Adjusted to test to verify the offset fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. Looking at this again, I suspect there is a bug here: it assumes that all the buffers have been consumed but partial reads and writes can (and do) happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to this they're atomic. I quote:
Buffers are processed in array order. This means that readv() completely fills iov[0] before proceeding to iov[1], and so on. (If there is insufficient data, then not all buffers pointed to by iov may be filled.) Similarly, writev() writes out the entire contents of iov[0] before proceeding to iov[1], and so on.
The data transfers performed by readv() and writev() are atomic: the data written by writev() is written as a single block that is not intermingled with output from writes in other processes (but see pipe(7) for an exception); analogously, readv() is guaranteed to read a contiguous block of data from the file, regardless of read operations performed in other threads or processes that have file descriptors referring to the same open file description (see open(2)).
Is that good enough? Alternatively, should that not be a responsibility of uv__fs_write itself?
|
This PR really needs to be rebased and brought back to life. @bwijen could you address the last issues addressed by @bnoordhuis? If there is any way I can help, let me know please. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @bnoordhuis will want spaces after the semicolons here.
|
@saghul in the code comment discussions, my main point is that we should aim for the actual system limit, not some arbitrarily low value. most people don't know how to tune their system and if we could help them a little bit there I think those end users would be more satisfied overall |
I don't disagree! I'm just wondering what do you mean by set the system limits here. I just checked and OSX also seems to have IOV_MAX set to 1024, so we'll do writev syscalls of at most 1024 buffers, otherwise batch them in groups of 1024 buffers. I might be missing your point, can you please elaborate on how you'd raise the limit beyond that? |
|
@saghul forgive me, this configurable system limit does not appear easily configureable at runtime. You are absolutely right that it is not libuv's responsibility to set |
|
@reqshark thanks for confirming! I was afraid I was missing something, now we know! :-) |
|
@ronkorving has picked this up in issue #448 so I'm closing this one here now... |
This allows writing and reading any amount of buffers, regardless of what IOV_MAX may be defined as. It also moves the IOV_MAX test from stream to core. This is based on the excellent work of @bwijen in #269. Refs: #269 PR-URL: #448 Reviewed-By: Saúl Ibarra Corretgé <[email protected]>
This allows writing and reading any amount of buffers, regardless of what IOV_MAX may be defined as. It also moves the IOV_MAX test from stream to core. This is based on the excellent work of @bwijen in libuv#269. Refs: libuv#269 PR-URL: libuv#448 Reviewed-By: Saúl Ibarra Corretgé <[email protected]>
This allows writing and reading any amount of buffers, regardless of what IOV_MAX may be defined as. It also moves the IOV_MAX test from stream to core. This is based on the excellent work of @bwijen in libuv#269. Refs: libuv#269 PR-URL: libuv#448 Reviewed-By: Saúl Ibarra Corretgé <[email protected]>
uv_fs_read and uv_fs_write are looped until all buffers are written.
Fixes: #260