src,http2: use native-layer pipe mechanism from FileHandle instead of synchronous I/O#18936
src,http2: use native-layer pipe mechanism from FileHandle instead of synchronous I/O#18936addaleax wants to merge 11 commits intonodejs:masterfrom
Conversation
e7b0227 to
f7b044e
Compare
src/node_file.cc
Outdated
There was a problem hiding this comment.
Why not std::unique_ptr?
std::unique_ptr<FileHandleReadWrap> req_wrap(FileHandleReadWrap::from_req(req));
The delete then happens automatically at the end of the block.
There was a problem hiding this comment.
@fhinkel Done! But mostly because moving towards always using smart pointers is good. In this case, I am not sure that it actually makes the code or ownership flow cleaner (so I’ve added a comment for that).
f7b044e to
83b72ad
Compare
|
IMHO files served in this way would always be in the kernel cache, and this is a very tiny read. Can you add a realistic benchmark, ideally in another PR so that we can use it as a baseline to compare the two implementations? Can you generate a flamegraph of before/after the change? It would be useful to understand the performance impact. |
src/node_file.cc
Outdated
There was a problem hiding this comment.
Is creating a function here ok from a performance point of view? This is not the typical style I've used when working with libuv.
There was a problem hiding this comment.
@mcollina This doesn’t create a new function in C++.
This is not the typical style I've used when working with libuv.
I think that’s just because a lot of our code that deals with libuv was written before we started using C++11.
That would be true for a lot of cases, yes, and it’s something I’m worried about; because I’d expect that as long as files are in the kernel cache, synchronous I/O is always going to be faster than async I/O.
Sure, but I’d be interested in the criteria we would use. I’m wondering whether something like a combination with the benchmarks that we have for network sockets would make sense here, i.e. sending one or more large files over multiple streams and requests in parallel, and measuring throughput rather than ops/sec.
Tbh, I’m not sure what tooling would be best suited for that, but I’m sure I can try. |
|
I would love to see more modern C++ in our libuv related code : 💓 |
83b72ad to
69a6201
Compare
|
Flame graph comparisons: |
69a6201 to
39cc17b
Compare
|
Thank you for this. It's going to take me a few days to be able to get completely through this. I'll try to have some feedback by Monday :) |
|
This looks great, @addaleax! In the middle of a major move so it's likely going to take me a little while to review. |
ed538b5 to
b880709
Compare
Yeah. :/ Still, I think following the general principle of “never blocking the event loop” is enough to make this worth it. Also, there’s decent perf gains in other categories, and there’s always the fact that the benchmark doesn’t represent a real-world server and we’ll just have to see how this works out in reality.
Yeah, it is: (Side note: I just about got the first prototype for a fast-path StreamBase-to-StreamBase |
82bdf8f fixed an issue by silently modifying the `start` option for the case when only `end` is passed, in order to perform reads from a specified range in the file. However, that approach does not work for non-seekable files, since a numeric `start` option means that positioned reads will be used to read data from the file. This patch fixes that, and instead ends reading after a specified size by adjusting the read buffer size. This way we avoid re-introducing the bug that 82bdf8f fixed, and align behaviour with the native file stream mechanism introduced in nodejs#18936 as well. Fixes: nodejs#19240 Refs: nodejs#18121
Put `HandleScope`s and `Context::Scope`s where they are used, and don’t create one for native stream callbacks automatically. This is slightly less convenient but means that stream listeners that don’t actually call back into JS don’t have to pay the (small) cost of setting these up. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
This enables accessing files using a more standard pattern. Once some more refactoring has been performed on the other existing `StreamBase` streams, this could also be used to implement `fs` streams in a more standard manner. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Add a `OnStreamWantsWrite()` event that allows streams to ask for more input data if they want some. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Add `AsyncScope` for cases where the async_hooks `before` and `after` callbacks should be called, to track async context, but no actual JS is called in between and we can therefore skip things like draining the microtask or `nextTick` queues. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Provide a way to create pipes between native `StreamBase` instances that acts more directly than a `.pipe()` call would. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
This resolves the issue of using synchronous I/O for `respondWithFile()` and `respondWithFD()`, and enables scenarios in which the underlying file does not need to be a regular file. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Requiring `respondWithFile()` to only work with regular files is an artificial restriction on Node’s side and has become unnecessary. Offsets or lengths cannot be specified for those files, but that is an inherent property of other file types. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
This allows V8 to avoid preparing a execution context
for the constructor, to give a (kinda) small but noticeable
perf gain.
Benchmarks (only this commit):
$ ./node benchmark/compare.js --new ./node --old ./node-master --filter net-c2s.js --set len=10 --set type=asc --runs 360 net | Rscript benchmark/compare.R
[01:15:27|% 100| 1/1 files | 720/720 runs | 1/1 configs]: Done
confidence improvement accuracy (*) (**) (***)
net/net-c2s.js dur=5 type='asc' len=10 *** 0.69 % ±0.31% ±0.41% ±0.53%
PR-URL: nodejs#18936
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
`requests = 1000000` took about 10 minutes per run for me and doesn’t seem to add much value on its own. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
The `StreamBase` interface changed, so that `OnStreamRead()` and `OnStreamAlloc()` are not guaranteed to be emitted in the same tick any more. This means that, while it isn’t causing any trouble right now, we should not assume that it’s safe to return a static buffer in the HTTP parser’s `OnStreamAlloc()` method. PR-URL: nodejs#18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
61a3619 to
d8eb113
Compare
|
Landed in 3ad7c1a...1ac1424 |
Put `HandleScope`s and `Context::Scope`s where they are used, and don’t create one for native stream callbacks automatically. This is slightly less convenient but means that stream listeners that don’t actually call back into JS don’t have to pay the (small) cost of setting these up. PR-URL: #18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
This enables accessing files using a more standard pattern. Once some more refactoring has been performed on the other existing `StreamBase` streams, this could also be used to implement `fs` streams in a more standard manner. PR-URL: #18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Add a `OnStreamWantsWrite()` event that allows streams to ask for more input data if they want some. PR-URL: #18936 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
|
Should this be backported to |
|
This depends on #18297, so I don’t think we can backport it. |
Hi everyone! :)
This PR resolves the issue of using synchronous I/O for file-backed HTTP/2 streams, by turning
FileHandles into (currently read-only)StreamBaseinstances, adding a generic mechanism to pipe from aStreamBaseinto anotherStreamBase(where currently the latter one needs to know that it wants to have data written to it, which is true for HTTP/2 streams).This could probably land as it is, but I’ve marked it as WIP because the code probably isn’t quite documented yet and I want to investigate more into the performance of these changes:
The
http2/respond-with-fdbenchmark shows a ~30 % (update: down to more, like, 20 %) performance drop for me. At least part of that is likely related to the way the benchmark is written: It performs tens or hundreds of reads for the same portion of the same short file in order to return data to the client. In particular, the kernel will always immediately have the data available and will never need to perform actual disk operations, so synchronous reads actually come with less overhead than asynchronous reads to begin with.In some way, that’s realistic: A lot of the time, data will already be in the kernel cache. On the other hand, Node has committed to avoiding synchronous I/O, and we can’t really know whether disk I/O will be slow or fast in advance.
(A lot of these assumptions are based on the fact than when I add a
nanosleep()call with 1 ns duration – which obviously takes longer than 1 ns, but it appears to be enough – to thepread64()syscall, the 30 % drop turns into a whopping 100 % perf win).In any case, I’d love to hear ideas and suggestions, both about the general approach and its performance here.
/cc @nodejs/http2 @jasnell @mcollina @apapirovski @Fishrock123
Commits:
http2: simplify timeout tracking
There’s no need to reset the chunk counter for every write.
src: make
FileHandlea (readonly)StreamBaseThis enables accessing files using a more standard pattern.
Once some more refactoring has been performed on the other existing
StreamBasestreams, this could also be used to implementfsstreams in a more standard manner.
src: give StreamBases the capability to ask for data
Add a
OnStreamWantsWrite()event that allows streams toask for more input data if they want some.
src: introduce native-layer stream piping
Provide a way to create pipes between native
StreamBaseinstancesthat acts more directly than a
.pipe()call would.http2: use native pipe instead of synchronous I/O
This resolves the issue of using synchronous I/O for
respondWithFile()andrespondWithFD(), and enablesscenarios in which the underlying file does not need
to be a regular file.
http2: remove regular-file-only restriction
Requiring
respondWithFile()to only work with regular filesis an artificial restriction on Node’s side and has become unnecessary.
Offsets or lengths cannot be specified for those files,
but that is an inherent property of other file types.
Checklist
make -j4 test(UNIX), orvcbuild test(Windows) passesAffected core subsystem(s)
src,http2
CI: https://ci.nodejs.org/job/node-test-commit/16439/CI: https://ci.nodejs.org/job/node-test-commit/16441/CI: https://ci.nodejs.org/job/node-test-commit/16447/