Skip to content

Conversation

@bnoordhuis
Copy link
Member

@bnoordhuis bnoordhuis commented Apr 12, 2023

Add io_uring support for several asynchronous file operations:

  • read, write
  • fsync, fdatasync
  • stat, fstat, lstat

io_uring is used when the kernel is new enough, otherwise libuv simply
falls back to the thread pool.

Performance looks great; an 8x increase in throughput has been observed.

This work was sponsored by ISC, the Internet Systems Consortium.

@rektide
Copy link

rektide commented Apr 13, 2023

Also potentially resolves or progresses #1947.
Also: 🎉! 🚀!

Comment from the peanut gallery here, but, personally I'd be fine with/love/prefer an io_uring that targeted more modern io_urings. My understanding is 5.15 (2021.10.31) had a bunch of pretty sizable improvements to how folks could use io_uring. That's already nearly two years old, and is going to get more and more expectable. Meanwhile folks on older kernels (5.1 is 2019.5.5) still have the very fast existing libuv thread pool implementation to fall back on.

Maybe there's no sacrifice, but "a carefully chosen feature subset" leaves the impression that we're going way out of our way to support already pretty aged kernels. Grain of salt here, but being rip-raring future-forward-facing & adopting every whiz-bang shiny-new-whistle we can would be my pick at this juncture.

@trevnorris
Copy link
Member

Did some basic benchmarking, and it's giving me > 8x performance when reading small chunks out of /dev/zero. So great news on that side.

valgrind isn't happy about a few tests. Mostly with the error Syscall param io_uring_enter(sigsz) contains uninitialised byte(s). Simplest repro I could make is this:

#include <uv.h>

void on_read(uv_fs_t *req) { }

int main() {
  uv_fs_t read_req;
  uv_buf_t buffer;
  int fd;
  char bytes[64];

  buffer = uv_buf_init(bytes, 64);
  fd = uv_fs_open(NULL, &read_req, "/dev/zero", UV_FS_O_RDONLY, 0, NULL);
  uv_fs_read(uv_default_loop(), &read_req, fd, &buffer, 1, 0, on_read);
  uv_run(uv_default_loop(), UV_RUN_DEFAULT);
  uv_fs_close(NULL, &read_req, fd, NULL);

  uv_fs_req_cleanup(&read_req);
  uv_loop_close(uv_default_loop());
  return 0;
}

@bnoordhuis
Copy link
Member Author

personally I'd be fine with/love/prefer an io_uring that targeted more modern io_urings

Things may be heading that way out of necessity.

IORING_SETUP_SQPOLL requires linux v5.11 if you want to use it with non-fixed files (IIUC) but libuv can't do fixed files and without sqpoll it has to call io_uring_enter(IORING_ENTER_GETEVENTS) so often that any performance benefits are negated.

@bnoordhuis bnoordhuis marked this pull request as ready for review April 15, 2023 18:58
@bnoordhuis
Copy link
Member Author

This is ready for review now. Linux v5.13 is the minimum required, otherwise libuv falls back to the thread pool.

Copy link
Member

@santigimeno santigimeno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we document this change in the fs docs?

@bnoordhuis
Copy link
Member Author

bnoordhuis commented Apr 15, 2023

Should we document this change in the fs docs?

Yes. I thought it didn't say anything about the implementation but it does. I'll do that tomorrow.

Copy link
Member

@santigimeno santigimeno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (once docs are updated). This is awesome, thanks

@trevnorris
Copy link
Member

Still getting the following from valgrind:

fs_file_async

Syscall param io_uring_enter(sigsz) contains uninitialised byte(s)
   at 0x4993A3D: syscall (syscall.S:38)
   by 0x4F1366: uv__io_uring_enter (src/unix/linux.c:351)
   by 0x4F23A9: uv__iou_submit (src/unix/linux.c:644)
   by 0x4F24B5: uv__iou_fs_read_or_write (src/unix/linux.c:692)
   by 0x4DDAAC: uv_fs_write (src/unix/fs.c:2212)
   by 0x422F97: create_cb (test/test-fs.c:491)
   by 0x4DA38C: uv__fs_done (src/unix/fs.c:1779)
   by 0x4CDFD0: uv__work_done (src/threadpool.c:326)
   by 0x4D6433: uv__async_io (src/unix/async.c:176)
   by 0x4F2F95: uv__io_poll (src/unix/linux.c:1076)
   by 0x4D6B2D: uv_run (src/unix/core.c:417)
   by 0x42241C: run_test_fs_file_async (test/test-fs.c:915)

Copy link
Member

@trevnorris trevnorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than the valgrind error. The current code you have for incrementing events is fine enough since it basically follows what the threadpool is doing. It's a bug, but not one you should need to worry about. I'll take care of that in a future PR.

@bnoordhuis
Copy link
Member Author

FWIW, I don't get that warning with valgrind 3.18.1.

I'm 99% sure it's a valgrind bug because that sigsz it's complaining about is something that libuv doesn't use and doesn't exist in newer kernels.

On kernels where it was a thing libuv simply never calls io_uring_enter().

Add io_uring support for several asynchronous file operations:

- read, write
- fsync, fdatasync
- stat, fstat, lstat

io_uring is used when the kernel is new enough, otherwise libuv simply
falls back to the thread pool.

Performance looks great; an 8x increase in throughput has been observed.

This work was sponsored by ISC, the Internet Systems Consortium.
@bnoordhuis bnoordhuis merged commit d2c31f4 into libuv:v1.x Apr 18, 2023
@bnoordhuis bnoordhuis deleted the iou branch April 18, 2023 10:32
@trevnorris
Copy link
Member

trevnorris commented Apr 18, 2023

@bnoordhuis I'm also running valgrind 3.18.1. Upon further testing, the error only appears when I compile with clang (I tried all versions 11 to 15). It doesn't happen with gcc. Think it's a bug in clang or just a strange interaction with valgrind? Side note: need to compile with -gdwarf-4 when using clang for valgrind to work.

@bnoordhuis
Copy link
Member Author

Oh, that's interesting. Yes, I'm getting the same warning with clang.

gcc emits this code for the syscall wrapper:

   0x00000000000997f0 <+0>:     endbr64
   0x00000000000997f4 <+4>:     sub    $0x10,%rsp
   0x00000000000997f8 <+8>:     mov    %ecx,%r8d
   0x00000000000997fb <+11>:    xor    %r9d,%r9d
   0x00000000000997fe <+14>:    mov    %edx,%ecx
   0x0000000000099800 <+16>:    push   $0x0
   0x0000000000099802 <+18>:    mov    %esi,%edx
   0x0000000000099804 <+20>:    xor    %eax,%eax
   0x0000000000099806 <+22>:    mov    %edi,%esi
   0x0000000000099808 <+24>:    mov    $0x1aa,%edi
   0x000000000009980d <+29>:    call   0xe810 <syscall@plt>
   0x0000000000099812 <+34>:    add    $0x18,%rsp
   0x0000000000099816 <+38>:    ret

While clang emits this:

   0x00000000000a31f0 <+0>:     push   %rax
   0x00000000000a31f1 <+1>:     mov    %ecx,%r8d
   0x00000000000a31f4 <+4>:     mov    %edx,%ecx
   0x00000000000a31f6 <+6>:     mov    %esi,%edx
   0x00000000000a31f8 <+8>:     mov    %edi,%esi
   0x00000000000a31fa <+10>:    movl   $0x0,(%rsp)
   0x00000000000a3201 <+17>:    mov    $0x1aa,%edi
   0x00000000000a3206 <+22>:    xor    %r9d,%r9d
   0x00000000000a3209 <+25>:    xor    %eax,%eax
   0x00000000000a320b <+27>:    call   0xd780 <syscall@plt>
   0x00000000000a3210 <+32>:    pop    %rcx
   0x00000000000a3211 <+33>:    ret

I'm guessing that valgrind gets confused by that movl $0x0,(%rsp) for the sigsz argument somehow.

bnoordhuis added a commit to bnoordhuis/libuv that referenced this pull request Apr 19, 2023
Fix a valgrind warning that only manifested with clang (not gcc!) by
explicitly passing 0L instead of plain 0 as the |sigsz| argument to
io_uring_enter(). That is, pass a long instead of an int.

On x86_64, |sigsz| is passed on the stack (the other arguments are
passed in registers) but where gcc emits a `push $0` that zeroes the
entire stack slot, clang emits a `movl $0,(%rsp)` that leaves the upper
32 bits untouched.

It's academic though since we don't pass IORING_ENTER_EXT_ARG and the
kernel therefore completely ignores the argument.

Refs: libuv#3952
targos pushed a commit to nodejs/node that referenced this pull request Jun 4, 2023
- linux: introduce io_uring support libuv/libuv#3952
- src: add new metrics APIs libuv/libuv#3749
- unix,win: give thread pool threads an 8 MB stack libuv/libuv#3787
- win,unix: change execution order of timers libuv/libuv#3927

Fixes: #43931
Fixes: #42496
Fixes: #47715
Fixes: #47259
Fixes: #47241
PR-URL: #48078
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Mohammed Keyvanzadeh <[email protected]>
Reviewed-By: Debadree Chatterjee <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Jiawen Geng <[email protected]>
@GazHank
Copy link

GazHank commented Jul 19, 2023

Hi @bnoordhuis is there a simple way to test disabling this change per your previous suggestion of a UV_USE_IO_URING=0 environment variable or compilation setting?

I'm hoping to rule out this change as the cause of a defect, so being able to quickly enable / disable the functionality would be very helpful

@bnoordhuis
Copy link
Member Author

@GazHank UV_USE_IO_URING=0 still exists for now. If you're using a 5.x kernel: we landed some follow-up fixes, not all of them released yet.

@GazHank
Copy link

GazHank commented Jul 19, 2023

thanks @bnoordhuis yes, I'm seeing some issues on ubuntu 22.04.2 LTS ( 5.19.0-46-generic )

So far I've managed to confirm that the issues only show up when upgrading from libuv 1.44 to 1.45, but setting an environment variable of UV_USE_IO_URING=0 doesn't seem to make any difference. Perhaps it is a different change from 1.45 that is causing problems, but I don't see any other likely causes within the notables changes list (as windows is entirely unaffected).

Please let me know if there is an existing issue you would like me to add info to, or if you would prefer that I raise a new issue entirely. Of course if there are some draft changes you would like me to test to see if that solves the problem then please let me know, I'd be happy to help test the changes.

@bnoordhuis
Copy link
Member Author

You can try with the HEAD of the v1.x branch. If that doesn't fix it, it'd be helpful if you could run git bisect on the commits between 1.44 and 1.45 and open an issue with as much details as possible.

@GazHank
Copy link

GazHank commented Jul 22, 2023

Thanks @bnoordhuis Unfortunately I can confirm that it is this commit that causes the break :-(

Additionally, setting UV_USE_IO_URING=0 doesn't seem to make any difference, however amending the kernel version specified within linux.c file to a higher version than I'm running does fix/workaround the issue

libuv/src/unix/linux.c

Lines 431 to 443 in d09441c

use = atomic_load_explicit(&use_io_uring, memory_order_relaxed);
if (use == 0) {
/* Older kernels have a bug where the sqpoll thread uses 100% CPU. */
use = uv__kernel_version() >= /* 5.10.186 */ 0x050ABA ? 1 : -1;
/* But users can still enable it if they so desire. */
val = getenv("UV_USE_IO_URING");
if (val != NULL)
use = atoi(val) ? 1 : -1;
atomic_store_explicit(&use_io_uring, use, memory_order_relaxed);
}

As such, is it perhaps that UV_USE_IO_URING=1 can be use to enable the feature, but UV_USE_IO_URING=0 cannot be used to disable it?

I'll raise a new issues and try to include as much info as I can

@bnoordhuis
Copy link
Member Author

UV_USE_IO_URING=0 is a hard disable. You should be able to verify with strace that no io_uring system calls are made.

Ceres6 pushed a commit to Ceres6/node that referenced this pull request Aug 14, 2023
- linux: introduce io_uring support libuv/libuv#3952
- src: add new metrics APIs libuv/libuv#3749
- unix,win: give thread pool threads an 8 MB stack libuv/libuv#3787
- win,unix: change execution order of timers libuv/libuv#3927

Fixes: nodejs#43931
Fixes: nodejs#42496
Fixes: nodejs#47715
Fixes: nodejs#47259
Fixes: nodejs#47241
PR-URL: nodejs#48078
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Mohammed Keyvanzadeh <[email protected]>
Reviewed-By: Debadree Chatterjee <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Jiawen Geng <[email protected]>
Ceres6 pushed a commit to Ceres6/node that referenced this pull request Aug 14, 2023
- linux: introduce io_uring support libuv/libuv#3952
- src: add new metrics APIs libuv/libuv#3749
- unix,win: give thread pool threads an 8 MB stack libuv/libuv#3787
- win,unix: change execution order of timers libuv/libuv#3927

Fixes: nodejs#43931
Fixes: nodejs#42496
Fixes: nodejs#47715
Fixes: nodejs#47259
Fixes: nodejs#47241
PR-URL: nodejs#48078
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Mohammed Keyvanzadeh <[email protected]>
Reviewed-By: Debadree Chatterjee <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Jiawen Geng <[email protected]>
arsnyder16 pushed a commit to arsnyder16/node that referenced this pull request Sep 10, 2023
- linux: introduce io_uring support libuv/libuv#3952
- src: add new metrics APIs libuv/libuv#3749
- unix,win: give thread pool threads an 8 MB stack libuv/libuv#3787
- win,unix: change execution order of timers libuv/libuv#3927

Fixes: nodejs#43931
Fixes: nodejs#42496
Fixes: nodejs#47715
Fixes: nodejs#47259
Fixes: nodejs#47241
PR-URL: nodejs#48078
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Mohammed Keyvanzadeh <[email protected]>
Reviewed-By: Debadree Chatterjee <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Jiawen Geng <[email protected]>
ruyadorno pushed a commit to nodejs/node that referenced this pull request Sep 11, 2023
- linux: introduce io_uring support libuv/libuv#3952
- src: add new metrics APIs libuv/libuv#3749
- unix,win: give thread pool threads an 8 MB stack libuv/libuv#3787
- win,unix: change execution order of timers libuv/libuv#3927

Fixes: #43931
Fixes: #42496
Fixes: #47715
Fixes: #47259
Fixes: #47241
PR-URL: #48078
Backport-PR-URL: #49591
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Mohammed Keyvanzadeh <[email protected]>
Reviewed-By: Debadree Chatterjee <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Jiawen Geng <[email protected]>
ruyadorno pushed a commit to nodejs/node that referenced this pull request Sep 13, 2023
- linux: introduce io_uring support libuv/libuv#3952
- src: add new metrics APIs libuv/libuv#3749
- unix,win: give thread pool threads an 8 MB stack libuv/libuv#3787
- win,unix: change execution order of timers libuv/libuv#3927

Fixes: #43931
Fixes: #42496
Fixes: #47715
Fixes: #47259
Fixes: #47241
PR-URL: #48078
Backport-PR-URL: #49591
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Mohammed Keyvanzadeh <[email protected]>
Reviewed-By: Debadree Chatterjee <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Jiawen Geng <[email protected]>
ruyadorno pushed a commit to nodejs/node that referenced this pull request Sep 17, 2023
- linux: introduce io_uring support libuv/libuv#3952
- src: add new metrics APIs libuv/libuv#3749
- unix,win: give thread pool threads an 8 MB stack libuv/libuv#3787
- win,unix: change execution order of timers libuv/libuv#3927

Fixes: #43931
Fixes: #42496
Fixes: #47715
Fixes: #47259
Fixes: #47241
PR-URL: #48078
Backport-PR-URL: #49591
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Mohammed Keyvanzadeh <[email protected]>
Reviewed-By: Debadree Chatterjee <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Jiawen Geng <[email protected]>
@wswsmao
Copy link

wswsmao commented Mar 6, 2024

trevnorris

Did some basic benchmarking, and it's giving me > 8x performance when reading small chunks out of /dev/zero. So great news on that side.

valgrind isn't happy about a few tests. Mostly with the error Syscall param io_uring_enter(sigsz) contains uninitialised byte(s). Simplest repro I could make is this:

#include <uv.h>

void on_read(uv_fs_t *req) { }

int main() {
  uv_fs_t read_req;
  uv_buf_t buffer;
  int fd;
  char bytes[64];

  buffer = uv_buf_init(bytes, 64);
  fd = uv_fs_open(NULL, &read_req, "/dev/zero", UV_FS_O_RDONLY, 0, NULL);
  uv_fs_read(uv_default_loop(), &read_req, fd, &buffer, 1, 0, on_read);
  uv_run(uv_default_loop(), UV_RUN_DEFAULT);
  uv_fs_close(NULL, &read_req, fd, NULL);

  uv_fs_req_cleanup(&read_req);
  uv_loop_close(uv_default_loop());
  return 0;
}

Hi, @trevnorris , I want to evaluate the performance improvements brought by this feature. I was wouldering is there any performance benchmarking available?

@wswsmao
Copy link

wswsmao commented Mar 12, 2024

trevnorris

Did some basic benchmarking, and it's giving me > 8x performance when reading small chunks out of /dev/zero. So great news on that side.
valgrind isn't happy about a few tests. Mostly with the error Syscall param io_uring_enter(sigsz) contains uninitialised byte(s). Simplest repro I could make is this:

#include <uv.h>

void on_read(uv_fs_t *req) { }

int main() {
  uv_fs_t read_req;
  uv_buf_t buffer;
  int fd;
  char bytes[64];

  buffer = uv_buf_init(bytes, 64);
  fd = uv_fs_open(NULL, &read_req, "/dev/zero", UV_FS_O_RDONLY, 0, NULL);
  uv_fs_read(uv_default_loop(), &read_req, fd, &buffer, 1, 0, on_read);
  uv_run(uv_default_loop(), UV_RUN_DEFAULT);
  uv_fs_close(NULL, &read_req, fd, NULL);

  uv_fs_req_cleanup(&read_req);
  uv_loop_close(uv_default_loop());
  return 0;
}

Hi, @trevnorris , I want to evaluate the performance improvements brought by this feature. I was wouldering is there any performance benchmarking available?

Hi, @bnoordhuis , I was wouldering is there any performance benchmarking available?

sgunin pushed a commit to sgunin/oe-openembedded-core-contrib that referenced this pull request Mar 17, 2024
* pseudo doesn't support io_uring yet as shown after nodejs was upgraded
  and nodejs-native >= 20.3.0 with libuv >= 1.45.0 which has:
  libuv/libuv#3952

* files created in do_install with nodejs-native aren't tracked by pseudo
  and will result in host-user-contamination QA issue or
  "KeyError: 'getpwuid(): uid not found" as documented in:
  shr-project/com.webos.app.minimal@bd23804

* this is much simpler test for io_uring without the need to build whole
  nodejs-native, it's based on:
  https://unixism.net/2020/04/io-uring-by-example-part-1-introduction/
  just using writev instead of readv

* if it works fine, the file "test" will be tracked in pseudo database
  since the creation in ${D} like:
  core2-64-oe-linux/io-uring-writev/1.0 $ sqlite3 pseudo/files.db "select * from files"
  1|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image|66305|48357743|0|0|16877|0|0
  2|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/test|66305|48316709|0|0|33188|0|0

  and it does in this case, because I haven't figured out how to call writev()
  without opening the fd of output file first where the openat() call gets
  intercepted by pseudo

  io-uring-writev/1.0 $ strace -v ./io-uring-writev test2 2>&1 | grep openat
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "test2", O_WRONLY|O_CREAT, 0666) = 4

  while with libuv there was no openat() for the output files in strace

* add test with libuv as well and surprisingly it's still working in current pseudo
  oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0 $ sqlite3 pseudo/files.db "select * from files"
  1|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image|66305|62072917|0|0|16877|0|0
  2|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/test|66305|62061857|0|0|33188|0|0
  3|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/task-copy.h|66305|62061858|0|0|33188|0|0

Signed-off-by: Martin Jansa <[email protected]>
sgunin pushed a commit to sgunin/oe-openembedded-core-contrib that referenced this pull request Mar 17, 2024
* pseudo doesn't support io_uring yet as shown after nodejs was upgraded
  and nodejs-native >= 20.3.0 with libuv >= 1.45.0 which has:
  libuv/libuv#3952

* files created in do_install with nodejs-native aren't tracked by pseudo
  and will result in host-user-contamination QA issue or
  "KeyError: 'getpwuid(): uid not found" as documented in:
  shr-project/com.webos.app.minimal@bd23804

* this is much simpler test for io_uring without the need to build whole
  nodejs-native, it's based on:
  https://unixism.net/2020/04/io-uring-by-example-part-1-introduction/
  just using writev instead of readv

* if it works fine, the file "test" will be tracked in pseudo database
  since the creation in ${D} like:
  core2-64-oe-linux/io-uring-writev/1.0 $ sqlite3 pseudo/files.db "select * from files"
  1|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image|66305|48357743|0|0|16877|0|0
  2|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/test|66305|48316709|0|0|33188|0|0

  and it does in this case, because I haven't figured out how to call writev()
  without opening the fd of output file first where the openat() call gets
  intercepted by pseudo

  io-uring-writev/1.0 $ strace -v ./io-uring-writev test2 2>&1 | grep openat
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "test2", O_WRONLY|O_CREAT, 0666) = 4

  while with libuv there was no openat() for the output files in strace

* add test with libuv as well and surprisingly it's still working in current pseudo
  oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0 $ sqlite3 pseudo/files.db "select * from files"
  1|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image|66305|62072917|0|0|16877|0|0
  2|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/test|66305|62061857|0|0|33188|0|0
  3|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/task-copy.h|66305|62061858|0|0|33188|0|0

Signed-off-by: Martin Jansa <[email protected]>
sgunin pushed a commit to sgunin/oe-openembedded-core-contrib that referenced this pull request Mar 17, 2024
* pseudo doesn't support io_uring yet as shown after nodejs was upgraded
  and nodejs-native >= 20.3.0 with libuv >= 1.45.0 which has:
  libuv/libuv#3952

* files created in do_install with nodejs-native aren't tracked by pseudo
  and will result in host-user-contamination QA issue or
  "KeyError: 'getpwuid(): uid not found" as documented in:
  shr-project/com.webos.app.minimal@bd23804

* this is much simpler test for io_uring without the need to build whole
  nodejs-native, it's based on:
  https://unixism.net/2020/04/io-uring-by-example-part-1-introduction/
  just using writev instead of readv

* if it works fine, the file "test" will be tracked in pseudo database
  since the creation in ${D} like:
  core2-64-oe-linux/io-uring-writev/1.0 $ sqlite3 pseudo/files.db "select * from files"
  1|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image|66305|48357743|0|0|16877|0|0
  2|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/test|66305|48316709|0|0|33188|0|0

  and it does in this case, because I haven't figured out how to call writev()
  without opening the fd of output file first where the openat() call gets
  intercepted by pseudo

  io-uring-writev/1.0 $ strace -v ./io-uring-writev test2 2>&1 | grep openat
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "test2", O_WRONLY|O_CREAT, 0666) = 4

  while with libuv there was no openat() for the output files in strace

Signed-off-by: Martin Jansa <[email protected]>
shr-project added a commit to shr-distribution/oe-core that referenced this pull request May 30, 2024
* pseudo doesn't support io_uring yet as shown after nodejs was upgraded
  and nodejs-native >= 20.3.0 with libuv >= 1.45.0 which has:
  libuv/libuv#3952

* files created in do_install with nodejs-native aren't tracked by pseudo
  and will result in host-user-contamination QA issue or
  "KeyError: 'getpwuid(): uid not found" as documented in:
  shr-project/com.webos.app.minimal@bd23804

* this is much simpler test for io_uring without the need to build whole
  nodejs-native, it's based on:
  https://unixism.net/2020/04/io-uring-by-example-part-1-introduction/
  just using writev instead of readv

* if it works fine, the file "test" will be tracked in pseudo database
  since the creation in ${D} like:
  core2-64-oe-linux/io-uring-writev/1.0 $ sqlite3 pseudo/files.db "select * from files"
  1|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image|66305|48357743|0|0|16877|0|0
  2|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/test|66305|48316709|0|0|33188|0|0

  and it does in this case, because I haven't figured out how to call writev()
  without opening the fd of output file first where the openat() call gets
  intercepted by pseudo

  io-uring-writev/1.0 $ strace -v ./io-uring-writev test2 2>&1 | grep openat
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "test2", O_WRONLY|O_CREAT, 0666) = 4

  while with libuv there was no openat() for the output files in strace

* add test with libuv as well and surprisingly it's still working in current pseudo
  oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0 $ sqlite3 pseudo/files.db "select * from files"
  1|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image|66305|62072917|0|0|16877|0|0
  2|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/test|66305|62061857|0|0|33188|0|0
  3|/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/io-uring-writev/1.0/image/task-copy.h|66305|62061858|0|0|33188|0|0

Signed-off-by: Martin Jansa <[email protected]>
liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Jul 23, 2025
Fix a valgrind warning that only manifested with clang (not gcc!) by
explicitly passing 0L instead of plain 0 as the |sigsz| argument to
io_uring_enter(). That is, pass a long instead of an int.

On x86_64, |sigsz| is passed on the stack (the other arguments are
passed in registers) but where gcc emits a `push $0` that zeroes the
entire stack slot, clang emits a `movl $0,(%rsp)` that leaves the upper
32 bits untouched.

It's academic though since we don't pass IORING_ENTER_EXT_ARG and the
kernel therefore completely ignores the argument.

Refs: libuv/libuv#3952
memory_order_release);

flags = atomic_load_explicit((_Atomic uint32_t*) iou->sqflags,
memory_order_acquire);
Copy link

@heiher heiher Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines of code are equivalent to:

  atomic_thread_fence(memory_order_release);
  iou->sqtail = *iou->sqtail + 1;
  flags = iou->sqflags;
  atomic_thread_fence(memory_order_acquire);

It seems impossible to prevent the load of sqflags from being reordered before the store to sqtail. e.g.

  flags = iou->sqflags;
  ...
  atomic_thread_fence(memory_order_release);
  iou->sqtail = *iou->sqtail + 1;
  atomic_thread_fence(memory_order_acquire);

I suggest the following change:

  atomic_store_explicit((_Atomic uint32_t*) iou->sqtail,
                        *iou->sqtail + 1,
                        memory_order_release);
  atomic_thread_fence(memory_order_seq_cst);

  if (iou->sqflags & UV__IORING_SQ_NEED_WAKEUP)

equivalent to:

  atomic_thread_fence(memory_order_release);
  iou->sqtail = *iou->sqtail + 1;
  atomic_thread_fence(memory_order_seq_cst);

  if (iou->sqflags & UV__IORING_SQ_NEED_WAKEUP)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ref: https://github.com/torvalds/linux/blob/691d401c7e0e5ea34ac6f8151bc0696db1b2500a/io_uring/io_uring.c#L25-L28

 * When using the SQ poll thread (IORING_SETUP_SQPOLL), the application
 * needs to check the SQ flags for IORING_SQ_NEED_WAKEUP *after*
 * updating the SQ tail; a full memory barrier smp_mb() is needed
 * between.

liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Dec 16, 2025
Fix a valgrind warning that only manifested with clang (not gcc!) by
explicitly passing 0L instead of plain 0 as the |sigsz| argument to
io_uring_enter(). That is, pass a long instead of an int.

On x86_64, |sigsz| is passed on the stack (the other arguments are
passed in registers) but where gcc emits a `push $0` that zeroes the
entire stack slot, clang emits a `movl $0,(%rsp)` that leaves the upper
32 bits untouched.

It's academic though since we don't pass IORING_ENTER_EXT_ARG and the
kernel therefore completely ignores the argument.

Refs: libuv/libuv#3952
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants