@ioquatix ioquatix commented Nov 7, 2024

https://bugs.ruby-lang.org/issues/20902

It turns out that memmove can take a long time for large buffers, which is hardly surprising.

Here is a script you can use for experimentation:

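SIZE = 10 * 1024 * 1024 # 10 MiB per buffer

# Allocate two buffers and copy one into the other.
def copy_buffers(size = SIZE)
  source = IO::Buffer.new(size)
  destination = IO::Buffer.new(size)

  # IO::Buffer#copy copies from its argument into the receiver.
  destination.copy(source)
end

start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Run 8 copies in separate threads; they can only overlap if the copy releases the GVL.
threads = 8.times.map do
  Thread.new { copy_buffers }
end

threads.each(&:join)

duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time

puts "Duration: #{duration} seconds"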

Again, we will have to hard-code some magic number or heuristic (or compute it at startup) to determine when releasing the GVL makes sense. Maybe anything bigger than X pages?
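To make the "bigger than X pages" idea concrete, here is a minimal sketch of such a size check. Everything in it is an assumption for illustration: the names `ASSUMED_PAGE_SIZE`, `THRESHOLD_PAGES`, and `release_gvl_for_copy?` are hypothetical, the page size is assumed to be 4096 bytes, and the value chosen for X is a placeholder that would have to be measured.

```ruby
# Assumed page size in bytes; a real implementation would query the system.
ASSUMED_PAGE_SIZE = 4096

# Hypothetical value for "X": copies of at least this many pages release the GVL.
THRESHOLD_PAGES = 256

# Returns true when a copy of `length` bytes is large enough that
# releasing the GVL around it is assumed to be worthwhile.
def release_gvl_for_copy?(length, page_size: ASSUMED_PAGE_SIZE, threshold_pages: THRESHOLD_PAGES)
  length >= page_size * threshold_pages
end

puts release_gvl_for_copy?(10 * 1024 * 1024) # 10 MiB copy
puts release_gvl_for_copy?(4096)             # single-page copy
```

With these placeholder values the cutoff is 1 MiB, so the 10 MiB copy from the script above would release the GVL and a single-page copy would not.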

ioquatix commented Nov 7, 2024

Another idea is to introduce rb_memmove (and maybe rb_memcpy), which could then be used everywhere and would release the GVL for copies above a certain size.

@ioquatix ioquatix force-pushed the io-buffer-memmove-nogvl branch from 2ad4a25 to e42c666 Compare November 7, 2024 03:07
@ioquatix ioquatix force-pushed the io-buffer-memmove-nogvl branch 2 times, most recently from 0bee6ed to 270c616 Compare November 20, 2024 07:08
@ioquatix

I measured the difference:

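| GVL | Threads | Buffer Size (MB) | Total Duration | Throughput (MB/s) |
|-----|---------|------------------|----------------|-------------------|
| Yes |       1 |                1 |         0.12ms |           8393.09 |
| Yes |       1 |                5 |         0.51ms |            9857.7 |
| Yes |       1 |               10 |         1.12ms |           8937.54 |
| Yes |       1 |               20 |         2.22ms |           9015.95 |
| Yes |       2 |                1 |         0.24ms |           8307.07 |
| Yes |       2 |                5 |         1.13ms |           8819.58 |
| Yes |       2 |               10 |         1.49ms |          13385.35 |
| Yes |       2 |               20 |         5.63ms |            7110.8 |
| Yes |       4 |                1 |         0.92ms |           4360.18 |
| Yes |       4 |                5 |         2.08ms |           9606.58 |
| Yes |       4 |               10 |         4.51ms |           8863.13 |
| Yes |       4 |               20 |          9.3ms |           8601.41 |
| Yes |       8 |                1 |         1.22ms |           6574.93 |
| Yes |       8 |                5 |         3.56ms |          11239.27 |
| Yes |       8 |               10 |         7.31ms |          10943.68 |
| Yes |       8 |               20 |        15.57ms |          10274.99 |
| Yes |      16 |                1 |         1.95ms |           8220.16 |
| Yes |      16 |                5 |         5.51ms |          14518.05 |
| Yes |      16 |               10 |        13.77ms |          11618.96 |
| Yes |      16 |               20 |        27.21ms |          11759.43 |
| Yes |      32 |                1 |         3.24ms |           9891.05 |
| Yes |      32 |                5 |        11.42ms |          14007.41 |
| Yes |      32 |               10 |        21.64ms |          14786.48 |
| Yes |      32 |               20 |        45.52ms |          14060.25 |
|  No |       1 |                1 |         0.13ms |           7582.85 |
|  No |       1 |                5 |         0.44ms |          11248.55 |
|  No |       1 |               10 |         1.11ms |           9029.91 |
|  No |       1 |               20 |         2.43ms |           8228.42 |
|  No |       2 |                1 |         0.18ms |          11245.61 |
|  No |       2 |                5 |         0.96ms |          10396.76 |
|  No |       2 |               10 |          1.9ms |          10501.59 |
|  No |       2 |               20 |         3.16ms |          12656.77 |
|  No |       4 |                1 |         0.69ms |           5827.76 |
|  No |       4 |                5 |         1.15ms |          17440.54 |
|  No |       4 |               10 |         2.31ms |          17307.79 |
|  No |       4 |               20 |         4.11ms |          19483.68 |
|  No |       8 |                1 |         0.67ms |           11954.1 |
|  No |       8 |                5 |          1.3ms |          30713.68 |
|  No |       8 |               10 |         2.05ms |          38990.98 |
|  No |       8 |               20 |         4.15ms |          38552.37 |
|  No |      16 |                1 |         0.96ms |          16698.03 |
|  No |      16 |                5 |         1.46ms |          54782.47 |
|  No |      16 |               10 |         2.74ms |          58295.64 |
|  No |      16 |               20 |         4.89ms |          65482.43 |
|  No |      32 |                1 |         1.82ms |          17554.27 |
|  No |      32 |                5 |         2.68ms |          59673.59 |
|  No |      32 |               10 |         3.87ms |          82733.34 |
|  No |      32 |               20 |         6.93ms |          92297.47 |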
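One way to read these numbers is as a speedup ratio per thread count. The sketch below takes the 20 MB durations for 1, 8, and 32 threads from the measurements above and divides the "Yes" duration by the "No" duration. The constant and method names are made up for this example.

```ruby
# Durations in milliseconds for the 20 MB rows, copied from the measurements above.
WITH_GVL    = { 1 => 2.22, 8 => 15.57, 32 => 45.52 }
WITHOUT_GVL = { 1 => 2.43, 8 => 4.15, 32 => 6.93 }

# Ratio of the GVL=Yes duration to the GVL=No duration, rounded to two decimals.
def speedup(threads)
  (WITH_GVL.fetch(threads) / WITHOUT_GVL.fetch(threads)).round(2)
end

[1, 8, 32].each do |threads|
  puts "#{threads} threads: #{speedup(threads)}x"
end
# Prints:
# 1 threads: 0.91x
# 8 threads: 3.75x
# 32 threads: 6.57x
```

On these rows a single thread sees no benefit (slightly slower), while the gain grows with the number of threads copying concurrently.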

Code

puts "| GVL | Threads | Buffer Size | Total Duration | Throughput (MB/s) |"
puts "|-----|---------|-------------|----------------|-------------------|"

[1, 2, 4, 8, 16, 32].each do |thread_count|
  [1, 5, 10, 20].each do |buffer_size|
    # Buffer size in bytes (buffer_size is in MiB).
    size = 1024*1024*(buffer_size)

    source = IO::Buffer.new(size)
    destinations = thread_count.times.map{IO::Buffer.new(size)}

    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)

    # Repeat 10 times and average the duration below.
    10.times do
      threads = (thread_count).times.map do |index|
        # Note: IO::Buffer#copy copies from its argument into the receiver,
        # so each thread copies destinations[index] into the shared source buffer.
        Thread.new{source.copy(destinations[index])}
      end

      threads.each(&:join)
    end

    # Average seconds per iteration.
    duration = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time) / 10
    # Total bytes copied per iteration, per second, expressed in MiB.
    throughput = (size * thread_count / duration) / 1024**2

    duration_ms = duration * 1000

    # This only labels the row; it does not change how the copy runs.
    gvl = ENV.fetch('GVL', 'Yes')

    puts "| #{gvl.rjust(3)} | #{thread_count.to_s.rjust(7)} | #{buffer_size.to_s.rjust(11)} | #{duration_ms.round(2).to_s.rjust(12)}ms | #{throughput.round(2).to_s.rjust(17)} |"
  end
end
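For reference, the throughput column is the total number of bytes copied by all threads in one iteration, divided by the average iteration time and scaled to units of 1024**2 bytes. The sketch below isolates that arithmetic. The method name `throughput_mib_per_second` is hypothetical, and the inputs are made-up round numbers rather than measured values.

```ruby
# Bytes copied per second across all threads, in units of 1024**2 bytes.
# `size` is the per-thread buffer size in bytes, `duration` is in seconds.
def throughput_mib_per_second(size, thread_count, duration)
  (size * thread_count / duration) / 1024**2
end

# Made-up example: 4 threads each copying 1 MiB in 0.5 seconds.
puts throughput_mib_per_second(1024 * 1024, 4, 0.5) # prints 8.0
```

Because the duration is the wall-clock time for all threads to finish, the figure only rises with thread count when the copies actually overlap.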

@ioquatix ioquatix force-pushed the io-buffer-memmove-nogvl branch from 270c616 to 054cd79 Compare November 20, 2024 07:21
@ioquatix ioquatix marked this pull request as ready for review November 20, 2024 07:21
@ioquatix ioquatix force-pushed the io-buffer-memmove-nogvl branch from c82d9b6 to b485f1c Compare November 20, 2024 07:34
@ioquatix ioquatix force-pushed the io-buffer-memmove-nogvl branch from b485f1c to b4c00a0 Compare November 20, 2024 07:44
@ioquatix ioquatix merged commit 3c0b09a into ruby:master Nov 20, 2024
69 checks passed
@ioquatix ioquatix deleted the io-buffer-memmove-nogvl branch November 20, 2024 08:27
@ioquatix ioquatix added this to the v3.4.0 milestone Nov 24, 2024