gem5 se.py user mode running userland/posix/pthread_self.c with 2 or more pthreads fails with "simulate() limit reached" #81

@cirosantilli

Fix proposed at: https://gem5-review.googlesource.com/c/public/gem5/+/21606

At lkmc 3d0cc60 and gem5/gem5@08c79a1, the file https://github.com/cirosantilli/linux-kernel-module-cheat/blob/3d0cc6014baa6dddb4dfa25b656ba37c5e1540d1/userland/posix/pthread_self.c should spawn 2 trivial threads that just print their IDs:

./run --arch aarch64 --cpus 3 --emulator gem5 --static \
  --userland userland/posix/pthread_self.c --userland-args 2

However, simulation fails with:

Exiting @ tick 18446744073709551615 because simulate() limit reached

It works if we spawn just one thread:

./run --arch aarch64 --cpus 2 --emulator gem5 --static \
  --userland userland/posix/pthread_self.c --userland-args 1 \
  --trace-stdout \
  --trace ExecAll,SyscallBase,SyscallVerbose

The failure is flaky: if you simplify the program slightly, e.g. by removing useless bits like the printfs in the threads, it starts working again. However, increasing the number of cores often makes it fail again.

With:

--trace SyscallBase,SyscallVerbose | grep  -E 'futex|clone|exit'

we see:

4607500: system.cpu0: T0 : syscall clone called w/arguments 4001536, 274877901216, 274877903200, 274877904768, 274877903200, 274877904768
4607500: system.cpu0: T0 : syscall clone returns 101
4787000: system.cpu1: T0 : syscall futex called w/arguments 4805768, 128, 2, 0, 4805768, 2
4787000: system.cpu1: T0 : syscall futex returns 0
5222500: system.cpu0: T0 : syscall futex called w/arguments 4805768, 129, 1, 0, 4578924, 4817799
5222500: system.cpu0: T0 : syscall futex returns 1
5719000: system.cpu0: T0 : syscall clone called w/arguments 4001536, 274869508512, 274869510496, 274869512064, 274869510496, 274869512064
5719000: system.cpu0: T0 : syscall clone returns 102
5840500: system.cpu0: T0 : syscall futex called w/arguments 4805768, 128, 2, 0, 4805768, 274869512064
5840500: system.cpu0: T0 : syscall futex returns 0
5895000: system.cpu2: T0 : syscall futex called w/arguments 4805768, 128, 2, 0, 4805768, 21824
5895000: system.cpu2: T0 : syscall futex returns 0
5982000: system.cpu1: T0 : syscall futex called w/arguments 4805768, 129, 1, 0, 4578773, 4817852
5982000: system.cpu1: T0 : syscall futex returns 1
5985000: system.cpu0: T0 : syscall futex called w/arguments 4805768, 128, 2, 0, 4805768, 2
5985000: system.cpu0: T0 : syscall futex returns 0
6081500: system.cpu1: T0 : syscall exit called w/arguments 0, 93, 4, 0, 274877902992, 0
6081500: system.cpu1: T0 : syscall exit returns 0
6743500: system.cpu2: T0 : syscall futex called w/arguments 4805768, 129, 1, 0, 4578773, 4817905
6743500: system.cpu2: T0 : syscall futex returns 1
6843000: system.cpu2: T0 : syscall exit called w/arguments 0, 93, 4, 0, 274869510288, 0
6843000: system.cpu2: T0 : syscall exit returns 0

Then, from the Linux kernel's include/uapi/linux/futex.h:

#define FUTEX_WAIT		0
#define FUTEX_WAKE		1
#define FUTEX_PRIVATE_FLAG	128

So the address is 4805768 for all calls, 128 is WAIT and 129 is WAKE (both with FUTEX_PRIVATE_FLAG set), all timeouts are NULL, and uaddr2 and val3 are ignored for both calls.
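
The op decoding above can be checked with a small script. This is only a sketch: the first three constants are the ones quoted above, while FUTEX_CLOCK_REALTIME and FUTEX_CMD_MASK are taken from the same kernel header and are not shown in this report. The helper name decode_futex_op is made up for illustration.

```python
# Constants from the kernel's futex header. The first three are quoted in
# this report; the last two come from the same header.
FUTEX_WAIT = 0
FUTEX_WAKE = 1
FUTEX_PRIVATE_FLAG = 128
FUTEX_CLOCK_REALTIME = 256
FUTEX_CMD_MASK = ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)

COMMAND_NAMES = {FUTEX_WAIT: "WAIT", FUTEX_WAKE: "WAKE"}


def decode_futex_op(op):
    """Split a futex op argument into (command name, private flag)."""
    cmd = op & FUTEX_CMD_MASK
    private = bool(op & FUTEX_PRIVATE_FLAG)
    # Commands other than WAIT/WAKE are reported by number only.
    return COMMAND_NAMES.get(cmd, "cmd %d" % cmd), private


# The two op values that appear in the trace.
for op in (128, 129):
    print(op, decode_futex_op(op))
```

For the two values seen in the trace, this reports WAIT and WAKE respectively, both with the private flag set.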

The return value of all WAIT calls was 0, which means "Returns 0 if the caller was woken up.", and that of all WAKE calls was 1, which means 1 caller was woken up.

All WAKE calls have val 1, which means wake up at most one thread. All WAIT calls have val 2, which for all calls matches *4805768.
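
For reference, the WAIT/WAKE semantics described above can be modeled with a toy wait queue. This is a rough sketch of the documented futex behavior, not of gem5's or the kernel's implementation: the class ToyFutex is hypothetical, waiters are woken in FIFO order purely for simplicity, and the thread IDs "a" and "b" are placeholders.

```python
from collections import deque


class ToyFutex:
    """Toy model of one futex word and its wait queue (illustration only)."""

    def __init__(self, value):
        self.value = value      # stands in for the word at uaddr
        self.waiters = deque()  # IDs of threads blocked in WAIT

    def wait(self, tid, expected):
        # WAIT only blocks if the word still holds the expected value.
        if self.value != expected:
            return "EAGAIN"
        self.waiters.append(tid)
        return "blocked"

    def wake(self, count):
        # WAKE releases at most `count` waiters and reports which ones.
        # FIFO order is an assumption of this toy model.
        woken = []
        while self.waiters and len(woken) < count:
            woken.append(self.waiters.popleft())
        return woken


futex = ToyFutex(value=2)
print(futex.wait("a", expected=2))  # blocked
print(futex.wait("b", expected=2))  # blocked
print(futex.wake(1))                # ['a']
print(list(futex.waiters))          # ['b']
```

In this model a WAKE with val 1 releases exactly one of two blocked waiters, and the other stays queued until a later WAKE.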

So the entire execution can be summarized as:

0: clone
1: futex WAIT
0: futex WAKE
0: clone
0: futex WAIT
2: futex WAIT
1: futex WAKE
0: futex WAIT
1: exit
2: futex WAKE
2: exit
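
A summary like the one above can be produced mechanically from the SyscallVerbose output. The script below is a minimal sketch that assumes the line format shown in the trace earlier in this report; the function name summarize and the SAMPLE excerpt (the first six trace lines) are only for illustration.

```python
import re

FUTEX_PRIVATE_FLAG = 128
FUTEX_COMMANDS = {0: "WAIT", 1: "WAKE"}

# Matches only the "called w/arguments" lines of the SyscallVerbose trace.
CALL_RE = re.compile(
    r"system\.cpu(\d+): T\d+ : syscall (\w+) called w/arguments (.*)"
)


def summarize(lines):
    """Reduce trace lines to 'cpu: syscall' entries, decoding futex ops."""
    summary = []
    for line in lines:
        match = CALL_RE.search(line)
        if match is None:
            continue  # skip "returns" lines and anything unrelated
        cpu, name, args = match.groups()
        if name == "futex":
            # The second argument is the futex op.
            op = int(args.split(",")[1])
            cmd = op & ~FUTEX_PRIVATE_FLAG
            name = "futex " + FUTEX_COMMANDS.get(cmd, str(cmd))
        summary.append("%s: %s" % (cpu, name))
    return summary


SAMPLE = """\
4607500: system.cpu0: T0 : syscall clone called w/arguments 4001536, 274877901216, 274877903200, 274877904768, 274877903200, 274877904768
4607500: system.cpu0: T0 : syscall clone returns 101
4787000: system.cpu1: T0 : syscall futex called w/arguments 4805768, 128, 2, 0, 4805768, 2
4787000: system.cpu1: T0 : syscall futex returns 0
5222500: system.cpu0: T0 : syscall futex called w/arguments 4805768, 129, 1, 0, 4578924, 4817799
5222500: system.cpu0: T0 : syscall futex returns 1
"""

print("\n".join(summarize(SAMPLE.splitlines())))
```

On the sample excerpt this prints the first three entries of the summary: `0: clone`, `1: futex WAIT`, `0: futex WAKE`.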

TODO: after 0 and 2 WAIT, 1 does WAKE, and both 0 and 2 seem to wake up?
