Fixes [SUREFIRE-1516]: Poor performance in reuseForks=false#253
Fixes [SUREFIRE-1516]: Poor performance in reuseForks=false#253jon-bell wants to merge 3 commits intoapache:masterfrom
Conversation
… from the host JVM after it sends BYE_ACK. Threads blocking on `read` may not be interruptable until they poll for interrupts (every 350msec), which can introduce significant latency for short-lived processes.
|
The build fails on macOS. |
| CommandReader.this.wakeupIterator(); | ||
| callListeners( command ); | ||
| break; | ||
| case BYE_ACK: |
There was a problem hiding this comment.
It is not very elegant solution.
In the ForkedBooter the listener is already registered commandReader.addByeAckListener( new CommandListener() , see the line 356. See the next comment.
There was a problem hiding this comment.
One thing I do not understand is why this change cause IT failure on macOS.
There was a problem hiding this comment.
It would be great to reproduce it with several runs of the build.
surefire-api/src/main/java/org/apache/maven/surefire/booter/CommandReader.java
Show resolved
Hide resolved
|
It looks like several of the tests are flaky, perhaps - when I try to reproduce locally on MacOS, I get a different set of tests failing. Unfortunately I do not have time to debug these tests this week. |
|
@Tibor17 can you trigger the Mac OS build to run again? I just tried to run the test suite locally again on my Mac and now none of the integration tests failed. I think that there is definitely a flaky test here (and the failure is not due to my change) |
|
@Tibor17 please note that all of the tests are now passing. |
|
@jon-bell I have triggered the build again. |
eolivelli
left a comment
There was a problem hiding this comment.
very nice trick.
Thank you for providing this enhancement
I hope we can commit this patch soon, I left my comment
surefire-api/src/main/java/org/apache/maven/surefire/booter/CommandReader.java
Show resolved
Hide resolved
|
The build failed again https://github.com/apache/maven-surefire/runs/352389395 |
|
@jon-bell |
b4c509d to
1364c62
Compare
|
@jon-bell Thx for contributing. I used your changes, updated the IT 705 and closed the Jira as fixed. |
|
Resolve #2559 |
Hi,
This PR resolves the performance bug noted in SUREFIRE-1516, which appears when using the
reuseForks=falseconfiguration. The root-cause of the observed performance problem comes from forked JVM teardown time.The issue is that the forked JVM should not block reading IO to read more from the host JVM after it sends BYE_ACK. Threads blocking on
readmay not be interruptable until they poll for interrupts (every 350msec for stdin), which can introduce significant latency for short-lived processes, such as surefire forked JVMs which are running just a single test at a time. This 350msec overhead can add up on projects that have thousands of test classes, each of which might take only several hundred msec to run.To measure the scope of the problem and confirm its resolution, I created a simple benchmark test suite, which consists of 100 JUnit test classes, each with a single test that calls
Thread.sleep(250). I instrumented the JVM to record the time that each JVM starts, the time that the test starts (as measured by JUnit), the time that the test ends (as measured by JUnit), and the time that the JVM terminates. For comparison, I also did the same experiment with ant and gradle.The table below shows the results, which represent the average time for each test (over the 100 samples):
Overall, most build systems took similar amounts of time to spin up the JVM, and all took the expected 250ms to run each test. You can see that the current
masterversion of surefire (b97df5a) takes an unusually high amount of time to tear down the forked JVM (in fact, 350 msec more, which is exactly the time for the JVM to interrupt a thread reading fromstdinexplained in this fantastic Stack Overflow post). This is an easy fix though: after receiving theBYE_ACKmessage, the forked JVM can stop reading commands from the main surefire process, since it's shutting down. After implementing this fix, the overhead goes away (as shown in 2fbe44f).