Here is why I posted this question.
The first version is the actual implementation of findFirstNonNull in org.apache.commons.lang3.ObjectUtils.
When I saw it I was horrified. I posted it here to see if others see the problem. It seems it doesn't raise eyebrows.
The problem is that the task is simple: it could be done with a plain loop and an if. Yet it uses the heavy artillery of a Stream and lambdas. It is what I would call using a cannon to kill a fly.
To see just how much overhead there is, I added a breakpoint on Objects::nonNull. Here is the stack trace.
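To make the comparison concrete, here is a minimal sketch of the two approaches. It is not a copy of the library source: the class name FirstNonNullSketch and the method names firstNonNullLoop and firstNonNullStream are made up for illustration, and the stream variant uses the JDK's Arrays.stream, which may differ from whatever helper the library calls internally.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical illustration class; names are not taken from Apache Commons.
final class FirstNonNullSketch {

    // Loop version: a single pass over the array, no extra objects created.
    @SafeVarargs
    static <T> T firstNonNullLoop(final T... values) {
        if (values != null) {
            for (final T value : values) {
                if (value != null) {
                    return value;
                }
            }
        }
        return null;
    }

    // Stream version: same result, but every call builds a stream pipeline,
    // a filter stage and an Optional before returning.
    @SafeVarargs
    static <T> T firstNonNullStream(final T... values) {
        if (values == null) {
            return null;
        }
        return Arrays.stream(values)
                .filter(Objects::nonNull)
                .findFirst()
                .orElse(null);
    }

    public static void main(final String[] args) {
        System.out.println(firstNonNullLoop(null, "b", "c"));   // prints "b"
        System.out.println(firstNonNullStream(null, "b", "c")); // prints "b"
    }
}
```

Both methods return the same value for the same input; the difference is only in how much machinery runs per call.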
nonNull:296, Objects (java.util)
test:-1, ObjectUtils$$Lambda/0x0000000801003448 (org.apache.commons.lang3)
accept:178, ReferencePipeline$2$1 (java.util.stream)
tryAdvance:1034, Spliterators$ArraySpliterator (java.util)
forEachWithCancel:129, ReferencePipeline (java.util.stream)
copyIntoWithCancel:527, AbstractPipeline (java.util.stream)
copyInto:513, AbstractPipeline (java.util.stream)
wrapAndCopyInto:499, AbstractPipeline (java.util.stream)
evaluateSequential:150, FindOps$FindOp (java.util.stream)
evaluate:234, AbstractPipeline (java.util.stream)
findFirst:647, ReferencePipeline (java.util.stream)
firstNonNull:593, ObjectUtils (org.apache.commons.lang3)
testC:75, BenchmarkFindFirst (benchmark.findfirst)
testAll:87, BenchmarkFindFirst (benchmark.findfirst)
main:103, BenchmarkFindFirst (benchmark.findfirst)
Sometimes I hear that the compiler will optimize the overhead away, so I ran a benchmark. I ran each version 3 billion times with various inputs and timed it. The benchmark does a warm-up run before starting the timer, then runs the tests 5 times in a row.
Here is the result of my simple benchmark.
attempt:  1         2         3         4         5
TestA:    46194 ms  45646 ms  45648 ms  45575 ms  46066 ms
TestB:     1096 ms   1352 ms    802 ms    798 ms    798 ms
TestC:    76082 ms  75842 ms  75370 ms  75266 ms  74130 ms
- TestA is the Stream version above.
- TestB is the plain Java loop version.
- TestC uses the actual findFirstNonNull from the apache library.
You can see that the Stream version (TestA) is roughly 35 to 60 times slower than the loop version (TestB), and the library call (TestC) is slower still.
[edit] After Abion47 commented that the test could be flawed, I rewrote the tests to make the inputs less predictable. And indeed, the difference is less dramatic. But the loop version is still 10x faster than the Streams version.
[/edit]
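A harness along these lines can be sketched as follows. This is an illustration only, not the benchmark that produced the numbers above: the class name FirstNonNullBenchSketch, the seed, the input sizes and the round counts are all assumptions, and a dedicated tool such as JMH would give more trustworthy numbers than hand-rolled timing.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.Random;
import java.util.function.Function;

// Hypothetical harness; all names and sizes are chosen for illustration.
final class FirstNonNullBenchSketch {

    static String loop(final String[] values) {
        for (final String v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    static String stream(final String[] values) {
        return Arrays.stream(values).filter(Objects::nonNull).findFirst().orElse(null);
    }

    // Builds arrays of random length with nulls at random positions.
    // A fixed seed keeps the inputs identical for both versions.
    static String[][] randomInputs(final int count, final long seed) {
        final Random random = new Random(seed);
        final String[][] inputs = new String[count][];
        for (int i = 0; i < count; i++) {
            final String[] row = new String[1 + random.nextInt(8)];
            for (int j = 0; j < row.length; j++) {
                row[j] = random.nextBoolean() ? null : "v" + j;
            }
            inputs[i] = row;
        }
        return inputs;
    }

    // Applies fn to every input, `rounds` times over. The count of non-null
    // results is returned so the calls cannot be discarded as dead code.
    static long run(final Function<String[], String> fn, final String[][] inputs, final int rounds) {
        long hits = 0;
        for (int r = 0; r < rounds; r++) {
            for (final String[] input : inputs) {
                if (fn.apply(input) != null) {
                    hits++;
                }
            }
        }
        return hits;
    }

    // Times one call to run() and returns the elapsed wall-clock milliseconds.
    static long timeMillis(final Function<String[], String> fn, final String[][] inputs, final int rounds) {
        final long start = System.nanoTime();
        final long hits = run(fn, inputs, rounds);
        final long elapsed = (System.nanoTime() - start) / 1_000_000;
        if (hits < 0) {
            throw new IllegalStateException("unreachable; keeps the result live");
        }
        return elapsed;
    }

    public static void main(final String[] args) {
        final String[][] inputs = randomInputs(10_000, 42L);
        final int rounds = 200;

        // Warm-up pass before any timing, so the JIT has seen both code paths.
        run(FirstNonNullBenchSketch::loop, inputs, rounds);
        run(FirstNonNullBenchSketch::stream, inputs, rounds);

        for (int attempt = 1; attempt <= 5; attempt++) {
            final long loopMs = timeMillis(FirstNonNullBenchSketch::loop, inputs, rounds);
            final long streamMs = timeMillis(FirstNonNullBenchSketch::stream, inputs, rounds);
            System.out.println("attempt " + attempt + ": loop " + loopMs + " ms, stream " + streamMs + " ms");
        }
    }
}
```

Randomizing where the first non-null element sits makes the branch in the loop harder to predict, which is the kind of change that narrowed the gap in my rewritten tests.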
What blows my mind is that the loop version was the implementation of that method in Apache Commons until 2022. Then someone decided to convert it to the Streams version.
You might say the overhead is not important, but you never know whether people will use that function in a tight loop. In my opinion a code library is supposed to be efficient, well-tested and well-documented. Readability comes only after those, and matters mostly for more complicated code.
Is this post relevant? If you want, this is my code review of findFirstNonNull from the apache commons library.
null is not documented. Is it different?!