Conversation

@vstinner
Member

When comparing suites with more than one benchmark, compute the
geometric mean of benchmark speeds, to compare a whole suite to the
reference suite with a single number.

@vstinner
Member Author

@methane @pablogsal @serhiy-storchaka: Would you mind taking a look at this new feature? I'm not sure that I'm computing the geometric mean of the right thing.

IMO it's a big feature: it should help a lot when comparing two benchmark suites, by giving a single number rather than 30 numbers.

A geometric mean has no unit. It's an unusual value. In short, geo mean > 1.0 means "faster", geo mean < 1.0 means "slower".


PyPy uses the geometric mean to compare PyPy to CPython as a single number. speed.pypy.org announces:

"The geometric average of all benchmarks is 0.24 or 4.2 times faster than cpython"

I'm not sure that speed.pypy.org and my PR compute the geometric mean of the same thing, since it announces that 0.24 means "faster".

My PR computes the geometric mean of all "speeds". The speed of a benchmark is the ratio: (benchmark mean) / (reference benchmark mean).
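
For illustration, here is a minimal sketch of that computation (the helper name and the sample ratios are mine, not pyperf's actual API):

import math

def geometric_mean(ratios):
    # nth root of the product of n ratios
    return math.prod(ratios) ** (1.0 / len(ratios))

# One dimensionless "speed" per benchmark; the time units cancel in each
# ratio, which is why the geometric mean itself has no unit.
# With one benchmark twice as fast (0.5) and another twice as slow (2.0),
# the two cancel out exactly:
print(geometric_mean([0.5, 2.0]))  # 1.0 (an arithmetic mean would give 1.25)

Note that statistics.geometric_mean() in the standard library (Python 3.8+) computes the same thing.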

@vstinner
Copy link
Member Author

> I'm not sure that speed.pypy.org and my PR compute the geometric mean of the same thing, since it announces that 0.24 means "faster".

Oops, I computed the geometric mean backwards. I normalized reference / benchmark, but the correct formula is benchmark / reference.

I also updated the PR's documentation to explain how the geometric mean is computed and what it means.

Use (benchmark mean) / (reference mean), rather than
(reference mean) / (benchmark mean), to use the same ratio as the
geometric mean: normalize the mean to the reference.
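
With this normalization, a geometric mean above 1.0 means the measured suite took longer than the reference. A hypothetical helper (not pyperf's actual code) that turns the value into the usual wording, and shows why PyPy's 0.24 is announced as "faster":

def describe(geo_mean):
    # geo_mean is the geometric mean of (benchmark mean) / (reference mean)
    # ratios: above 1.0 the suite took longer, below 1.0 it took less time.
    if geo_mean >= 1.0:
        return "%.2fx slower" % geo_mean
    return "%.2fx faster" % (1.0 / geo_mean)

print(describe(1.22))  # 1.22x slower
print(describe(0.24))  # 4.17x faster (speed.pypy.org rounds this to 4.2)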
@vstinner
Member Author

I wrote this PR when I saw bench_results.txt in https://bugs.python.org/issue41972: it's really hard to read all these numbers and tell whether it's faster or slower.

vstinner mentioned this pull request on Oct 13, 2020
@vstinner
Member Author

I rewrote this PR as a series of commits. It's now implemented in the master branch, so I'm closing this PR.

vstinner closed this on Oct 26, 2020
vstinner deleted the geo_mean branch on Oct 26, 2020
@vstinner
Member Author

The last commit adding the feature to the "compare" command is commit 0518a22.

Example:

$ python3 -m pyperf compare_to ./pyperf/tests/mult_list_py36.json ./pyperf/tests/mult_list_py37.json
[1]*1000: Mean +- std dev: [mult_list_py36] 2.13 us +- 0.06 us -> [mult_list_py37] 2.09 us +- 0.04 us: 1.02x faster (-2%)
[1,2]*1000: Mean +- std dev: [mult_list_py36] 3.70 us +- 0.05 us -> [mult_list_py37] 5.28 us +- 0.09 us: 1.42x slower (+42%)
[1,2,3]*1000: Mean +- std dev: [mult_list_py36] 4.61 us +- 0.13 us -> [mult_list_py37] 6.05 us +- 0.11 us: 1.31x slower (+31%)

Geometric mean: 1.22 (slower)
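
As a sanity check (mine, not part of the PR), the reported geometric mean can be reproduced from the three per-benchmark ratios above:

import math

# (mult_list_py37 mean) / (mult_list_py36 mean) for each benchmark
ratios = [2.09 / 2.13, 5.28 / 3.70, 6.05 / 4.61]
print(math.prod(ratios) ** (1.0 / len(ratios)))  # ~1.22, i.e. "slower"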


$ python3 -m pyperf compare_to ./pyperf/tests/mult_list_py36.json ./pyperf/tests/mult_list_py37.json -G
Slower (2):
- [1,2]*1000: 3.70 us +- 0.05 us -> 5.28 us +- 0.09 us: 1.42x slower (+42%)
- [1,2,3]*1000: 4.61 us +- 0.13 us -> 6.05 us +- 0.11 us: 1.31x slower (+31%)

Faster (1):
- [1]*1000: 2.13 us +- 0.06 us -> 2.09 us +- 0.04 us: 1.02x faster (-2%)

Geometric mean: 1.22 (slower)


$ python3 -m pyperf compare_to ./pyperf/tests/mult_list_py36.json ./pyperf/tests/mult_list_py37.json --table
+----------------+----------------+------------------------------+
| Benchmark      | mult_list_py36 | mult_list_py37               |
+================+================+==============================+
| [1]*1000       | 2.13 us        | 2.09 us: 1.02x faster (-2%)  |
+----------------+----------------+------------------------------+
| [1,2]*1000     | 3.70 us        | 5.28 us: 1.42x slower (+42%) |
+----------------+----------------+------------------------------+
| [1,2,3]*1000   | 4.61 us        | 6.05 us: 1.31x slower (+31%) |
+----------------+----------------+------------------------------+
| Geometric mean | (ref)          | 1.22 (slower)                |
+----------------+----------------+------------------------------+


$ python3 -m pyperf compare_to ./pyperf/tests/mult_list_py36.json ./pyperf/tests/mult_list_py37.json --table -G
+----------------+----------------+------------------------------+
| Benchmark      | mult_list_py36 | mult_list_py37               |
+================+================+==============================+
| [1]*1000       | 2.13 us        | 2.09 us: 1.02x faster (-2%)  |
+----------------+----------------+------------------------------+
| [1,2,3]*1000   | 4.61 us        | 6.05 us: 1.31x slower (+31%) |
+----------------+----------------+------------------------------+
| [1,2]*1000     | 3.70 us        | 5.28 us: 1.42x slower (+42%) |
+----------------+----------------+------------------------------+
| Geometric mean | (ref)          | 1.22 (slower)                |
+----------------+----------------+------------------------------+
