@vstinner
Member

@vstinner vstinner commented May 27, 2024

Use stringlib to specialize unicode_repr() for each string kind (UCS1, UCS2, UCS4).
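For context, CPython's compact string representation (PEP 393) stores a string with 1, 2, or 4 bytes per character depending on its highest code point; these are the three kinds the specialization targets. A rough Python sketch of that classification, where `string_kind` is a hypothetical helper written for illustration and not a CPython API:

```python
def string_kind(s: str) -> int:
    """Return the bytes per character CPython would use to store s."""
    # The empty string has no code points; CPython stores it as kind 1.
    highest = max(map(ord, s), default=0)
    if highest < 0x100:      # fits in one byte: UCS1 (ASCII or Latin-1)
        return 1
    if highest < 0x10000:    # fits in two bytes: UCS2
        return 2
    return 4                 # anything above the BMP: UCS4


for text in ("abc", "\u20ac", "\U0001f600"):
    print(ascii(text), string_kind(text))
```

Specializing per kind presumably lets the C compiler generate a tight loop for each character width instead of branching on the kind for every character read.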

Benchmark:

+-------------------------------------+---------+----------------------+
| Benchmark                           | ref     | change2              |
+=====================================+=========+======================+
| repr('abc')                         | 100 ns  | 103 ns: 1.02x slower |
+-------------------------------------+---------+----------------------+
| repr('a' * 100)                     | 369 ns  | 369 ns: 1.00x slower |
+-------------------------------------+---------+----------------------+
| repr(('a' + squote) * 100)          | 1.21 us | 946 ns: 1.27x faster |
+-------------------------------------+---------+----------------------+
| repr(('a' + nl) * 100)              | 1.23 us | 907 ns: 1.36x faster |
+-------------------------------------+---------+----------------------+
| repr(dquote + ('a' + squote) * 100) | 1.08 us | 858 ns: 1.25x faster |
+-------------------------------------+---------+----------------------+
| Geometric mean                      | (ref)   | 1.16x faster         |
+-------------------------------------+---------+----------------------+
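The geometric mean row can be approximately reproduced from the rounded timings in the table above. This is only a sanity-check sketch: pyperf computes the figure from unrounded measurements, so the last digit could differ in general.

```python
import math

# (ref, change2) timings in nanoseconds, copied from the table above.
timings_ns = [
    (100, 103),
    (369, 369),
    (1210, 946),
    (1230, 907),
    (1080, 858),
]

# A speedup above 1.0 means change2 is faster than ref.
speedups = [ref / new for ref, new in timings_ns]
geo_mean = math.prod(speedups) ** (1 / len(speedups))
print(f"{geo_mean:.2f}x faster")
```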

@vstinner
Member Author

Benchmark:

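import pyperf

runner = pyperf.Runner()
# Characters that repr() has to treat specially: both quote styles and a newline.
squote = "'"
dquote = '"'
nl = '\n'
# Short and long strings that need no escaping.
runner.bench_func("repr('abc')", repr, 'abc')
runner.bench_func("repr('a' * 100)", repr, 'a' * 100)
# Single quotes only: repr() switches to double-quote delimiters.
runner.bench_func("repr(('a' + squote) * 100)", repr, ('a' + squote) * 100)
# Newlines must be escaped as \n.
runner.bench_func("repr(('a' + nl) * 100)", repr, ('a' + nl) * 100)
# Both quote styles present: single quotes must be escaped.
runner.bench_func("repr(dquote + ('a' + squote) * 100)", repr, dquote + ('a' + squote) * 100)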
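The three non-trivial cases appear to be chosen to hit different paths in repr(): switching the delimiter, escaping a control character, and escaping a quote. A small sketch of the standard Python behavior on one-repetition versions of those inputs (the variable names `r_squote`, `r_nl`, and `r_both` are only for illustration):

```python
squote = "'"
dquote = '"'
nl = '\n'

# Only single quotes present: repr() uses double-quote delimiters, no escapes.
r_squote = repr('a' + squote)

# A newline is written as a backslash followed by 'n'.
r_nl = repr('a' + nl)

# Both quote styles present: single-quote delimiters, single quote escaped.
r_both = repr(dquote + 'a' + squote)

for result in (r_squote, r_nl, r_both):
    print(result)
```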

@vstinner
Member Author

cc @serhiy-storchaka

@vstinner
Member Author

This is a first step. The second step will be to avoid a temporary string in PyUnicode_FromFormat("%R", str_obj).

@vstinner
Member Author

This is a first step. The second step will be to avoid a temporary string in PyUnicode_FromFormat("%R", str_obj).

I implemented the second step locally. Sadly, it's slower! Not faster. IMO the first step (making the code faster) is still worth it :-)

@vstinner vstinner merged commit 0518edc into python:main May 28, 2024
@vstinner vstinner deleted the unicode_repr branch May 28, 2024 16:05
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024