Commit 092ad3e
Optimize branch structure of UTF-8 decoder routine
I like the asm which gcc -O3 generates on this modified code...
and guess what: my CPU likes it too!
(The asm is noticeably tighter, without any extra operations in the
path which dispatches to the code for decoding a 1-byte, 2-byte,
3-byte, or 4-byte character. It's just CMP, conditional jump, CMP,
conditional jump, CMP, conditional jump.
...Though I was admittedly impressed to see gcc could implement the
boolean expression `c >= 0xC2 && c <= 0xDF` with just 3 instructions:
add, CMP, then conditional jump. Pretty slick stuff there, guys.)
Benchmark results:
UTF-8, short - to UTF-16LE faster by 7.36% (0.0001 vs 0.0002)
UTF-8, short - to UTF-16BE faster by 6.24% (0.0001 vs 0.0002)
UTF-8, medium - to UTF-16BE faster by 4.56% (0.0003 vs 0.0003)
UTF-8, medium - to UTF-16LE faster by 4.00% (0.0003 vs 0.0003)
UTF-8, long - to UTF-16BE faster by 1.02% (0.0215 vs 0.0217)
UTF-8, long - to UTF-16LE faster by 1.01% (0.0209 vs 0.0211)

1 parent d8b5b9f commit 092ad3e
1 file changed: 5 additions, 3 deletions

| Original file lines | New file lines | Change |
|---|---|---|
| 225-231 | 225-233 | line 228 removed; lines 228-230 added |
| 237-243 | 239-245 | line 240 removed; line 242 added |
| 262-268 | 264-270 | line 265 removed; line 267 added |