Rust 1.25.0 regressed the performance of encoding_rs's UTF-8 validation on i686 #49873
Closed
Labels
A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
P-medium (Medium priority)
WG-llvm (Working group: LLVM backend code generation)
regression-from-stable-to-stable (Performance or correctness regression from one stable version to another.)
When Firefox switched from Rust 1.24.0 to Rust 1.25.0, the win32 performance of encoding_rs's UTF-8 validation function dropped 12.5% when used on ASCII input. encoding_rs's UTF-8 validation function is a fork of the Rust standard library's validation function. The standard library version accelerates ASCII with an ALU trick that autovectorizes on x86_64 but not on i686 and that works only in the aligned case. The fork replaces that trick with explicit SIMD code that handles both the aligned and unaligned cases.
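For readers unfamiliar with the ALU trick mentioned above, here is a minimal sketch of the general idea: read one machine word at a time and test the high bit of every byte with a single mask. This is an illustration, not the standard library's actual code. The function name `ascii_prefix_len_words` is hypothetical, and the sketch copies bytes into a word instead of requiring an aligned pointer, so it does not reproduce the aligned-only restriction.

```rust
use std::mem::size_of;

/// Number of bytes in one machine word (4 on i686, 8 on x86_64).
const WORD: usize = size_of::<usize>();

/// A word with 0x80 in every byte: 0x80808080 on 32-bit targets,
/// 0x8080808080808080 on 64-bit targets.
const NONASCII_MASK: usize = usize::MAX / 0xFF * 0x80;

/// Hypothetical helper: returns the length of the leading all-ASCII
/// prefix of `bytes`, checking one word per iteration where possible.
fn ascii_prefix_len_words(bytes: &[u8]) -> usize {
    let mut i = 0;
    // Word-at-a-time loop: a word is all ASCII if no byte has its high bit set.
    while i + WORD <= bytes.len() {
        let mut buf = [0u8; WORD];
        buf.copy_from_slice(&bytes[i..i + WORD]);
        if usize::from_ne_bytes(buf) & NONASCII_MASK != 0 {
            break;
        }
        i += WORD;
    }
    // Byte-at-a-time loop: finds the exact position inside the offending
    // word, or handles a tail shorter than one word.
    while i < bytes.len() && bytes[i] < 0x80 {
        i += 1;
    }
    i
}
```

On i686 this processes only 4 bytes per iteration unless the compiler vectorizes it, which is presumably part of why an explicit 16-byte SIMD loop is attractive there.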
When the input is all ASCII, the function should stay in either the aligned-case or the unaligned-case inner loop that loads 16 bytes using `movdqa` or `movdqu`, respectively, performs `pmovmskb` on the xmm register and compares the result to zero, jumping back to the start of the loop if it is zero.
When compiled for i686 Linux with opt level 2 (which Firefox uses) using Rust 1.24.0, the result is exactly as expected.
Unaligned:
Aligned:
(Windows wouldn't let me see the asm due to LLVM deeming the IR invalid with `--emit asm`.)
When compiled with Rust 1.25.0, the result is more complicated:
`movdqa` and two instances of `movdqu` suggesting that the first trip through the loop has been unrolled to be a separate copy from the loop proper.
Both of these transformations look like plausible optimizations, but considering the performance result from Firefox CI, it seems these transformations made performance worse.
The asm was obtained by compiling encoding_rs (Firefox uses 0.7.2) using `RUSTC_BOOTSTRAP=1 RUSTFLAGS='-C opt-level=2 --emit asm' cargo build --target i686-unknown-linux-gnu --release --features simd-accel` and searching for `utf8_valid_up_to` in the `.s` file.
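To make the expected loop shape concrete, here is a rough sketch written with the stable `std::arch` SSE2 intrinsics. It is an assumption-laden illustration, not encoding_rs's actual code: the function name `ascii_prefix_len_sse2` is hypothetical, and encoding_rs 0.7.2 with `simd-accel` does not necessarily use these intrinsics. `_mm_load_si128` and `_mm_loadu_si128` correspond to `movdqa` and `movdqu`, and `_mm_movemask_epi8` corresponds to `pmovmskb`.

```rust
#[cfg(target_arch = "x86")]
use std::arch::x86::{__m128i, _mm_load_si128, _mm_loadu_si128, _mm_movemask_epi8};
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_load_si128, _mm_loadu_si128, _mm_movemask_epi8};

/// Hypothetical helper: length of the leading all-ASCII prefix of `bytes`,
/// checked 16 bytes per iteration with SSE2.
#[cfg(all(any(target_arch = "x86", target_arch = "x86_64"), target_feature = "sse2"))]
fn ascii_prefix_len_sse2(bytes: &[u8]) -> usize {
    let ptr = bytes.as_ptr();
    let len = bytes.len();
    let mut i = 0;
    // SAFETY: each load reads 16 bytes at `ptr + i` with `i + 16 <= len`.
    // The aligned loop runs only when `ptr` is 16-byte aligned, and `i`
    // stays a multiple of 16. SSE2 is enabled at compile time (see cfg).
    unsafe {
        if (ptr as usize) % 16 == 0 {
            // Aligned case: the load should compile to movdqa.
            while i + 16 <= len {
                let chunk = _mm_load_si128(ptr.add(i) as *const __m128i);
                // pmovmskb gathers the high bit of each byte; zero means all ASCII.
                if _mm_movemask_epi8(chunk) != 0 {
                    break;
                }
                i += 16;
            }
        } else {
            // Unaligned case: the load should compile to movdqu.
            while i + 16 <= len {
                let chunk = _mm_loadu_si128(ptr.add(i) as *const __m128i);
                if _mm_movemask_epi8(chunk) != 0 {
                    break;
                }
                i += 16;
            }
        }
    }
    // Scalar loop for the tail and for locating the first non-ASCII byte.
    while i < len && bytes[i] < 0x80 {
        i += 1;
    }
    i
}

/// Scalar fallback so the sketch also builds on targets without SSE2.
#[cfg(not(all(any(target_arch = "x86", target_arch = "x86_64"), target_feature = "sse2")))]
fn ascii_prefix_len_sse2(bytes: &[u8]) -> usize {
    bytes.iter().take_while(|&&b| b < 0x80).count()
}
```

For all-ASCII input, each of the two vector loops is expected to reduce to a load, a `pmovmskb`, a compare against zero and a backward jump, which is the shape against which the generated asm can be compared.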