Conversation
As of LLVM 9.0, certain calls to memcmp may be converted to bcmp, which I guess
could save a single subtraction on some architectures. [1]
bcmp is just like memcmp except instead of returning the difference between the
two differing bytes, it returns non-zero instead. As such, memcmp is a valid
implementation of bcmp.
If we care about size, bcmp should just call memcmp.
If we care about speed, we can change bcmp to look like this instead:
```rust
pub unsafe extern "C" fn bcmp(s1: *const u8, s2: *const u8, n: usize) -> i32 {
let mut i = 0;
while i < n {
let a = *s1.offset(i as isize);
let b = *s2.offset(i as isize);
if a != b {
return 1;
}
i += 1;
}
0
}
```
In this PR I do not address any changes which may or may not be needed for arm
aebi as I lack proper test hardware.
[1]: https://releases.llvm.org/9.0.0/docs/ReleaseNotes.html#noteworthy-optimizations
|
This looks reasonable to me, thanks! |
|
it's not as much about saving a single subtraction as allowing faster vectorized code, since |
|
In what context did you encounter the need for bcmp? LLVM only emits bcmp if it is known to exist, such as on gnu targets. If you got an undefined symbol error, I would expect that it's due to an error in the target specification. |
|
I've encountered it when using no_std + a Linux based target. Not sure what OP's use case is though. |
Bump compiler-builtins - rust-lang/compiler-builtins#306 - rust-lang/compiler-builtins#309 - rust-lang/compiler-builtins#310 - rust-lang/compiler-builtins#311 - rust-lang/compiler-builtins#312 - rust-lang/compiler-builtins#313 - rust-lang/compiler-builtins#315 - rust-lang/compiler-builtins#317 - rust-lang/compiler-builtins#323 - rust-lang/compiler-builtins#324 - rust-lang/compiler-builtins#328 Adds support for backtraces from `__rust_probestack` plus other goodies. r? @alexcrichton
As of LLVM 9.0, certain calls to memcmp may be converted to bcmp, which I guess
could save a single subtraction on some architectures. [1]
bcmp is just like memcmp except instead of returning the difference between the
two differing bytes, it returns non-zero instead. As such, memcmp is a valid
implementation of bcmp.
If we care about size, bcmp should just call memcmp.
If we care about speed, we can change bcmp to look like this instead:
```rust
pub unsafe extern "C" fn bcmp(s1: *const u8, s2: *const u8, n: usize) -> i32 {
let mut i = 0;
while i < n {
let a = *s1.offset(i as isize);
let b = *s2.offset(i as isize);
if a != b {
return 1;
}
i += 1;
}
0
}
```
In this PR I do not address any changes which may or may not be needed for arm
aebi as I lack proper test hardware.
[1]: https://releases.llvm.org/9.0.0/docs/ReleaseNotes.html#noteworthy-optimizations
As of LLVM 9.0, certain calls to memcmp may be converted to bcmp, which I guess
could save a single subtraction on some architectures.
bcmp is just like memcmp except instead of returning the difference between the
two differing bytes, it returns non-zero instead. As such, memcmp is a valid
implementation of bcmp.
If we care about size, bcmp should just call memcmp.
If we care about "speed", we can change bcmp to save a single sub like this instead:
I'm unsure if the difference in icache size outweighs the single sub in programs which otherwise use memcmp, but I'll trust the LLVM devs to know what they're doing.
In this PR I do not address any changes which may or may not be needed for arm
aebi as I lack proper test hardware.