rand: inform the optimiser that indexing is never out-of-bounds.#16965
Merged
bors merged 1 commit into rust-lang:master on Sep 9, 2014
Contributor (Author):
(Since this is somewhat crypto-related, I've been liberal with comments.)
Contributor:
I see: Before: After: And not modifying the field type, but just using
Contributor:
i.e. the diff for the last bench is just:

diff --git a/src/librand/isaac.rs b/src/librand/isaac.rs
index 0f7cda4..d80999e 100644
--- a/src/librand/isaac.rs
+++ b/src/librand/isaac.rs
@@ -185,7 +185,7 @@ impl Rng for IsaacRng {
             self.isaac();
         }
         self.cnt -= 1;
-        self.rsl[self.cnt as uint]
+        self.rsl[self.cnt as u8 as uint]
     }
 }
@@ -416,7 +416,7 @@ impl Rng for Isaac64Rng {
             self.isaac64();
         }
         self.cnt -= 1;
-        unsafe { *self.rsl.unsafe_get(self.cnt) }
+        self.rsl[self.cnt as u8 as uint]
     }
 }
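The cast trick in the diff above can be sketched in current Rust (where `uint` has become `usize`). This is an illustration only, not the code in `isaac.rs`: `CastBuffer`, `BUF_SIZE`, and `next_value` are hypothetical names, and the refill step is replaced by a simple counter reset. The point is that a `u8` can only hold 0 to 255, so an index produced by `as u8 as usize` is always below the array length of 256, which should let the optimiser drop the bounds check.

```rust
// Hypothetical stand-in for the RNG's result buffer (not the real IsaacRng).
const BUF_SIZE: usize = 256;

struct CastBuffer {
    rsl: [u32; BUF_SIZE],
    cnt: u32,
}

impl CastBuffer {
    /// Return the next buffered value, walking the buffer from the end.
    fn next_value(&mut self) -> u32 {
        if self.cnt == 0 {
            // Assumption: the real code regenerates `rsl` here; this sketch
            // only resets the counter.
            self.cnt = BUF_SIZE as u32;
        }
        self.cnt -= 1;
        // `as u8` truncates to 0..=255, so the index is always < BUF_SIZE.
        // Note that this only works because BUF_SIZE is exactly 256.
        self.rsl[self.cnt as u8 as usize]
    }
}
```

Because the truncation is tied to the width of `u8`, the same line would silently read the wrong slot if the buffer were ever made larger than 256 entries.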
Contributor (Author):
Hm, that's interesting. That may be a good approach, although it fails to generalise if the RNG state size is increased. I wonder if just
This uses a bitwise mask to ensure that there's no bounds checking for
the array accesses when generating the next random number. This isn't
costless, but the single instruction is nothing compared to the branch.
A `debug_assert` for the "bounds check" is preserved to ensure that
refactoring doesn't accidentally break it (i.e. create values of `cnt`
that are out of bounds, which the masking would otherwise silently
wrap around).
Before:
test test::rand_isaac ... bench: 990 ns/iter (+/- 24) = 808 MB/s
test test::rand_isaac64 ... bench: 614 ns/iter (+/- 25) = 1302 MB/s
After:
test test::rand_isaac ... bench: 877 ns/iter (+/- 134) = 912 MB/s
test test::rand_isaac64 ... bench: 470 ns/iter (+/- 30) = 1702 MB/s
(It also removes the unsafe code in Isaac64Rng.next_u64, with a *gain*
in performance; today is a good day.)
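A minimal sketch of the approach the commit message describes, written in current Rust rather than the 2014 dialect. It is not the merged patch: `MaskedBuffer`, `next_value`, and the constant names are assumptions made for illustration, and the buffer refill is reduced to a counter reset. The mask keeps the index provably inside the array, while the `debug_assert!` catches a bad `cnt` in debug builds instead of letting the mask hide it.

```rust
// Assumed names and sizes for illustration; the buffer length must be a
// power of two for the mask to be equivalent to a bounds check.
const RAND_SIZE_LEN: u32 = 8;
const RAND_SIZE: u32 = 1 << RAND_SIZE_LEN; // 256

struct MaskedBuffer {
    rsl: [u64; RAND_SIZE as usize],
    cnt: u32,
}

impl MaskedBuffer {
    /// Return the next buffered value, walking the buffer from the end.
    fn next_value(&mut self) -> u64 {
        if self.cnt == 0 {
            // Assumption: the real generator refills `rsl` here; this sketch
            // only resets the counter.
            self.cnt = RAND_SIZE;
        }
        self.cnt -= 1;
        // Debug-only check: without it, an out-of-range `cnt` would be
        // wrapped by the mask below and go unnoticed.
        debug_assert!(self.cnt < RAND_SIZE);
        // `cnt & (RAND_SIZE - 1)` is at most RAND_SIZE - 1, so the index is
        // always in bounds and the optimiser can drop the bounds check.
        self.rsl[(self.cnt & (RAND_SIZE - 1)) as usize]
    }
}
```

Unlike a cast through `u8`, the mask is derived from the buffer size, so it keeps working if the size is changed to another power of two.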
Contributor (Author):
Thanks for the suggestion @dotdash, I've switched to a 'safer' version (i.e. less chance for mistakes to be silently ignored) which is, AFAICT, just as fast, even in a tight loop.
Contributor (Author):
r?
bors added a commit that referenced this pull request on Sep 9, 2014