Simplify format_integer_with_underscore_sep#141369
Conversation
|
r? @notriddle rustbot has assigned @notriddle. Use |
lolbinarycat
left a comment
There was a problem hiding this comment.
a few issues but mainly seems like a decent idea to slim down an overly flexible test helper.
src/librustdoc/clean/utils.rs
Outdated
| num.as_bytes().rchunks(3).rev().intersperse(b"_").flatten().copied().map(char::from), | ||
| ) |
There was a problem hiding this comment.
using as_bytes and char::from like this is technically a logic error, as it will turn every byte of a UTF-8 sequence into the corresponding unicode codepoint.
it does happen to work here since we're never using non-ascii (if we were then reversing a string bytewise would be extremely problematic), but String::from_utf8(...).unwrap() represents the intent much more cleanly.
There was a problem hiding this comment.
Not that this function is hot enough to prefer performance over correctness,
but I think in this case it's quite obvious that this is ok since Displaying integers will always result in an ASCII-only string. We can also collect into a vector of chars, like the old version.
There was a problem hiding this comment.
If we do want to care about performance here, I would just use String::from_utf8_unchecked. This should actually be more performant than char::from since it fully erases any non-ascii code. it will also at least give us a panic on miri instead of just silent corruption if somehow this gets rewritten into something that does produce non-ascii.
there's also u8.as_ascii(), since String impls Extend<ascii::Char>.
There was a problem hiding this comment.
There's no way unsafe code would be justified here.
as_ascii seems like the right way to do this, since it panics if it's given non-ASCII text, but doesn't make the code much more complex:
fn format_integer_with_underscore_sep(num: u128, is_negative: bool) -> String {
let num = num.to_string();
let mut result = if is_negative { "-" } else { "" }.to_string();
result.extend(
num.as_bytes().rchunks(3).rev().intersperse(b"_").flatten().copied().map(|b| b.as_ascii().unwrap())
);
result
}There was a problem hiding this comment.
If we wouldn't accept unsafe here, then I think we definitely shouldn't accept the potential of silent logic errors.
One nice thing is that String::from_utf8 has a concrete signature, so you can use collect without type hints, but I'm really fine with any solution which isn't char::from. If nothing else using char::from in this way is a bad habit I don't think we should encourage.
There was a problem hiding this comment.
Ended up using str::as_ascii so there's only one unwrap, IMHO it looks kinda nice now!
BTW, this is not a test helper, it's used in the generation of docs :) |
yeah i noticed that after your comment, github put the diff break in a very confusing spot :p |
Only ever needs to handle decimal reprs
2c9305a to
5c735d1
Compare
|
@bors r+ rollup |
…r_with_underscore_sep, r=notriddle Simplify `format_integer_with_underscore_sep` Noticed that this helper fn only ever gets called with decimal-base-formatted ints, so can be simplified a lot by not trying to handle hex and octal radixes. Second commit is completely unrelated, just simplified some code I wrote a while back 😁
Rollup of 7 pull requests Successful merges: - #138896 (std: fix aliasing bug in UNIX process implementation) - #140832 (aarch64-linux: Default to FramePointer::NonLeaf) - #141065 (Updated std doctests for wasm) - #141369 (Simplify `format_integer_with_underscore_sep`) - #141374 (make shared_helpers exe function work for both cygwin and non-cygwin hosts) - #141398 (chore: fix typos in comment) - #141457 (Update mdbook to 0.4.50) Failed merges: - #141405 (GetUserProfileDirectoryW is now documented to always store the size) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of #141369 - yotamofek:pr/rustdoc/format_integer_with_underscore_sep, r=notriddle Simplify `format_integer_with_underscore_sep` Noticed that this helper fn only ever gets called with decimal-base-formatted ints, so can be simplified a lot by not trying to handle hex and octal radixes. Second commit is completely unrelated, just simplified some code I wrote a while back 😁
…r-string, r=the8472,joshtriplett Add `FromIterator` impls for `ascii::Char`s to `String`s Wanted to `collect` ascii chars into a `String` while working on rust-lang#141369 , and was surprised these impls don't exist. Seems to me to be simply oversight. BTW, I only added `impl FromIterator<ascii::Char> for Cow<'_, str>`, without a corresponding `FromIterator<&Char>` impl, because there's no existing impl for `FromIterator<&char>`, but that might be oversight too. cc rust-lang#110998
…r-string, r=the8472,joshtriplett Add `FromIterator` impls for `ascii::Char`s to `String`s Wanted to `collect` ascii chars into a `String` while working on rust-lang#141369 , and was surprised these impls don't exist. Seems to me to be simply oversight. BTW, I only added `impl FromIterator<ascii::Char> for Cow<'_, str>`, without a corresponding `FromIterator<&Char>` impl, because there's no existing impl for `FromIterator<&char>`, but that might be oversight too. cc rust-lang#110998
Noticed that this helper fn only ever gets called with decimal-base-formatted ints, so can be simplified a lot by not trying to handle hex and octal radixes.
Second commit is completely unrelated, just simplified some code I wrote a while back 😁