Tracking issue: UTF-8 decoder in libcore #33906
Closed
Labels
B-unstable: Blocker: Implemented in the nightly compiler and unstable.
C-tracking-issue: Category: An issue tracking the progress of sth. like the implementation of an RFC
T-libs-api: Relevant to the library API team, which will review and decide on the PR/issue.
final-comment-period: In the final comment period and will be merged soon unless new substantive objections are raised.
Update (@SimonSapin): this is now the tracking issue for these items in both core::char and std::char:
- decode_utf8(), which takes an iterable of u8 and returns DecodeUtf8
- DecodeUtf8, which implements Iterator<Item=Result<char, InvalidSequence>>
- InvalidSequence, which is opaque

Original issue:
In libcore we have a facility to encode a character to UTF-8, i.e. char::EncodeUtf8, but no facility to decode a character from potentially invalid UTF-8 and return 0xFFFD when it reads an invalid sequence. That seems a surprising omission to me as a libcore user, given that in libstd we have string::String::from_utf8_lossy.

These options came to mind:
- str::next_code_point_lossy or so, which behaves as str::next_code_point but checks whether its input is valid and returns 0xFFFD if not
- DecodeUtf8, which one can make from an arbitrary iterator of bytes, which decodes them
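
To make the iterator-based option concrete, here is a minimal sketch of what such an adapter could look like. The names decode_utf8, DecodeUtf8 and InvalidSequence are taken from the update at the top of this issue, but the body is an illustration only and not the libcore implementation. decode_lossy is a hypothetical helper added for the example, and reporting one error per maximal invalid prefix is an assumption of this sketch.

```rust
use std::iter::Peekable;

/// Opaque error marker for an ill-formed byte sequence (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidSequence(());

/// Iterator adapter that decodes bytes into chars (sketch, not libcore's code).
pub struct DecodeUtf8<I: Iterator<Item = u8>> {
    bytes: Peekable<I>,
}

pub fn decode_utf8<I: IntoIterator<Item = u8>>(iter: I) -> DecodeUtf8<I::IntoIter> {
    DecodeUtf8 { bytes: iter.into_iter().peekable() }
}

impl<I: Iterator<Item = u8>> Iterator for DecodeUtf8<I> {
    type Item = Result<char, InvalidSequence>;

    fn next(&mut self) -> Option<Self::Item> {
        let first = self.bytes.next()?;
        // Sequence length plus the allowed range for the second byte.
        // The narrowed ranges reject overlong forms, surrogates and
        // code points above U+10FFFF.
        let (len, lo, hi): (u32, u8, u8) = match first {
            0x00..=0x7F => return Some(Ok(first as char)),
            0xC2..=0xDF => (2, 0x80, 0xBF),
            0xE0 => (3, 0xA0, 0xBF),
            0xE1..=0xEC | 0xEE..=0xEF => (3, 0x80, 0xBF),
            0xED => (3, 0x80, 0x9F),
            0xF0 => (4, 0x90, 0xBF),
            0xF1..=0xF3 => (4, 0x80, 0xBF),
            0xF4 => (4, 0x80, 0x8F),
            // Stray continuation bytes and bytes that never start a sequence.
            _ => return Some(Err(InvalidSequence(()))),
        };
        // Keep only the payload bits of the lead byte.
        let mut code = u32::from(first) & (0x7F >> len);
        for i in 1..len {
            let (min, max) = if i == 1 { (lo, hi) } else { (0x80, 0xBF) };
            // Peek so that a byte that breaks the sequence is not consumed
            // and can start the next item.
            match self.bytes.peek().copied() {
                Some(b) if (min..=max).contains(&b) => {
                    self.bytes.next();
                    code = (code << 6) | u32::from(b & 0x3F);
                }
                _ => return Some(Err(InvalidSequence(()))),
            }
        }
        Some(char::from_u32(code).ok_or(InvalidSequence(())))
    }
}

/// Hypothetical helper: map every invalid sequence to U+FFFD.
pub fn decode_lossy<I: IntoIterator<Item = u8>>(iter: I) -> String {
    decode_utf8(iter).map(|r| r.unwrap_or('\u{FFFD}')).collect()
}

fn main() {
    // A truncated three-byte sequence followed by ASCII 'a'.
    let input = [0xE2u8, 0x82, 0x61];
    let lossy = decode_lossy(input.iter().copied());
    // For this input the sketch agrees with the libstd lossy conversion.
    assert_eq!(lossy, String::from_utf8_lossy(&input).into_owned());
    println!("{:?}", lossy);
}
```

Because the adapter only needs an Iterator<Item = u8> and does not allocate on its own, a type of this shape could in principle live in libcore. The String in decode_lossy is there only to keep the example short.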