chars/bytes confusion in the error emitter #44080
Closed
Labels
A-diagnostics (Area: Messages for errors, warnings, and lints), C-bug (Category: This is a bug.), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
src/librustc_errors/snippet.rs has a big comment saying that the column info is provided in characters, not in bytes. However, the error emitter doesn't care about that at all and uses these like byte offsets all over the place. This leads to bugs like #44023 and #44078.
As an example, look how span printing varies with varying characters used:
Correct case:
Now add an emoji character:
Note how it's off by one char now. This can stack up:
If I didn't use any spaces at all, I'd run into #44078.
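To show the underlying mismatch in isolation, here is a minimal standalone sketch. It is not the emitter's code, and the helper names byte_col, char_col and char_col_to_byte are hypothetical. It only demonstrates that the same position in a line gets different column numbers depending on whether bytes or chars are counted, so a char column used as a byte offset lands in the wrong place as soon as a multi-byte character precedes it:

```rust
/// Byte offset of the first occurrence of `needle` in `line`, if any.
fn byte_col(line: &str, needle: char) -> Option<usize> {
    line.find(needle)
}

/// Char (Unicode scalar value) offset of the first occurrence of `needle`.
fn char_col(line: &str, needle: char) -> Option<usize> {
    line.chars().position(|c| c == needle)
}

/// Convert a char column into the byte offset that `&str` slicing expects.
/// Columns past the end of the line map to `line.len()`.
fn char_col_to_byte(line: &str, col: usize) -> usize {
    line.char_indices()
        .nth(col)
        .map(|(byte_idx, _)| byte_idx)
        .unwrap_or(line.len())
}

fn main() {
    let ascii = "let a = x;";
    let emoji = "let 😀 = x;"; // the emoji is 1 char but 4 bytes in UTF-8

    // ASCII only: both ways of counting agree.
    assert_eq!(byte_col(ascii, 'x'), char_col(ascii, 'x'));

    // With the emoji, the two columns for `x` drift apart by 3.
    let chars = char_col(emoji, 'x').unwrap();
    let bytes = byte_col(emoji, 'x').unwrap();
    println!("char column = {}, byte column = {}", chars, bytes);

    // Using the char column as a byte offset points before the `x`.
    println!("wrong: {:?}", &emoji[chars..]);
    // Converting first gives the intended position.
    println!("right: {:?}", &emoji[char_col_to_byte(emoji, chars)..]);
}
```

Every additional multi-byte character before the span widens the gap, which matches the "stacking up" described above. If the char column happens to fall inside a multi-byte character, the slice would panic instead of just being misplaced.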
Now this can be fixed by going through the emitter code and looking for all places where the pos is used as a byte position. A more proper fix is to stop trusting that people read comments and to encode this via the type system instead. There is already a mechanism for that inside the compiler: libsyntax_pos::CharPos! Just convert the types of the start_col and end_col members of the MultilineAnnotation and Annotation structs to CharPos, or maybe to BytePos if that's preferred.
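As a rough sketch of what encoding the unit in the type system could look like: the CharCol and ByteCol wrappers below are hypothetical stand-ins for the compiler's CharPos and BytePos, the Annotation struct is a simplified two-field version of the real one, and to_byte_col and annotated_text are made up for illustration. The point is only that once the columns carry their unit in the type, using a char column as a byte offset without an explicit conversion no longer compiles:

```rust
/// Column measured in chars (hypothetical stand-in for CharPos).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct CharCol(usize);

/// Column measured in bytes (hypothetical stand-in for BytePos).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ByteCol(usize);

impl CharCol {
    /// The only way to get a byte offset out of a char column:
    /// an explicit conversion that needs the line's text.
    fn to_byte_col(self, line: &str) -> ByteCol {
        let byte_idx = line
            .char_indices()
            .nth(self.0)
            .map(|(i, _)| i)
            .unwrap_or(line.len());
        ByteCol(byte_idx)
    }
}

/// Simplified annotation: the columns are typed, not bare `usize`.
struct Annotation {
    start_col: CharCol,
    end_col: CharCol,
}

/// Returns the text covered by the annotation.
/// Assumes start_col <= end_col; this sketch does not validate that.
fn annotated_text<'a>(line: &'a str, ann: &Annotation) -> &'a str {
    let ByteCol(start) = ann.start_col.to_byte_col(line);
    let ByteCol(end) = ann.end_col.to_byte_col(line);
    // `&line[ann.start_col..ann.end_col]` would be a type error here.
    &line[start..end]
}

fn main() {
    let line = "let 😀 = x;";
    let ann = Annotation { start_col: CharCol(8), end_col: CharCol(9) };
    println!("{:?}", annotated_text(line, &ann));
}
```

Whether the fields end up as char or byte positions matters less than having the choice written down in a type the compiler checks.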