Lexer: say that lifetime-like tokens can't be immediately followed by ' #1479
ehuss merged 1 commit into rust-lang:master
…d by '
Forms like `'ab'c` are rejected, so we need some way to explain why they don't tokenise as two consecutive LIFETIME_OR_LABEL tokens. Address this by adding "not immediately followed by `'`" to each of the lexer rules for the lifetime-like tokens. This also means there can be no ambiguity between CHAR_LITERAL and these tokens (at present we don't say how such ambiguities are resolved).
ehuss
left a comment
Thanks! I'll go ahead and merge, but for the most part the reference has not done a good job of handling ambiguity and precedence in the lexer or grammar. I'm not sure this is the ultimate approach to take, since I think there are several other rules that have ambiguity.
For example, nothing clarifies whether `x'a'` is a RESERVED_TOKEN_SINGLE_QUOTE or an IDENTIFIER followed by a CHAR_LITERAL. One possibility is to have a disambiguation rule that prefers the "longest match" when there is ambiguity. So `'a'a` would start with a CHAR_LITERAL because that is a longer match than two LIFETIME_TOKENs (and `'a'1` would be a CHAR followed by an INTEGER, because CHAR is longer than LIFETIME). Then we wouldn't need to explicitly state these kinds of rules. But I don't know if that is the best approach.
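As a rough illustration of the "longest match" idea, here is a minimal sketch, not the Reference's grammar or rustc's lexer. The `Kind` enum, the helper functions, and the heavily simplified token rules (ASCII identifiers, no escapes, no reserved-prefix tokens) are all assumptions made for the example. The lifetime rule here deliberately has no lookahead restriction.

```rust
// Hypothetical, simplified token kinds; the names are illustrative only.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Kind {
    CharLiteral,
    Lifetime,
    Identifier,
    Integer,
}

// ASCII-only identifier rules (an assumption; real Rust uses Unicode XID).
fn is_ident_start(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_'
}

fn is_ident_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

// Length in bytes of an identifier at the start of `s`, or 0 if none.
fn ident_len(s: &str) -> usize {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if is_ident_start(c) => {
            1 + chars.take_while(|&c| is_ident_continue(c)).count()
        }
        _ => 0,
    }
}

// Length of a run of ASCII digits at the start of `s`, or 0 if none.
fn integer_len(s: &str) -> usize {
    s.chars().take_while(|c| c.is_ascii_digit()).count()
}

// Quote, one character that is neither a quote nor a backslash, quote.
fn char_literal_len(s: &str) -> usize {
    let mut chars = s.chars();
    match (chars.next(), chars.next(), chars.next()) {
        (Some('\''), Some(c), Some('\'')) if c != '\'' && c != '\\' => 2 + c.len_utf8(),
        _ => 0,
    }
}

// Quote followed by an identifier, with NO "not followed by '" restriction.
fn lifetime_len(s: &str) -> usize {
    match s.strip_prefix('\'') {
        Some(rest) if ident_len(rest) > 0 => 1 + ident_len(rest),
        _ => 0,
    }
}

// Try every rule at the start of `s` and keep the longest non-empty match.
pub fn longest_match(s: &str) -> Option<(Kind, usize)> {
    let candidates = [
        (Kind::CharLiteral, char_literal_len(s)),
        (Kind::Lifetime, lifetime_len(s)),
        (Kind::Identifier, ident_len(s)),
        (Kind::Integer, integer_len(s)),
    ];
    candidates
        .iter()
        .copied()
        .filter(|&(_, len)| len > 0)
        .max_by_key(|&(_, len)| len)
}

// Repeatedly apply `longest_match`, skipping whitespace between tokens.
// Returns None if no rule matches at some position.
pub fn tokenize(mut s: &str) -> Option<Vec<Kind>> {
    let mut out = Vec::new();
    loop {
        s = s.trim_start();
        if s.is_empty() {
            return Some(out);
        }
        let (kind, len) = longest_match(s)?;
        out.push(kind);
        s = &s[len..];
    }
}
```

Under these simplified rules, `tokenize("'a'a")` yields a char literal followed by an identifier, and `tokenize("'a'1")` yields a char literal followed by an integer. Note that `tokenize("'ab'c")` still yields two lifetimes in this sketch, so a longest-match rule on its own would not reject that form.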
Update books

## rust-lang/reference

3 commits in 3417f866932cb1c09c6be0f31d2a02ee01b4b95d..5afb503a4c1ea3c84370f8f4c08a1cddd1cdf6ad
2024-03-06 21:29:54 UTC to 2024-02-28 04:06:45 UTC

- Input format (rust-lang/reference#1459)
- Lexer: say that lifetime-like tokens can't be immediately followed by ' (rust-lang/reference#1479)
- Patterns and enums (rust-lang/reference#1460)

## rust-lang/rust-by-example

2 commits in 57f1e708f5d5850562bc385aaf610e6af14d6ec8..e093099709456e6fd74fecd2505fdf49a2471c10
2024-03-08 23:30:57 UTC to 2024-02-26 21:10:20 UTC

- While-Let Unable to compile code example on page (rust-lang/rust-by-example#1819)
- Update new_types.md wording (rust-lang/rust-by-example#1823)

## rust-lang/rustc-dev-guide

14 commits in 7b0ef5b..8a5d647
2024-03-11 10:37:18 UTC to 2024-02-29 09:46:28 UTC

- update rustc-driver-interacting-with-the-ast.md (rust-lang/rustc-dev-guide#1930)
- Update rustc-driver-getting-diagnostics.md (rust-lang/rustc-dev-guide#1931)
- Document that test names cannot contain dots (rust-lang/rustc-dev-guide#1927)
- Update overview.md (rust-lang/rustc-dev-guide#1898)
- actually need to fix two occurances (rust-lang/rustc-dev-guide#1925)
- fix broken links (rust-lang/rustc-dev-guide#1924)
- next-solver: document caching (rust-lang/rustc-dev-guide#1923)
- Add compiletest docs for FileCheck prefixes and `//@ filecheck-flags:` (rust-lang/rustc-dev-guide#1914)
- Use different type in an example (rust-lang/rustc-dev-guide#1908)
- Update run-make test description (rust-lang/rustc-dev-guide#1920)
- Add some more details on feature gating (rust-lang/rustc-dev-guide#1891)
- make shell.nix better (rust-lang/rustc-dev-guide#1858)
- opaque types in new solver (rust-lang/rustc-dev-guide#1918)
- add implied bounds doc (rust-lang/rustc-dev-guide#1915)
Rollup merge of rust-lang#122339 - rustbot:docs-update, r=ehuss
Update books
Forms like `'ab'c` are rejected, so we need some way to explain why they don't tokenise as two consecutive LIFETIME_OR_LABEL tokens.

I think the best way to do this, given the Reference's current approach, is simply to add "not immediately followed by `'`" to the lexer rules for the lifetime-like tokens.

That matches what the implementation (`lifetime_or_char()`) is doing, so it's not likely to be wrong, and this chapter already has some cases of lookahead of this sort.

It also means there can be no ambiguity between CHAR_LITERAL and these tokens (I think the intent is that we have a traditional "longest matching token wins" rule, which would give the right result here, but that isn't explicitly stated anywhere).
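
To make the "not immediately followed by `'`" restriction concrete, here is a minimal sketch of a lifetime-like rule with one character of lookahead. It is not the code of `lifetime_or_char()`. The function name `lifetime_token_len` and the ASCII-only identifier rules are assumptions made for the example.

```rust
// ASCII-only identifier rules (an assumption; real Rust uses Unicode XID).
fn is_ident_start(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_'
}

fn is_ident_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

// Hypothetical helper: returns the byte length of a lifetime-like token at
// the start of `s`, or None if the rule does not match there.
pub fn lifetime_token_len(s: &str) -> Option<usize> {
    // The token must begin with a single quote.
    let rest = s.strip_prefix('\'')?;

    // The quote must be followed by an identifier.
    let mut chars = rest.chars();
    let first = chars.next()?;
    if !is_ident_start(first) {
        return None;
    }
    let ident_len = 1 + chars.take_while(|&c| is_ident_continue(c)).count();

    // Lookahead: the rule does not match if the identifier is immediately
    // followed by another single quote.
    if rest[ident_len..].starts_with('\'') {
        return None;
    }

    // Opening quote plus the identifier (all ASCII, so chars == bytes).
    Some(1 + ident_len)
}
```

With this rule, `lifetime_token_len("'ab'c")` returns `None`, so the input cannot be split into two lifetime-like tokens, and `lifetime_token_len("'a'")` also returns `None`, which leaves the character-literal rule as the only candidate for that input.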