Clean up ast::Attribute, ast::CrateConfig, and string interning#37824
Clean up ast::Attribute, ast::CrateConfig, and string interning#37824bors merged 12 commits intorust-lang:masterfrom
ast::Attribute, ast::CrateConfig, and string interning#37824Conversation
eddyb
left a comment
There was a problem hiding this comment.
LGTM modulo comments. cc @rust-lang/compiler
src/librustc/hir/map/definitions.rs
Outdated
There was a problem hiding this comment.
I think that these are InternedString so they hash their contents instead of the integer.
There was a problem hiding this comment.
Good point, I got overzealous with the InternedString -> Symbol refactoring.
There was a problem hiding this comment.
Maybe make a type SymbolStringContent?
src/librustc/ty/sty.rs
Outdated
There was a problem hiding this comment.
Same here, the intention is to sort the strings themselves.
src/libsyntax/symbol.rs
Outdated
520c3c9 to
fc519bb
Compare
src/libsyntax/symbol.rs
Outdated
There was a problem hiding this comment.
This is a suspicious line.
InternedString is supposed to be simply a wrapper around Rc<str> now not tied to symbol interner, right? InternedString::new for example just creates a new Rc and intern_and_get_ident was removed by this PR.
At the same time decode not only creates an Rc but also pushes it into the interner.
One of them is probably wrong.
There was a problem hiding this comment.
InternedString::new should probably go away.
There was a problem hiding this comment.
It depends on what the remaining uses of InternedString are.
Is it used only for hashing/encoding/etc Symbols as strings? In this case something like SymbolStringContent is probably a better solution and InternedString should go away completely, including new.
There was a problem hiding this comment.
After this PR, InternedString should be semantically equivalent to SymbolStringContent -- we can rename later. I don't think InternedString is that bad of a name.
There was a problem hiding this comment.
I removed InternedString::new.
|
The one other thing I'd like to see is a |
|
☔ The latest upstream changes (presumably #37732) made this pull request unmergeable. Please resolve the merge conflicts. |
be6cd9d to
fbb1713
Compare
|
@eddyb this should be ready to land now -- the second commit and the last three commits new since last review. |
src/librustc/ty/context.rs
Outdated
There was a problem hiding this comment.
Can't this and crate_name above be Symbol?
src/librustc/ty/mod.rs
Outdated
There was a problem hiding this comment.
Seems like original_crate_name should return a Symbol.
src/librustc_driver/lib.rs
Outdated
There was a problem hiding this comment.
Hmm, doesn't this expose HashSet ordering? Even if you were to use a BTreeSet it would expose intern order. Maybe collect strings instead of printing them, to a Vec and just sort that?
src/librustc_metadata/creader.rs
Outdated
There was a problem hiding this comment.
Can't this be written as .to_string() directly?
There was a problem hiding this comment.
Yeah, but no longer relevant after using Symbol instead of String in NativeLibrary.
src/librustc_trans/mir/block.rs
Outdated
There was a problem hiding this comment.
Might make sense to change loc.file.name to Symbol - in fact, any String in the compiler that is not on some transient and/or error path, may benefit from it.
There was a problem hiding this comment.
Agreed, but best for a future PR.
src/libsyntax/symbol.rs
Outdated
There was a problem hiding this comment.
Maybe add a comment about its thread locality.
src/libsyntax/symbol.rs
Outdated
There was a problem hiding this comment.
The pub field is dangerous, you'll have to privatize it before attempting any optimizations (if the fallout is minimal, maybe do it here).
src/libsyntax/symbol.rs
Outdated
There was a problem hiding this comment.
My comment is no longer relevant (we switched from returning Rc to taking a closure).
src/libsyntax/symbol.rs
Outdated
There was a problem hiding this comment.
This is unsound: you keep references to the string contents around.
There was a problem hiding this comment.
I made reset_interner unsafe. It isn't used in rustc itself, but rusti wants to be able to reset thread local state.
There was a problem hiding this comment.
Is there a real point to that? The interner would just have a "warm start".
There was a problem hiding this comment.
Looks like rusti isn't using driver::reset_thread_local_state at all anymore, so probably not (removed).
src/libsyntax/symbol.rs
Outdated
There was a problem hiding this comment.
Eventual optimization opportunity: store the CodeMap in the interner, guarantee this "symbol string contents" type always points to something owned by the interner, use it in place of &str in the lexer, with appropriate slicing & whatnot, and allow interning it without allocating again - although this requires the arena-based interner too.
|
cc @rust-lang/compiler |
6698d2c to
5400c4f
Compare
|
@bors r+ |
|
📌 Commit 5400c4f has been approved by |
|
⌛ Testing commit 5400c4f with merge 531bd42... |
|
@bors r=eddyb |
|
📌 Commit 60b5372 has been approved by |
|
⌛ Testing commit 60b5372 with merge a56532f... |
|
💔 Test failed - auto-win-msvc-64-opt-rustbuild |
|
@bors r=eddyb |
|
📌 Commit a8e86f0 has been approved by |
Clean up `ast::Attribute`, `ast::CrateConfig`, and string interning This PR - removes `ast::Attribute_` (changing `Attribute` from `Spanned<Attribute_>` to a struct), - moves a `MetaItem`'s name from the `MetaItemKind` variants to a field of `MetaItem`, - avoids needlessly wrapping `ast::MetaItem` with `P`, - moves string interning into `syntax::symbol` (`ast::Name` is a reexport of `symbol::Symbol` for now), - replaces `InternedString` with `Symbol` in the AST, HIR, and various other places, and - refactors `ast::CrateConfig` from a `Vec` to a `HashSet`. r? @eddyb
“Clean up `ast::Attribute`, `ast::CrateConfig`, and string interning”
Resolves these errors:
error[E0425]: unresolved name `token::intern`
--> src/lib.rs:27:35
|
27 | reg.register_syntax_extension(token::intern("quickcheck"),
| ^^^^^^^^^^^^^ unresolved name
error[E0425]: unresolved name `token::str_to_ident`
--> src/lib.rs:86:23
|
86 | let check_ident = token::str_to_ident("quickcheck");
| ^^^^^^^^^^^^^^^^^^^ unresolved name
error[E0425]: unresolved name `token::intern_and_get_ident`
--> src/lib.rs:107:34
|
107 | span, cx.meta_word(span, token::intern_and_get_ident("test"))));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ unresolved name
error: aborting due to 3 previous errors
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
“Clean up `ast::Attribute`, `ast::CrateConfig`, and string interning”
Resolves these errors:
error[E0425]: unresolved name `token::intern`
--> src/lib.rs:27:35
|
27 | reg.register_syntax_extension(token::intern("quickcheck"),
| ^^^^^^^^^^^^^ unresolved name
error[E0425]: unresolved name `token::str_to_ident`
--> src/lib.rs:86:23
|
86 | let check_ident = token::str_to_ident("quickcheck");
| ^^^^^^^^^^^^^^^^^^^ unresolved name
error[E0425]: unresolved name `token::intern_and_get_ident`
--> src/lib.rs:107:34
|
107 | span, cx.meta_word(span, token::intern_and_get_ident("test"))));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ unresolved name
error: aborting due to 3 previous errors
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This PR
ast::Attribute_(changingAttributefromSpanned<Attribute_>to a struct),MetaItem's name from theMetaItemKindvariants to a field ofMetaItem,ast::MetaItemwithP,syntax::symbol(ast::Nameis a reexport ofsymbol::Symbolfor now),InternedStringwithSymbolin the AST, HIR, and various other places, andast::CrateConfigfrom aVecto aHashSet.r? @eddyb