wit-parser: Remove duplicate spans from UnresolvedPackage#2441
wit-parser: Remove duplicate spans from UnresolvedPackage#2441alexcrichton merged 3 commits intobytecodealliance:mainfrom
wit-parser: Remove duplicate spans from UnresolvedPackage#2441Conversation
encapsulate world item span adjustment simplify highlighting update other crates to build/test correctly
alexcrichton
left a comment
There was a problem hiding this comment.
This looks great to me, thank you again for your work on this it's super appreciated! Two minor comments below, but otherwise happy to merge this 👍
| "type": 4 | ||
| "type": { | ||
| "id": 4 | ||
| } |
There was a problem hiding this comment.
Do you know how difficult it would be to configure serde to preserve the old encoding? The JSON output is known to be unstable-ish but if we can avoid changing it with a bit of configuration that's also worth it I think. If it's too difficult to preserve, though (e.g. requiring complicated custom impls) then this is fine to keep
There was a problem hiding this comment.
It required a custom variant serializer but it's just a simple wrapper on the existing serialize_id that just skips the span parameter. So I think this is ok.
| pub enum Span { | ||
| /// No source location available (e.g., decoded from binary). | ||
| #[default] | ||
| Unknown, | ||
| /// A range in the source text. | ||
| Range { | ||
| /// The start of the range. | ||
| start: u32, | ||
| /// The end of the range (exclusive). | ||
| end: u32, | ||
| }, | ||
| } |
There was a problem hiding this comment.
In the interest of keeping this structure small-ish (8-bytes instead of this current representation of 12-bytes), could this retain the Span { u32, u32 } field representation? My thinking is then:
- The
startandendfields would become private, retaining the function-based accessors. Span::UNKNOWNwould beSpan { start: u32::MAX, end: u32::MAX }which is a sentinel valueis_knownwould test for start/end being not equalu32::MAX- There'd be a
Span::new(a, b)constructor which would assert a/b are not both equal tou32::MAXand then would return a "real span"
There was a problem hiding this comment.
Yes I like this a lot better. Let me know if you want any changes to the assertions made in the impl Span.
|
Looks great, thanks! |
#2418
This PR removes redundant parallel span vectors from
UnresolvedPackageby storing spans directly on the items they describe. This follows up on #2419 which added span fields toInterface,World,TypeDef, andFunction.I recognize that this is a relatively large change so if there is a desire to split it up into separate pieces of work I am amenable to that.
Removed from UnresolvedPackage
interface_spans: Vec<InterfaceSpan>- TheInterfaceSpanstored a span for itself and spans for the functions contained within it. These are now stored inInterface.spanandFunction.spanrespectively.world_spans: Vec<WorldSpan>- TheWorldSpanstruct stored a span for itself and forimports,exports, andincludes. World spans now come fromWorld.span. Import/export spans are now stored directly onWorldItemvariants. Include spans are stored in the newWorldIncludestruct.type_spans: Vec<Span>- Type definition spans now come fromTypeDef.span.The following span storage remains in
UnresolvedPackageand could potentially be addressed in follow-up work:package_name_span: Span- Could be moved to a newspanfield onPackageNameunknown_type_spans: Vec<Span>- For unresolved type references during parsingforeign_dep_spans: Vec<Span>- For foreign dependency referencesrequired_resource_types: Vec<(TypeId, Span)>- For resource type requirementsThese were left as-is to keep this PR focused on the main deduplication work.
Structural changes
<Item>.span- All item span properties were changed fromOption<Span>toSpan(usingSpan::Unknownfor the empty case) primarily to avoid a bunch of expect/unwraps in AST code which is guaranteed to have range spans. Those hypothetical unwraps are now relegated to panics in thestart/endgetters which are guarded byis_knowncalls in a few specific cases. IfOption<Span>is the preferred representation then I can change this back. Besides reverting, another option could be to just return0; however, this could potentially mask logical errors.WorldItemenum - Addedspan: Spanfield to theInterfaceandTypevariants. ChangedType(TypeId)toType { id: TypeId, span: Span }.WorldIncludestruct - New struct that consolidatesstability,id,names, andspaninto a single type. ReplacesWorld.includes: Vec<(Stability, WorldId)>andWorld.include_names: Vec<Vec<IncludeName>>.SourceMap::highlight_span- New method that takes aSpandirectly and returnsOption<String>, handling theis_known()check internally. Cleaning up error highlighting was motivated by the change fromOption<Span>toSpanbecause I wanted to centralize and reduce the number of calls tois_known.Changes in other crates
Pattern matches on
WorldItemneeded updating in:wit-component- encoding, validation, printing, metadatawit-dylib- world item iterationwit-encoder- conversion from parser typesThese changes are mechanical - updating
WorldItem::Type(id)toWorldItem::Type { id, .. }and adding..toWorldItem::Interfacepatterns.Additionally some test fixtures needed to be regenerated to pick up the new structure of the types.