Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Chore: refactoring occurrence indexing#63473

Merged
camdencheek merged 5 commits intomainfrom
cc/refactor-occurrences
Jun 28, 2024
Merged

Chore: refactoring occurrence indexing#63473
camdencheek merged 5 commits intomainfrom
cc/refactor-occurrences

Conversation

@camdencheek
Copy link
Member

@camdencheek camdencheek commented Jun 25, 2024

This is a bit of refactoring as a followup to #63217. The goal is to unify how we use Occurrences so that we can use the same interface for both syntax highlighting data and occurrence data from the new API. An additional goal is to encode more of the invariants in the type system so it's more difficult to use incorrectly.

The primary entrypoint to reviewing this PR is the new OccurrenceIndex class, which extends an array of occurrences to add a line index, enforce the non-overlapping and sorted invariants, and add helper methods for common operations using the index.

Stacked on #63217 (and thus blocked by that)

Test plan

Existing unit tests (though there are not many 😬), and manually compared the behavior between S2 and local for a variety of files with different indexing states, file types, and highlighting availability. There was no visible change in behavior.

@cla-bot cla-bot bot added the cla-signed label Jun 25, 2024
Comment on lines 536 to 557
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to OccurrenceIndex.next

Comment on lines 14 to 16
Copy link
Member Author

@camdencheek camdencheek Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved onto the IndexedCodeGraphData type which extends the type from GraphQL so CodeGraphData better matches the API shape

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added OccurrenceIndex to capture the invariants required by the line index and the various functions that operate on the index. Conveniently, this type can be generic over both syntax highlighting occurrences and occurrences from the code intel API.

@camdencheek camdencheek force-pushed the cc/refactor-occurrences branch 2 times, most recently from 544d404 to 8d4c12f Compare June 25, 2024 19:11
Comment on lines 78 to 97
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly moved to OccurrenceIndex, inlined the remainder

Comment on lines 114 to 145
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to OccurrenceIndex

Comment on lines 6 to 35
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to where we filter to interactive syntax occurrences

Comment on lines 57 to 76
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced by OccurrenceIndex.atPosition()

@camdencheek camdencheek marked this pull request as ready for review June 25, 2024 19:36
@camdencheek camdencheek requested a review from a team June 25, 2024 20:00
Copy link
Contributor

@fkling fkling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be a clever way to write

let low = 0
let high = this.length

Don't know if engines can optimize this to not allocate an array. But maybe it's not a big deal either.

FWIW, if you want to have a single let statement you could also just write

let low = 0, high = this.length

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, yes. I shouldn't be clever 🙂

Comment on lines 146 to 150
Copy link
Contributor

@fkling fkling Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could go even further to unify this: Have a separate facet for interactive occurrences, have codeGraphData and syntaxHighlight both contribute to that facet, and use CodeMirror's built-in extension priority system to determine the order of the data (which also works for facet (inputs) because they are extensions too). We can decide whether the facet should "return" all provided occurrences or just return the first (high pri) one.

Then this could look like this:

const interactiveOccurrences = Facet.define<OccurrenceIndex, OccurrenceIndex>({
  combine(values) {
    return values[0] ?? new OccurrenceIndex([])
  },
})

Then we can just use it as

const occurrences = state.facet(interactiveOccurrences)

And the input would be provided with something like this:

// Not sure we still need to have the facet map `CodeGraphData` to `IndexCodeGraphData` if we do that
export const codeGraphData = Facet.define<CodeGraphData[], IndexedCodeGraphData[]>({
    static: true,
    combine: values =>
        values[0]?.map(data => ({ ...data, occurrenceIndex: new OccurrenceIndex(data.occurrences) })) ?? [],
    })
    // Maybe we want to only use precise data here, idk
    enable: facet => interactiveOccurrences.computeN([facet], state => state.facet(facet).map(datum => datum.occurrences)),
})
// Similar for syntaxHighlight

Then we just need to ensure that the codeGraphData facet has a higher priority than the syntaxHighlight one.


Side note: This is how the whole hovercard system works too. There is a single 'show hovercard' facet and multiple extensions that provide input for it: Actual hover (with mouse), keyboard activated hovercard and pinned hovecards. Those sources are ordered by priority and the facet simply picks the first one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brilliant! This was the direction I was attempting to move us in, but didn't quite get that far. Will leave this as a follow up, but totally agreed on the vision.

Base automatically changed from cc/occurrences-api to main June 28, 2024 00:26
@camdencheek camdencheek force-pushed the cc/refactor-occurrences branch from 804666c to cfb2298 Compare June 28, 2024 00:42
@camdencheek camdencheek enabled auto-merge (squash) June 28, 2024 00:43
@camdencheek camdencheek disabled auto-merge June 28, 2024 00:43
@camdencheek camdencheek enabled auto-merge (squash) June 28, 2024 00:43
@camdencheek camdencheek merged commit 6fd099e into main Jun 28, 2024
@camdencheek camdencheek deleted the cc/refactor-occurrences branch June 28, 2024 02:46
camdencheek added a commit that referenced this pull request Jun 28, 2024
This implements the debug view for code intel ranges. Since we're doing
work here, it's very useful to be able scan the info while exploratory
testing and debugging.

Note that this does _not_ use the snapshot API. After #63473, everything
uses occurrences, so rather than rely on another API request, the only
argument to the debug extension is a set of occurrences. This is
particularly nice because it would be very easy to do things like:
- Show the occurrences as calculated from syntax highlighting data
- Show the occurrences before and after we make them non-overlapping
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants