Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Adds function to find candidate occurrences via search#63196

Merged
kritzcreek merged 16 commits intomainfrom
christoph/candidate-occurrences-via-search
Jun 14, 2024
Merged

Adds function to find candidate occurrences via search#63196
kritzcreek merged 16 commits intomainfrom
christoph/candidate-occurrences-via-search

Conversation

@kritzcreek
Copy link
Contributor

@kritzcreek kritzcreek commented Jun 11, 2024

Closes https://linear.app/sourcegraph/issue/GRAPH-640/pre-filter-candidate-files-by-consulting-zoektsearcher

I'm wondering what to do about o11y here. Should I be adding something to the operations, or some more logging?

Test plan

Easiest way to test right now is to generate a syntactic index, upload it, pull up the site-admin api console and hit the occurrences API endpoint with:

query {
  usagesForSymbol(range: {
    repository: "github.com/sourcegraph/sourcegraph"
    path: "docker-images/syntax-highlighter/crates/syntax-analysis/testdata/globals.java"
    revision: "...."
    start: {
      line: 21,
      character: 17
    }, end: {
      line: 21,
      character: 30
    }
  }) {
    nodes {
      usageRange {
        repository,
        revision,
        path,
        range {
          start {
            line,
            character
          }
          end {
            line,
            character
          }
        }
      }
      symbol {
        name,
        provenance
      }
    }
  }
}

@kritzcreek kritzcreek requested a review from a team June 11, 2024 09:42
@cla-bot cla-bot bot added the cla-signed label Jun 11, 2024
@github-actions github-actions bot added team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) team/product-platform labels Jun 11, 2024
@kritzcreek kritzcreek removed the request for review from a team June 11, 2024 10:16
@kritzcreek kritzcreek marked this pull request as draft June 11, 2024 10:16
@kritzcreek kritzcreek force-pushed the christoph/candidate-occurrences-via-search branch from 3b4fcea to 377fdbb Compare June 13, 2024 08:09
@kritzcreek kritzcreek marked this pull request as ready for review June 13, 2024 08:35
@kritzcreek kritzcreek requested a review from a team June 13, 2024 08:46
Comment on lines +1020 to +1023
// Overlapping occurrences should lead to the same display name, but be scored separately.
// (Meaning we just need a single Searcher/Zoekt search)
// TODO: Assert this?
searchSymbol := symbols[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand what this means, but if we're only picking the first element of the slice, maybe the function getSyntacticSymbolsAtRange shouldn't be returning a slice?

Comment on lines +51 to +57
_, err = client.Execute(ctx, stream, plan)
if err != nil {
return nil, err
}
Copy link
Contributor

@varungandhi-src varungandhi-src Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should handle a context timeout error in a special way here (or in the outermost loop maybe?) 🤔

@varungandhi-src
Copy link
Contributor

I'm wondering what to do about o11y here. Should I be adding something to the operations, or some more logging?

We should add an operation for each public method (at least, that's the pattern we're following for now). So yes, we should have an operation for SyntacticUsages. For the cross-service call, the search code will handle creating a span, so we're good there. However, it's generally useful to attach extra information about the results to a final event. How many files were matched, total number of matches etc. We should also attach an event related to the query details, it looks like the Execute function doesn't attach any information related to that.

@kritzcreek kritzcreek force-pushed the christoph/candidate-occurrences-via-search branch from a2d5f6c to debb7ba Compare June 13, 2024 13:49
@kritzcreek
Copy link
Contributor Author

I handled most of the feedback, but decided after a while that some of it is not actually required for the MVP stage. I've filed a follow-up issue for me to come back for some of these https://linear.app/sourcegraph/issue/GRAPH-683/syntactic-usages-tech-debt

@kritzcreek kritzcreek force-pushed the christoph/candidate-occurrences-via-search branch from fb34473 to 286f9b0 Compare June 14, 2024 07:38
@kritzcreek kritzcreek enabled auto-merge (squash) June 14, 2024 07:48
Co-authored-by: Varun Gandhi <varun.gandhi@sourcegraph.com>
Copy link
Contributor

@varungandhi-src varungandhi-src left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do another pass after an hour or so.

Comment on lines +1042 to +1050
) ([]struct{}, *SyntacticUsagesError) {
// The `nil` in the second argument is here, because `With` does not work with custom error types.
ctx, trace, endObservation := s.operations.syntacticUsages.With(ctx, nil, observation.Args{Attrs: []attribute.KeyValue{
attribute.Int("repoId", int(repo.ID)),
attribute.String("commit", string(commit)),
attribute.String("path", path),
attribute.String("symbolRange", symbolRange.String()),
}})
defer endObservation(1, observation.Args{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked into this; it looks like endObservation will log the error if we pass in a proper pointer here. So if we really want to return SyntacticUsageError from this function, then we can do the following sequence.

Suggested change
) ([]struct{}, *SyntacticUsagesError) {
// The `nil` in the second argument is here, because `With` does not work with custom error types.
ctx, trace, endObservation := s.operations.syntacticUsages.With(ctx, nil, observation.Args{Attrs: []attribute.KeyValue{
attribute.Int("repoId", int(repo.ID)),
attribute.String("commit", string(commit)),
attribute.String("path", path),
attribute.String("symbolRange", symbolRange.String()),
}})
defer endObservation(1, observation.Args{})
) (_ []struct{}, err *SyntacticUsagesError) {
var erasedError error
// The `nil` in the second argument is here, because `With` does not work with custom error types.
ctx, trace, endObservation := s.operations.syntacticUsages.With(ctx, &erasedError, observation.Args{Attrs: []attribute.KeyValue{
attribute.Int("repoId", int(repo.ID)),
attribute.String("commit", string(commit)),
attribute.String("path", path),
attribute.String("symbolRange", symbolRange.String()),
}})
defer endObservation(1, observation.Args{})
defer func(){ if err != nil { erasedError = *err } }()

// (Meaning we just need a single Searcher/Zoekt search)
searchSymbol := symbolsAtRange[0]

langs, langErr := languages.GetLanguages(path, func() ([]byte, error) { return nil, errors.New("Ambiguous language") })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The function argument allows returning an error because content-fetching might return an error. If we don't have access to the content here, you can pass nil for the second argument. So there won't be any error returned from the API.
  • If there are multiple languages returned, we should do an AND instead of taking the first one.

@kritzcreek kritzcreek disabled auto-merge June 14, 2024 10:37
Co-authored-by: Varun Gandhi <varun.gandhi@sourcegraph.com>
Copy link
Contributor

@varungandhi-src varungandhi-src left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the tech debt issue with more details. I'm OK with merging this as-is for now, but I don't want us to rush stuff like this purely for demo day, we can share a demo in Slack outside of demo day too.

@kritzcreek kritzcreek merged commit f5facfd into main Jun 14, 2024
@kritzcreek kritzcreek deleted the christoph/candidate-occurrences-via-search branch June 14, 2024 10:57
@kritzcreek
Copy link
Contributor Author

but I don't want us to rush stuff like this purely for demo day

Nothing about demo day and everything about the MVP milestone. This code doesn't need to deal with language ambiguity to check its viability, but that's what I want to be laser-focused on.

@kritzcreek
Copy link
Contributor Author

Just want to clarify, I'm not saying I don't appreciate the thorough review, feedback, and discussion. I read every comment and definitely intend to get back to and address the rest of it.
I just decided that the bar for things I want to do right now has to impact the MVP aspect of the project.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) team/product-platform

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants