Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

own: index recent contributors for ownership signals#51188

Merged
coury-clark merged 34 commits intomainfrom
cclark/own/wire-up-indexer
May 3, 2023
Merged

own: index recent contributors for ownership signals#51188
coury-clark merged 34 commits intomainfrom
cclark/own/wire-up-indexer

Conversation

@coury-clark
Copy link
Contributor

@coury-clark coury-clark commented Apr 26, 2023

This PR hooks up the recent contributors indexer with the background job, effectively enabling indexing once the feature flag is switched on.

Test plan

You can start Sourcegraph and let the jobs auto-index, or you can manually control the queue through the own_background_jobs table.

Here is some example output auto-indexing the local dev repos:

[worker] INFO worker.own-repo-indexing-queue.own.background.index.scheduler.recent-contributors background/scheduler.go:84 Scheduling repo indexes for own job {"job-name": "recent-contributors"}
[worker] INFO worker.own-repo-indexing-queue.own.background.index.scheduler.recent-contributors background/scheduler.go:96 Own index job scheduled {"job-name": "recent-contributors", "row-count": 11}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"TraceId": "1e151b9be9a6cb308ab77719c6ad54c0", "SpanId": "e738c4f0d1af8d29", "name": "own_background_worker", "id": 123}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"name": "own_background_worker", "id": 124}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"TraceId": "4a857e335aa28ba2717c951cfff07ba5", "SpanId": "04e3ff4d85cb72bc", "name": "own_background_worker", "id": 125}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"name": "own_background_worker", "id": 126}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"TraceId": "e1b51f47de0dcbfdebd14bfc346f60a2", "SpanId": "0384937856ac05a4", "name": "own_background_worker", "id": 127}
[worker] INFO worker.own-repo-indexing-queue.Handle background/recent_contributors.go:67 commits inserted {"TraceId": "e1b51f47de0dcbfdebd14bfc346f60a2", "SpanId": "4c39331206218a30", "handle": {"count": 0, "repo_id": 6}}
[worker] INFO worker.own-repo-indexing-queue.Handle background/recent_contributors.go:67 commits inserted {"TraceId": "4a857e335aa28ba2717c951cfff07ba5", "SpanId": "f7bbf2aa07a3aef2", "handle": {"count": 0, "repo_id": 5}}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"name": "own_background_worker", "id": 128}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"name": "own_background_worker", "id": 129}
[worker] INFO worker.own-repo-indexing-queue.Handle background/recent_contributors.go:67 commits inserted {"handle": {"count": 0, "repo_id": 4}}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"name": "own_background_worker", "id": 130}
[worker] INFO worker.own-repo-indexing-queue.Handle background/recent_contributors.go:67 commits inserted {"handle": {"count": 0, "repo_id": 8}}
[worker] INFO worker.own-repo-indexing-queue.Handle background/recent_contributors.go:67 commits inserted {"handle": {"count": 1, "repo_id": 1}}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"name": "own_background_worker", "id": 131}
[worker] INFO worker.own-repo-indexing-queue workerutil/worker.go:326 Dequeued record for processing {"name": "own_background_worker", "id": 132}
[worker] INFO worker.own-repo-indexing-queue.Handle background/recent_contributors.go:67 commits inserted {"TraceId": "1e151b9be9a6cb308ab77719c6ad54c0", "SpanId": "b3615fd2c3989b98", "handle": {"count": 3, "repo_id": 11}}
[worker] INFO worker.own-repo-indexing-queue.Handle background/recent_contributors.go:67 commits inserted {"handle": {"count": 0, "repo_id": 9}}

@cla-bot cla-bot bot added the cla-signed label Apr 26, 2023
@coury-clark coury-clark self-assigned this May 1, 2023
@coury-clark coury-clark requested a review from cbart May 1, 2023 22:22
@coury-clark coury-clark requested a review from sashaostrikov May 1, 2023 22:22
"github.com/sourcegraph/sourcegraph/internal/types"
)

func Test_RecentContributorIndexFromGitserver(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is largely the same test from the store, but feeding a mock gitserver through to the DB. It effectively tests the entire flow, except the queue.

}

func (c *clientImplementor) CommitLog(ctx context.Context, repo api.RepoName, after time.Time) ([]CommitLog, error) {
args := []string{"log", "--pretty=format:%H<!>%ae<!>%an<!>%ad", "--name-only", "--topo-order", "--no-merges"}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added no merges, does that match your thinking @cbart?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes total sense to me

@coury-clark coury-clark changed the title cclark/own/wire up indexer own: index recent contributors for ownership signals May 1, 2023
@coury-clark coury-clark marked this pull request as ready for review May 1, 2023 23:24
@sourcegraph-bot
Copy link
Contributor

sourcegraph-bot commented May 1, 2023

Codenotify: Notifying subscribers in CODENOTIFY files for diff 931d6ad...afdd075.

Notify File(s)
@indradhanush internal/gitserver/client.go
internal/gitserver/commands.go
internal/gitserver/commands_test.go
internal/gitserver/mocks_temp.go
@sashaostrikov internal/gitserver/client.go
internal/gitserver/commands.go
internal/gitserver/commands_test.go
internal/gitserver/mocks_temp.go

Copy link
Contributor

@sashaostrikov sashaostrikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I left some comments, I guess the main thing is to add some test coverage for a new gitserver command

}

func (h *handler) Handle(ctx context.Context, logger log.Logger, record *Job) error {
func (h *handler) Handle(ctx context.Context, lgr log.Logger, record *Job) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we have a bit of inconsistent logger naming, in structs it is logger, in function parameters I can see lgr, should we unify it?

I guess lgr is to fix the prefix of logger clashing with the imported log package? At least this is what Goland usually complains to me about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, when you have local variables that shadow the package name. It's annoying because sometimes we have the import alias log and other times logger, so I've been trying something new with lgr to always have a consistent import logger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, makes sense

}

func (c *clientImplementor) CommitLog(ctx context.Context, repo api.RepoName, after time.Time) ([]CommitLog, error) {
args := []string{"log", "--pretty=format:%H<!>%ae<!>%an<!>%ad", "--name-only", "--topo-order", "--no-merges"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes total sense to me

}

func (c *clientImplementor) CommitLog(ctx context.Context, repo api.RepoName, after time.Time) ([]CommitLog, error) {
args := []string{"log", "--pretty=format:%H<!>%ae<!>%an<!>%ad", "--name-only", "--topo-order", "--no-merges"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive by review

Have we considered benchmarking this function to see how this would perform for monorepos with 500k to 1M commits? Asking because we recently saw a bunch of customer issues where load was the primary bottleneck and it seems like this command can be pretty open ended and in the worst case list all the repo's commits?

I'm trying to anticipate problems that could occur in the future but if you all don't see an issue here - feel free to ignore.

Also the benchmarking (if it makes sense) is not a blocking review for this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@indradhanush yep, definitely something we are going to test once this can go onto a customer-like instance. Most likely we will end up paging this anyway so we aren't always running the log from HEAD (this is how Code Insights works as well, we log from older commits to get reference hashes)

coury-clark and others added 4 commits May 2, 2023 10:42
Co-authored-by: Alex Ostrikov <alex.ostrikov@sourcegraph.com>
Co-authored-by: Alex Ostrikov <alex.ostrikov@sourcegraph.com>
@coury-clark coury-clark requested a review from sashaostrikov May 2, 2023 20:55
})
}

func Test_CommitLog(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀🚀🚀

Copy link
Contributor

@sashaostrikov sashaostrikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit::shipit::shipit:

@coury-clark coury-clark merged commit c08e946 into main May 3, 2023
@coury-clark coury-clark deleted the cclark/own/wire-up-indexer branch May 3, 2023 14:22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants