Support shallow clones with Git #772
Merged
ahal merged 10 commits into taskcluster:main on Oct 17, 2025
jcristau
reviewed
Oct 9, 2025
jcristau
approved these changes
Oct 17, 2025
src/taskgraph/run-task/run-task
Outdated
    # If we have a shallow clone and specific commit, we need to fetch it too.
    if shallow and head_rev and head_rev != head_ref:
        git_fetch(
Contributor
nit: ideally we'd call git fetch just once against the head repo, i.e. combine this with the head_ref fetch
jcristau
approved these changes
Oct 17, 2025
    if not targets or shallow:
        # If head_ref wasn't provided, we fallback to head_rev. If we have a
        # shallow clone, head_rev needs to be fetched independently regardless.
        targets.append(head_rev)
Contributor
Should we assert somewhere that if shallow is True then we have a head_rev?
Collaborator
Author
Huh, good point. I guess a head_rev isn't necessary for shallow clones either though.. I'll fix this up.
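One way to address both review comments above is to build the list of fetch targets up front, skip any value that is missing, and pass the whole list to a single fetch. The following is a minimal sketch, not the code that landed in this PR; `collect_fetch_targets` is a hypothetical helper, and the exact rules it applies are assumptions drawn from the snippets quoted in the review.

```python
from typing import List, Optional


def collect_fetch_targets(
    head_ref: Optional[str],
    head_rev: Optional[str],
    shallow: bool,
) -> List[str]:
    """Return the refs/revisions to request in one fetch (hypothetical helper)."""
    targets: List[str] = []
    if head_ref:
        targets.append(head_ref)
    # Without a ref we fall back to the revision. In a shallow clone the
    # revision is requested as well, since it may not be the tip of the ref.
    # A missing head_rev is simply skipped rather than treated as an error.
    if head_rev and (not targets or shallow) and head_rev not in targets:
        targets.append(head_rev)
    return targets
```

With this shape, a shallow clone without a `head_rev` still works: only `head_ref` is requested, and no `None` value ever reaches the fetch call.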
…head_rev

This makes the naming consistent with what we use in .taskcluster.yml and the rest of Taskgraph. Previously, I always had to look up where "ref" and "commit" / "revision" were coming from to double check they were the values I was expecting. This rename makes that much more obvious.
If the condition in the if statement is true, then we've already fetched ref from head_repo. There's no need to do so again.
BREAKING CHANGE: `base_ref` will no longer be fetched or checked out by run-task

Taskgraph uses base_rev anyway for computing files changed, so there's no need to additionally fetch base_ref. Some tasks may need to be updated to not rely on base_ref being present in the local clone.
BREAKING CHANGE: omitting `head_ref` no longer fetches all heads

Previously we were fetching all heads in this case so that we could then run `git checkout <head_rev>` successfully. But it's much faster to just explicitly fetch `<head_rev>` in the first place. This also refactors `git_fetch` to be able to fetch multiple targets at once.
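To illustrate what fetching multiple targets at once can look like, here is a rough sketch built on `git fetch <repository> <refspec>...` and the `--depth` option. It is not the actual `git_fetch` from run-task; `build_fetch_command` and `fetch` are hypothetical names, and the sketch assumes the server permits fetching a commit by its ID.

```python
import subprocess
from typing import List, Sequence


def build_fetch_command(
    remote: str, targets: Sequence[str], shallow: bool = False
) -> List[str]:
    """Build one `git fetch` invocation covering every target (hypothetical)."""
    cmd = ["git", "fetch"]
    if shallow:
        # Limit history to the tip commit of each requested target.
        cmd.append("--depth=1")
    cmd.append(remote)
    # Each target (branch name, full ref, or commit ID) becomes one refspec.
    cmd.extend(targets)
    return cmd


def fetch(
    destination: str, remote: str, targets: Sequence[str], shallow: bool = False
) -> None:
    """Run the fetch inside an existing local repository at `destination`."""
    subprocess.run(
        build_fetch_command(remote, targets, shallow),
        cwd=destination,
        check=True,
    )
```

Passing all targets in one command means a single round trip to the remote instead of one per target.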
This fixes the case where head_ref is passed in with a `refs/heads` prefix.
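The commit message does not say how the prefix is handled. One plausible approach, shown here purely as an assumption, is to normalize `head_ref` to a short branch name before using it; `short_branch_name` is a hypothetical helper, not a function from run-task.

```python
def short_branch_name(head_ref: str) -> str:
    """Strip a leading `refs/heads/` so both spellings name the same branch."""
    prefix = "refs/heads/"
    if head_ref.startswith(prefix):
        return head_ref[len(prefix):]
    # Other refs (tags, pull request refs, bare names) are left untouched.
    return head_ref
```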
Shallow clones yield a massive improvement to clone performance, at the expense of making it tricky to determine the files that were modified.
`git log BASE..HEAD` says: show me the commits reachable from HEAD but not reachable from BASE. In a shallow clone where we only fetch BASE and HEAD (which is what run-task does), this means the command will only return `HEAD`. In other words, we're only returning files changed by the tip commit of the push and ignoring everything else.

By switching to `git diff BASE HEAD`, we're instead comparing the snapshots of both revisions. Sometimes this is what we want, e.g. for force pushes, it'll be the interdiff of files modified between the two pushes (though some developers might expect it to contain the files modified since the merge base). Sometimes it's not what we want, e.g. for PRs, it'll be the files changed between the PR and the latest commit on `main`.

Either way, this behaviour is at least somewhat more accurate than `git log` when we don't have full history. Likely we'll need to fetch the proper changed files using the GitHub API in the future, but for now this is better than nothing.
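The two strategies described above can be compared side by side. This is a minimal sketch under stated assumptions, not Taskgraph's implementation: the function names are hypothetical, and it assumes both revisions already exist in the local clone.

```python
import subprocess
from typing import List


def log_command(base_rev: str, head_rev: str) -> List[str]:
    """History-based: files touched by commits in BASE..HEAD (needs history)."""
    return ["git", "log", "--name-only", "--pretty=format:", f"{base_rev}..{head_rev}"]


def diff_command(base_rev: str, head_rev: str) -> List[str]:
    """Snapshot-based: files that differ between the two trees."""
    return ["git", "diff", "--name-only", base_rev, head_rev]


def parse_name_only(output: str) -> List[str]:
    """Turn `--name-only` output into a sorted list of unique paths."""
    return sorted({line for line in output.splitlines() if line})


def files_changed(repo: str, base_rev: str, head_rev: str) -> List[str]:
    """Run the snapshot comparison inside the repository at `repo`."""
    result = subprocess.run(
        diff_command(base_rev, head_rev),
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)
```

The snapshot comparison only needs the two trees, which is why it still gives a usable answer in a shallow clone where the intermediate commits are absent.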