-1

I have a Git repository URL and a branch name.

Using GitPython, how do I get all the commits from the branch?

1
  • What worked for me: commits = list(repo.iter_commits(f"{branch_name}~3..{branch_name}", max_count=3)) Commented Jan 24 at 9:07

3 Answers

1

From https://gitpython.readthedocs.io/en/stable/tutorial.html

Meet the Repo type

The first step is to create a git.Repo object to represent your repository.

from git import Repo

# rorepo is a Repo instance pointing to the git-python repository.
# For all you know, the first argument to Repo is a path to the repository
# you want to work with
repo = Repo(self.rorepo.working_tree_dir)
assert not repo.bare

In the above example, the directory self.rorepo.working_tree_dir equals /Users/mtrier/Development/git-python and is my working repository which contains the .git directory. You can also initialize GitPython with a bare repository.

...

...

The Commit object

Commit objects contain information about a specific commit. Obtain commits using references as done in Examining References or as follows.

Obtain commits at the specified revision

repo.commit('master')
repo.commit('v0.8.1')
repo.commit('HEAD~10')

...

I suggest reading the tutorial quoted above, or at least its entire "The Commit object" section (https://gitpython.readthedocs.io/en/stable/tutorial.html#the-commit-object).


2 Comments

How do you do this without the repo existing on your local disk? I don't want to clone the repo, just read the branch commits from the GitHub repository.
@mezamorphic Cloning the repo first would probably be easiest. If you want to keep using only GitPython, check what the tutorial says about remotes in the Advanced Repo Usage part of the Meet the Repo type section, and also this relevant documentation: gitpython.readthedocs.io/en/stable/…
0
  1. in shell: clone remote to local repo

    git clone url
    
  2. in Python: list all commits

    import git

    r = git.Repo("path/to/directory")
    all_commits = list(r.iter_commits(branch_name))
    

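If you only want the commits that differ between your branch and another, you can use

all_commits = list(r.iter_commits(f"{base_branch}...{branch_name}"))

instead. Note that the three-dot form selects commits reachable from either branch but not from both. To get only the commits that are on branch_name and not on base_branch, use two dots: f"{base_branch}..{branch_name}".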


0

If you want, for example, monthly commit counts per author, you can use something like this:

import git
from datetime import datetime

def get_commits_by_month(repo_path):
    try:
        # Initialize the Git repository object
        repo = git.Repo(repo_path)

        # Dictionary to store commits: {user: {year_month: count}}
        commits_by_user = {}

        # Get all commits from the repository
        commits = list(repo.iter_commits())

        # Process each commit
        for commit in commits:
            # Get author name and commit date
            author = commit.author.name
            commit_date = datetime.fromtimestamp(commit.authored_date)
            # Format year-month key (e.g., "2023-02")
            year_month = commit_date.strftime("%Y-%m")

            # Increment counter for this user and month
            if author not in commits_by_user:
                commits_by_user[author] = {}
            if year_month not in commits_by_user[author]:
                commits_by_user[author][year_month] = 0
            commits_by_user[author][year_month] += 1

        # Print results sorted by author
        print("\nCommit Counts by User and Month:")
        print("-" * 40)

        for author in sorted(commits_by_user.keys()):
            print(f"\nAuthor: {author}")
            # Sort months chronologically
            for year_month in sorted(commits_by_user[author].keys()):
                count = commits_by_user[author][year_month]
                print(f"  {year_month}: {count} commits")

            # Print total for this author
            total_commits = sum(commits_by_user[author].values())
            print(f"  Total commits: {total_commits}")

        # Print overall statistics
        total_all_commits = sum(sum(months.values()) for months in commits_by_user.values())
        print("\n" + "-" * 40)
        print(f"Total commits across all users: {total_all_commits}")
        print(f"Number of contributors: {len(commits_by_user)}")

    except git.exc.InvalidGitRepositoryError:
        print(f"Error: '{repo_path}' is not a valid Git repository")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

Feel free to adapt this to however you want to use the commits.

This is another good option:

import git
from datetime import datetime
from dateutil.relativedelta import relativedelta
from collections import defaultdict

def get_user_commits_by_month(repo_path):
    commit_data = defaultdict(lambda: defaultdict(int))  # Nested defaultdict for easier handling
    
    repo = git.Repo(repo_path)
    
    today = datetime.today()
    one_year_ago = today - relativedelta(years=1)

    # Use `since` to optimize commit retrieval
    commits = repo.iter_commits(since=one_year_ago.strftime("%Y-%m-%d"))

    for commit in commits:
        commit_date = commit.authored_datetime
        commit_month = commit_date.strftime('%Y-%m')  # Use YYYY-MM format for proper grouping
        commit_author = commit.author.name

        commit_data[commit_author][commit_month] += 1

    # Print results in a readable format
    print(f"{'Author':<20} {'Month':<10} {'Commits':<10}")
    print("-" * 40)
    
    for author in sorted(commit_data):
        for month in sorted(commit_data[author]):
            print(f"{author:<20} {month:<10} {commit_data[author][month]:<10}")

# Call the function with your repo path
repo_path = "path/to/repo"  # placeholder: replace with the path to your local repository
get_user_commits_by_month(repo_path)
