{"id":57374,"date":"2023-12-22T13:12:00","date_gmt":"2023-12-22T13:12:00","guid":{"rendered":"https:\/\/www.askpython.com\/?p=57374"},"modified":"2025-04-10T20:50:52","modified_gmt":"2025-04-10T20:50:52","slug":"gitpython-list-all-files","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python-modules\/gitpython-list-all-files","title":{"rendered":"Using GitPython to List All Files Affected by a Commit"},"content":{"rendered":"\n<p>Have you ever needed to see exactly which files were changed in a Git commit? When working on a project with a large and complex codebase, it can be invaluable to query commit history and analyze changes to specific files over time.<\/p>\n\n\n\n<p>In this post, we\u2019ll explore how to use the Python Git library&nbsp;<strong>GitPython<\/strong>&nbsp;to easily get a list of all files affected by a given commit. We\u2019ll look at practical examples using GitPython so you can directly apply this knowledge in your own projects. Let\u2019s dive in!<\/p>\n\n\n\n<p><strong><em>Also read: <a href=\"https:\/\/www.askpython.com\/resources\/learning-python-growing-programming-knowledge\" data-type=\"post\" data-id=\"53690\">Learning Python? Things You Should Know with Growing Programming Knowledge<\/a><\/em><\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"introduction-to-gitpython\">What is GitPython?<\/h2>\n\n\n\n<p><a href=\"https:\/\/gitpython.readthedocs.io\/\" target=\"_blank\" rel=\"noopener\">GitPython<\/a>&nbsp;is a powerful Python library that provides access to Git objects and repositories. 
It allows you to leverage Git functionality in your Python applications and scripts.<\/p>\n\n\n\n<p>Some key things you can do with GitPython:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Open local repositories and clone remote ones<\/li>\n\n\n\n<li>Inspect commits, trees, and blobs<\/li>\n\n\n\n<li>Traverse commits and branches<\/li>\n\n\n\n<li>Compare file changes between commits<\/li>\n\n\n\n<li>And more!<\/li>\n<\/ul>\n\n\n\n<p>By using GitPython, we avoid having to shell out to the Git command line ourselves and can work with repository data as Python objects. GitPython itself relies on the&nbsp;<code>git<\/code>&nbsp;executable, so Git needs to be installed on the machine running the script.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"installing-gitpython\">Installing GitPython<\/h2>\n\n\n\n<p><strong>GitPython can be installed via pip:<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\npip install GitPython\n<\/pre><\/div>\n\n\n<p><strong>Or <a href=\"https:\/\/www.askpython.com\/python\/examples\/check-anaconda-version-on-windows\" data-type=\"post\" data-id=\"53005\">conda<\/a>:<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nconda install -c conda-forge gitpython\n<\/pre><\/div>\n\n\n<p>Either command installs the latest version and its Python dependencies, which is all you need to start working with Git repos from Python.<\/p>\n\n\n\n<p><strong><em>Also read: <a href=\"https:\/\/www.askpython.com\/python\/conda-vs-pip\" data-type=\"post\" data-id=\"16381\">Conda vs Pip: Choosing your Python package manager<\/a><\/em><\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"setup-repo-access\">Set Up Repo Access<\/h2>\n\n\n\n<p>To open a Git repo, first import the&nbsp;<code>Repo<\/code>&nbsp;class:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom git import Repo\n\n<\/pre><\/div>\n\n\n<p>Then open a repository:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; 
title: ; notranslate\" title=\"\">\nrepo = Repo(&quot;\/path\/to\/repository&quot;)\n\n<\/pre><\/div>\n\n\n<p><code>Repo()<\/code>&nbsp;opens a repository that already exists on the local file system. To work with a remote repository, clone it to a local path first with&nbsp;<code>Repo.clone_from()<\/code>.<\/p>\n\n\n\n<p>Some examples:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Local repo\nrepo = Repo(&quot;\/users\/project\/code&quot;)\n\n# Clone remote into a local directory (the target path is required)\nrepo = Repo.clone_from(&quot;https:\/\/github.com\/user\/repo.git&quot;, &quot;\/path\/to\/clone&quot;)\n\n<\/pre><\/div>\n\n\n<p>This&nbsp;<code>repo<\/code>&nbsp;object gives us access to all GitPython methods and the underlying Git data.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"get-a-commit-object\">Get a Commit Object<\/h2>\n\n\n\n<p>Next we\u2019ll see how to grab a specific commit from the repository\u2019s history.<\/p>\n\n\n\n<p>GitPython has powerful commit traversal and querying abilities. Common ways to get commits:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# From branch tip\nhead_commit = repo.head.commit\n\n# By SHA hex string\ncommit = repo.commit(&quot;0737db7&quot;)\n\n# By branch or tag name\ncommit = repo.commit(&quot;my-feature-branch&quot;)\n\n<\/pre><\/div>\n\n\n<p>To iterate through commits, use&nbsp;<code>repo.iter_commits()<\/code>, which accepts a ref name and options such as&nbsp;<code>max_count<\/code>.<\/p>\n\n\n\n<p>For our example here, we\u2019ll just grab the head commit of our repo:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ncommit = repo.head.commit\nprint(repr(commit))\n\n# &lt;git.Commit &quot;5d466f4a3ca9995eb7e3ac36e27e4c0872d6b3b6&quot;&gt;\n\n<\/pre><\/div>\n\n\n<p>This gives us a commit object we can now inspect.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"list-changed-files\">List Changed Files using GitPython<\/h2>\n\n\n\n<p>So we have our&nbsp;<code>commit<\/code>&nbsp;instance ready. 
Next up we can get the files that were changed in this commit with&nbsp;<code>commit.stats.files<\/code>. It is a dictionary keyed by file path, so converting it to a list gives us the paths:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# stats.files maps each changed path to its line counts\nfiles = list(commit.stats.files)\n\nprint(files)\n# &#x5B;&#039;file1.py&#039;, &#039;scripts\/helper.py&#039;, &#039;README.md&#039;]\n\n<\/pre><\/div>\n\n\n<p>Behind the scenes, this asks Git for a&nbsp;<code>--numstat<\/code>&nbsp;diff between the commit and its first parent commit, and collects every affected path along with its line counts (<code>insertions<\/code>,&nbsp;<code>deletions<\/code>, and&nbsp;<code>lines<\/code>).<\/p>\n\n\n\n<p>This includes files that were:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Modified<\/li>\n\n\n\n<li>Added<\/li>\n\n\n\n<li>Renamed<\/li>\n\n\n\n<li>Copied<\/li>\n\n\n\n<li>Or deleted<\/li>\n<\/ul>\n\n\n\n<p>So it gives us the full set of file changes in that commit.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"handling-renames-and-deletions\">Handling Renames and Deletions<\/h2>\n\n\n\n<p>The file stats only map paths to line counts, so they do not say whether a path was renamed or deleted. Two special cases are worth noting:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Renamed files<\/strong>&nbsp;are not flagged as renames in the stats, and how they appear depends on whether rename detection is active in the underlying diff. To see a rename explicitly, diff the commit against its parent: if&nbsp;<code>helper.py<\/code>&nbsp;was renamed to&nbsp;<code>helper_module.py<\/code>, the diff entry has change type&nbsp;<code>R<\/code>, with&nbsp;<code>a_path<\/code>&nbsp;set to the old name and&nbsp;<code>b_path<\/code>&nbsp;set to&nbsp;<code>helper_module.py<\/code>.<\/li>\n\n\n\n<li><strong>Deleted files<\/strong>&nbsp;are included in the stats like any other changed path, but nothing marks them as deleted. 
To list only the deleted files, diff the commit against its first parent and filter by change type:<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Assumes the commit has at least one parent\ndiff = commit.parents&#x5B;0].diff(commit)\ndeletes = &#x5B;d.a_path for d in diff.iter_change_type(&quot;D&quot;)]\n\n# &#x5B;&#039;old_script.py&#039;]\n<\/pre><\/div>\n\n\n<p>So the keys of&nbsp;<code>commit.stats.files<\/code>&nbsp;give us the complete set of changed paths, and the&nbsp;<code>deletes<\/code>&nbsp;list tells us which of them were removed in the commit.<\/p>\n\n\n\n<p><strong><em>Also read: <a href=\"https:\/\/www.askpython.com\/python\/examples\/rename-a-file-directory-python\" data-type=\"post\" data-id=\"6465\">How to Rename a File\/Directory in Python?<\/a><\/em><\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"filtering-by-change-type\">Filtering By Change Type<\/h2>\n\n\n\n<p>When you retrieve the changed files for a commit, they include files with any type of change:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Additions<\/li>\n\n\n\n<li>Modifications<\/li>\n\n\n\n<li>Copies<\/li>\n\n\n\n<li>Renames<\/li>\n\n\n\n<li>Deletions<\/li>\n<\/ul>\n\n\n\n<p>You can also filter to only files that were added, modified, etc. by diffing the commit against its parent and selecting a change type.<\/p>\n\n\n\n<p>For example, to only see&nbsp;<em>modified<\/em>&nbsp;files:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Assumes the commit has at least one parent\ndiff = commit.parents&#x5B;0].diff(commit)\nmodified_files = &#x5B;d.b_path for d in diff.iter_change_type(&quot;M&quot;)]\n\nprint(modified_files)\n# &#x5B;&#039;helper.py&#039;, &#039;README.md&#039;]\n<\/pre><\/div>\n\n\n<p>Options for filtering by change type:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>A<\/code>&nbsp;&#8211; Added (new) files<\/li>\n\n\n\n<li><code>C<\/code>&nbsp;&#8211; Copied files<\/li>\n\n\n\n<li><code>D<\/code>&nbsp;&#8211; Deleted files<\/li>\n\n\n\n<li><code>M<\/code>&nbsp;&#8211; Modified files<\/li>\n\n\n\n<li><code>R<\/code>&nbsp;&#8211; Renamed files<\/li>\n<\/ul>\n\n\n\n<p>This can be useful for reviewing changes focused on one change type.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"performance-considerations\">Performance 
Considerations when Using GitPython<\/h2>\n\n\n\n<p>Calling&nbsp;<code>commit.stats.files<\/code>&nbsp;is very convenient. But under the hood, Git has to diff the contents of every changed file to produce the line counts.<\/p>\n\n\n\n<p>For large commits, or when walking a long history, getting full file stats can take a while.<\/p>\n\n\n\n<p>If you&nbsp;<em>only need the file names<\/em>, there is a lighter alternative: use&nbsp;<code>git show<\/code>&nbsp;with&nbsp;<code>--name-only<\/code>&nbsp;and an empty format string to suppress the commit header:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nout = repo.git.show(commit.hexsha, name_only=True, format=&quot;&quot;)\ncommit_files = &#x5B;line for line in out.splitlines() if line]\n<\/pre><\/div>\n\n\n<p>This only asks Git for the file names and skips the per-file line counts, so it is usually quicker than computing full stats.<\/p>\n\n\n\n<p>The tradeoff is that you only get the names, without the line counts that&nbsp;<code>commit.stats.files<\/code>&nbsp;provides and without any change-type information.<\/p>\n\n\n\n<p>But when iterating through hundreds of commits, for example, it can make your script noticeably faster!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"full-script-example\">Full Script Example for GitPython<\/h2>\n\n\n\n<p>Let\u2019s walk through a full script to solidify the concepts:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom git import Repo\n\n# Open repository\nrepo = Repo(&quot;\/path\/to\/my\/repo&quot;)\n\nprint(&quot;Getting last week&#039;s commits...&quot;)\n\n# Get all commits in last week (the date string is parsed by Git)\ncommits = repo.iter_commits(since=&quot;1.week.ago&quot;)\n\nfor commit in commits:\n    print(f&quot;Commit: {commit.hexsha&#x5B;:9]}, Author: {commit.author.name}&quot;)\n    \n    # Get files changed in commit (keys of the stats dict)\n    files = list(commit.stats.files)\n    
print(f&quot;Changed files: {files}&quot;)\n    \n    # Diff against the first parent; root commits have none, so skip them\n    if not commit.parents:\n        continue\n    diff = commit.parents&#x5B;0].diff(commit)\n    \n    # Get deleted files\n    deletes = &#x5B;d.a_path for d in diff.iter_change_type(&quot;D&quot;)]\n    print(f&quot;Deleted files: {deletes}&quot;)\n    \n    # Filter only modified\n    modified = &#x5B;d.b_path for d in diff.iter_change_type(&quot;M&quot;)]\n    print(f&quot;Modified files: {modified}&quot;)\n    print()\n    \nprint(&quot;Script complete!&quot;)\n\n<\/pre><\/div>\n\n\n<p><strong>Running this would print output like:<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nGetting last week&#039;s commits...\nCommit: 467de124e, Author: John\nChanged files: &#x5B;&#039;helper.py&#039;, &#039;scripts\/process.py&#039;]\nDeleted files: &#x5B;]\nModified files: &#x5B;&#039;helper.py&#039;]\n\nCommit: 9b1aff2fc, Author: Sarah\nChanged files: &#x5B;&#039;README.md&#039;, &#039;docs\/quickstart.md&#039;]\nDeleted files: &#x5B;]\nModified files: &#x5B;&#039;README.md&#039;, &#039;docs\/quickstart.md&#039;]\n\nScript complete!\n\n<\/pre><\/div>\n\n\n<p>So here we:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Found commits from the last week<\/li>\n\n\n\n<li>Printed commit SHA and author name<\/li>\n\n\n\n<li>Listed all changed files<\/li>\n\n\n\n<li>Checked for deleted files<\/li>\n\n\n\n<li>Filtered to only display modified files<\/li>\n<\/ul>\n\n\n\n<p>This demonstrates how the GitPython techniques covered here can be combined to analyze commits and the files they changed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"recap\">Summary<\/h2>\n\n\n\n<p>Let\u2019s recap what we covered:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>GitPython<\/strong>&nbsp;provides Python access to Git repos<\/li>\n\n\n\n<li>Easily get commit objects to traverse history<\/li>\n\n\n\n<li>Use\u00a0commit.stats.files\u00a0to list all changed files<\/li>\n\n\n\n<li>Check for deleted files by filtering the diff for change type\u00a0D<\/li>\n\n\n\n<li>Filter file changes by type 
like\u00a0modified<\/li>\n\n\n\n<li>Option to use faster\u00a0git show\u00a0to just get names<\/li>\n<\/ul>\n\n\n\n<p>With these GitPython building blocks, you can gain powerful insights into commit histories and file changes in your Git repositories.<\/p>\n\n\n\n<p>Whether it\u2019s reviewing recent commits on a branch, analyzing diffs for a troublesome merge, or extracting commit metadata &#8211; GitPython has you covered! <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Have you ever needed to see exactly which files were changed in a Git commit? When working on a project with a large and complex codebase, it can be invaluable to query commit history and analyze changes to specific files over time. In this post, we\u2019ll explore how to use the Python Git library&nbsp;GitPython&nbsp;to easily [&hellip;]<\/p>\n","protected":false},"author":77,"featured_media":64144,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-57374","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python-modules"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/57374","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/77"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=57374"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/57374\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/64144"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=57374"}],"wp:term":[{"t
axonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=57374"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=57374"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}