7

I want to download a single file from my Git repository using Python.

Currently I am using the GitPython library. Cloning works fine with the code below, but I don't want to download the entire repository.

import os
from git import Repo
git_url = '[email protected]:/home2/git/stack.git'
repo_dir = '/root/gitrepo/'
if __name__ == "__main__":
    Repo.clone_from(git_url, repo_dir, branch='master', bare=True)
    print("OK")
3
  • What kind of file? Which os? Path of file? Commented Jul 9, 2018 at 6:38
  • Go with git archive --remote. Commented Jul 9, 2018 at 9:51
  • @ShashankSingh: any C or C++ source file, on Windows. Path: master/code/repo/ Commented Jul 9, 2018 at 12:30

5 Answers

4

Don't think of a Git repo as a collection of files, but as a collection of snapshots. Git doesn't let you select which files you download, but it does let you select how many snapshots you download:

git clone [email protected]:/home2/git/stack.git

will download all snapshots for all files, while

git clone --depth 1 [email protected]:/home2/git/stack.git

will only download the latest snapshot of all files. You will still download all files, but at least leave out all of their history.

Of these files you can simply select the one you want, and delete the rest:

import os
import git
import shutil
import tempfile

# Create temporary dir
t = tempfile.mkdtemp()
# Clone into temporary dir
git.Repo.clone_from('[email protected]:/home2/git/stack.git', t, branch='master', depth=1)
# Copy desired file from temporary dir
shutil.move(os.path.join(t, 'setup.py'), '.')
# Remove temporary dir
shutil.rmtree(t)

5 Comments

It's a collection of snapshots, not changesets. In one sense this does not matter, but in others it does, and since Git lets the implementation show (shine?) through, it matters when using Git.
Ok, I have changed the wording
Is there any Git command for downloading a single file without a script?
No. Even the command git archive --remote (which isn't available in gitpython) requires un-tar-ing the output.
It's true that Git won't let you do this, but GitHub and Bitbucket do. See stackoverflow.com/a/4605068/733092
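For what it's worth, the git archive --remote route mentioned in the comments can be scripted with the standard library alone. Below is a minimal sketch, assuming the server permits remote archive requests (some hosts disable them) and that git is on the PATH. The function names extract_member and fetch_single_file are made up for illustration and are not part of GitPython.

```python
import io
import subprocess
import tarfile


def extract_member(tar_bytes: bytes, member_path: str) -> bytes:
    """Return the contents of one file from an in-memory tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as archive:
        member = archive.extractfile(member_path)
        if member is None:
            raise ValueError(f"{member_path} is not a regular file")
        return member.read()


def fetch_single_file(remote: str, ref: str, file_path: str) -> bytes:
    """Ask the remote for a tar archive that contains only file_path."""
    result = subprocess.run(
        ["git", "archive", f"--remote={remote}", ref, file_path],
        capture_output=True,
        check=True,  # raise CalledProcessError if git exits non-zero
    )
    return extract_member(result.stdout, file_path)
```

Calling fetch_single_file(git_url, 'master', 'code/repo/some_file.c') would return the file's bytes, where git_url is the repository URL from the question and the file name is a placeholder. Only the requested path travels over the wire, so nothing has to be cloned or cleaned up afterwards.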
3

You can also use the subprocess module in Python:

import subprocess

args = ['git', 'clone', '--depth=1', '[email protected]:/home2/git/stack.git']
# Capture both streams; git writes progress and errors to stderr
res = subprocess.run(args, capture_output=True, text=True)

# A non-zero exit code signals failure
if res.returncode == 0:
    print(res.stdout)
else:
    print(res.stderr)

However, your main problem remains.

Git does not support downloading parts of the repository. You have to download all of it. But you should be able to do this with GitHub. Reference


2

You can use this function to download the content of a single file from a specific branch of a GitHub repository. Apart from the standard library's base64 module, it needs only the requests library.

import base64

import requests


def download_single_file(
    repo_owner: str,
    repo_name: str,
    access_token: str,
    file_path: str,
    branch: str = "main",
    destination_path: str = None,
):
    if destination_path is None:
        destination_path = "./" + file_path

    url = f"https://api.github.com/repos/{repo_owner}/{repo_name}/contents/{file_path}?ref={branch}"

    # Set the headers with the access token and the JSON media type
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {access_token}",
    }

    # Send a GET request to the API endpoint
    response = requests.get(url, headers=headers)

    # Check if the request was successful
    if response.status_code == 200:
        # Get the content data from the response
        content_data = response.json()

        # Extract the content and decode it from base64
        content_base64 = content_data.get("content")
        content_bytes = base64.b64decode(content_base64)

        # Save the raw bytes to the local destination path; binary mode
        # also works for files that are not valid UTF-8
        with open(destination_path, "wb") as file:
            file.write(content_bytes)

        print("File downloaded successfully.")
    else:
        print(
            f"Request failed with status {response.status_code}. Check the repository owner, repository name, access token, file path, and branch."
        )
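To make the decoding step easier to follow, here is a small offline sketch of how the JSON payload from the contents endpoint can be handled. The helper name decode_contents_payload is made up for illustration. It relies on the payload for a file being a JSON object with "encoding" and "content" fields, while a directory path returns a list instead.

```python
import base64


def decode_contents_payload(payload) -> bytes:
    """Turn a contents-API payload for a single file into raw bytes."""
    if isinstance(payload, list):
        # A list means the path pointed at a directory, not a file
        raise ValueError("path refers to a directory")
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # The base64 text contains line breaks; b64decode skips them by default
    return base64.b64decode(payload["content"])


# Hand-written sample payload, not a real API response
sample = {"encoding": "base64", "content": "aGVsbG8K\n"}
print(decode_contents_payload(sample))
```

Checking the encoding field before decoding gives a clearer error than letting b64decode run on an empty or missing content string.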


0

You need to request the raw version of the file! You can get it from raw.github.com

2 Comments

I guess he never said GitHub.
My bad then, I thought it was GitHub.
0

I don't want to flag this as a direct duplicate, since it does not fully reflect the scope of this question, but part of what Lucifer said in his answer seems the way to go, according to this SO post. In short, Git does not allow for a partial download, but certain providers (like GitHub) do, via raw content.
That being said, Python does provide quite a number of libraries for downloading files, with the best-known being urllib.request.
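As a rough illustration of the raw-content route with urllib.request, the sketch below assumes a public GitHub repository and the raw.githubusercontent.com URL layout of owner/repo/branch/path. The names raw_url and download_raw are made up for this example, and a private repository would additionally need authentication that is not shown here.

```python
import urllib.request
from urllib.parse import quote


def raw_url(owner: str, repo: str, branch: str, file_path: str) -> str:
    """Build the raw-content URL for one file (assumed public repo)."""
    # quote() keeps "/" intact but escapes spaces and similar characters
    return (
        "https://raw.githubusercontent.com/"
        f"{owner}/{repo}/{quote(branch)}/{quote(file_path)}"
    )


def download_raw(owner: str, repo: str, branch: str,
                 file_path: str, destination: str) -> None:
    """Download a single file and write it to destination as bytes."""
    url = raw_url(owner, repo, branch, file_path)
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        out.write(response.read())
```

This only helps when the repository is hosted by a provider that serves raw files over HTTP; for a plain SSH remote like the one in the question, it does not apply.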
