Comparing changes

base repository: git/git
base: ad1260b6c994f7c0f9c259bd39f39979f7f4ecc2
...
head repository: git/git
compare: 596b5e77c960cc57ad2e68407b298411ec5e8cb8
  • 8 commits
  • 11 files changed
  • 3 contributors

Commits on Nov 3, 2021

  1. test-genzeros: allow more than 2G zeros in Windows

    d5cfd14 (tests: teach the test-tool to generate NUL bytes and
    use it, 2019-02-14) added a way to generate zeroes portably
    without using /dev/zero (needed by HP NonStop), but it uses a
    long variable, which is limited to 2^31 - 1 on Windows.
    
    Use a (POSIX/C99) intmax_t instead, which is at least 64 bits
    wide on 64-bit Windows, so that a future test can use it.
    
    Signed-off-by: Carlo Marcelo Arenas Belón <[email protected]>
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    carenas authored and gitster committed Nov 3, 2021
    cbc985a
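The effect of switching the counter type can be sketched as follows. This is a minimal illustration, not the test-tool source: `parse_count` is a hypothetical helper, and the only library call it relies on is the C99 `strtoimax()` from `<inttypes.h>`.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>

/*
 * Hypothetical helper (not from git.git): parse a byte count into
 * intmax_t instead of long, so that values above 2^31 - 1 survive
 * on platforms where long is only 32 bits wide.
 * Returns 0 on success, -1 if the input is not a number or is out
 * of range.
 */
int parse_count(const char *arg, intmax_t *count)
{
	char *end;

	assert(arg && count);
	errno = 0;
	*count = strtoimax(arg, &end, 0);
	if (errno == ERANGE || end == arg || *end != '\0')
		return -1;
	return 0;
}
```

Called with `"5368709120"` (5GB), this stores the full value; parsing the same string with `strtol()` would saturate at `LONG_MAX` wherever `long` is 32 bits wide.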
  2. test-tool genzeros: generate large amounts of data more efficiently

    In this developer's tests, producing one gigabyte worth of NULs in a
    busy loop that writes out individual bytes, unbuffered, took ~27sec.
    Writing chunked 256kB buffers instead took only ~0.6sec.
    
    This matters because we are about to introduce a pair of test cases that
    want to be able to produce 5GB of NULs, and we cannot use `/dev/zero`
    because of the HP NonStop platform's lack of support for that device.
    
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    dscho authored and gitster committed Nov 3, 2021
    df7000c
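The chunked approach described in this commit message might look roughly like the sketch below. It is an illustration under assumptions, not the actual `genzeros` code: `write_zeros` is a hypothetical name, and it writes through stdio's `fwrite()` rather than whatever write wrapper the test-tool uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical sketch (not the test-tool source): write `count` NUL
 * bytes to `out` in 256kB chunks instead of one byte at a time.
 * Returns the number of bytes written, which is less than `count`
 * only if a write fails.
 */
intmax_t write_zeros(FILE *out, intmax_t count)
{
	/* static storage, so the buffer is zero-initialized */
	static const char zeros[256 * 1024];
	intmax_t written = 0;

	assert(out);
	while (written < count) {
		intmax_t left = count - written;
		size_t chunk = left < (intmax_t)sizeof(zeros) ?
			(size_t)left : sizeof(zeros);
		size_t n = fwrite(zeros, 1, chunk, out);

		written += (intmax_t)n;
		if (n < chunk)
			break; /* short write: report what was written */
	}
	return written;
}
```

The last chunk is trimmed to the remaining byte count, so the total is exact even when `count` is not a multiple of the buffer size.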
  3. test-lib: add prerequisite for 64-bit platforms

    Allow tests that assume a 64-bit `size_t` to be skipped on 32-bit
    platforms, regardless of the size of `long`.
    
    This imitates the `LONG_IS_64BIT` prerequisite.
    
    Signed-off-by: Carlo Marcelo Arenas Belón <[email protected]>
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    carenas authored and gitster committed Nov 3, 2021
    970fa57
  4. t1051: introduce a smudge filter test for extremely large files

    The filter system allows for alterations to file contents when they're
    added to the database or working tree. ("Smudge" when moving to the
    working tree; "clean" when moving to the database.) This is used
    natively to handle CRLF to LF conversions. It's also employed by Git-LFS
    to replace large files from the working tree with small tracking files
    in the repo and vice versa.
    
    Git reads the entire smudged file into memory to convert it into a
    "clean" form to be used in-core. While this is inefficient, there's a
    more insidious problem on some platforms due to inconsistency between
    using unsigned long and size_t for the same type of data (size of a file
    in bytes). On most 64-bit platforms, unsigned long is 64 bits, and
    size_t is typedef'd to unsigned long. On Windows, however, unsigned long
    is only 32 bits (and therefore on 64-bit Windows, size_t is typedef'd to
    unsigned long long in order to be 64 bits).
    
    Practically speaking, this means 64-bit Windows users of Git-LFS can't
    handle files larger than 2^32 bytes. Other 64-bit platforms don't suffer
    this limitation.
    
    This commit introduces a test exposing the issue; future commits make it
    pass. The test simulates the way Git-LFS works by having a tiny file
    checked into the repository and expanding it to a huge file on checkout.
    
    Helped-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Matt Cooper <[email protected]>
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    vtbassmatt authored and gitster committed Nov 3, 2021
    b79541a
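The narrowing problem described in this commit message can be reproduced on any platform with fixed-width types. In the sketch below, `uint32_t` stands in for the 32-bit `unsigned long` of 64-bit Windows and `uint64_t` for its 64-bit `size_t`; both function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration: what happens to a file size when it is
 * passed through a 32-bit type. The unsigned conversion keeps only
 * the low 32 bits and silently drops the rest.
 */
uint32_t narrow_like_llp64(uint64_t size)
{
	return (uint32_t)size;
}

/* A 5GB size (2^32 + 2^30 bytes) comes out as 2^30 bytes, i.e. 1GB. */
void check_5gb_example(void)
{
	assert(narrow_like_llp64(UINT64_C(5) << 30) == (UINT32_C(1) << 30));
}
```

No error is raised at the point of conversion, which is why the symptom only shows up later as a file that is too short.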
  5. odb: teach read_blob_entry to use size_t

    There is mixed use of size_t and unsigned long to deal with sizes in the
    codebase. Recall that Windows defines unsigned long as 32 bits even on
    64-bit platforms, meaning that converting size_t to unsigned long narrows
    the range. This mostly doesn't cause a problem since Git rarely deals
    with files larger than 2^32 bytes.
    
    But adjunct systems such as Git LFS, which use smudge/clean filters to
    keep huge files out of the repository, may have huge file contents passed
    through some of the functions in entry.c and convert.c. On Windows, this
    results in a truncated file being written to the workdir. I traced this to
    one specific use of unsigned long in write_entry (and a similar instance
    in write_pc_item_to_fd for parallel checkout). That appeared to be for
    the call to read_blob_entry, which expects a pointer to unsigned long.
    
    By altering the signature of read_blob_entry to expect a size_t,
    write_entry can be switched to use size_t internally (which all of its
    callers and most of its callees already used). To avoid touching dozens of
    additional files, read_blob_entry uses a local unsigned long to call a
    chain of functions which aren't prepared to accept size_t.
    
    Helped-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Matt Cooper <[email protected]>
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    vtbassmatt authored and gitster committed Nov 3, 2021
    e9aa762
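The pattern described in this commit message, a `size_t` out-parameter backed by a local `unsigned long`, can be sketched as follows. The names `legacy_read` and `read_entry_sketch` are hypothetical stand-ins, not functions from git.git; the real `read_blob_entry` operates on index entries and the object database.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an older API that reports sizes as unsigned long. */
static void *legacy_read(const char *src, unsigned long *size)
{
	size_t len = strlen(src);
	void *buf = malloc(len + 1);

	if (!buf)
		return NULL;
	memcpy(buf, src, len + 1);
	*size = (unsigned long)len;
	return buf;
}

/*
 * Hypothetical wrapper: callers see size_t, and the narrower type
 * stays confined to one local variable.
 */
void *read_entry_sketch(const char *src, size_t *size)
{
	unsigned long ul;
	void *buf;

	assert(src && size);
	buf = legacy_read(src, &ul);
	if (buf)
		*size = ul; /* unsigned long to size_t never loses bits here */
	return buf;
}
```

This keeps the caller's own arithmetic in `size_t`, but it does not widen what the legacy function can report; that limit remains until the inner API is converted as well.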
  6. git-compat-util: introduce more size_t helpers

    We will use them in the next commit.
    
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    dscho authored and gitster committed Nov 3, 2021
    e2ffeae
  7. odb: guard against data loss checking out a huge file

    This introduces an additional guard for platforms where `unsigned long`
    and `size_t` are not of the same size. If the size of an object in the
    database would overflow `unsigned long`, instead we now exit with an
    error.
    
    A complete fix will have to update _many_ other functions throughout the
    codebase to use `size_t` instead of `unsigned long`. It will have to be
    implemented at some stage.
    
    This commit puts in a stop-gap for the time being.
    
    Helped-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Matt Cooper <[email protected]>
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    vtbassmatt authored and gitster committed Nov 3, 2021
    d6a09e7
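A guard of the kind this commit message describes can be sketched as a checked narrowing conversion. The helper below is hypothetical and only illustrates the idea: `uint32_t` models a 32-bit `unsigned long`, and the function reports failure through its return value, whereas the commit message says Git exits with an error.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: narrow a 64-bit size to 32 bits only if no
 * bits are lost. Returns 0 and stores the value on success, -1 (and
 * leaves *out untouched) if the value does not fit.
 */
int narrow_checked(uint64_t size, uint32_t *out)
{
	assert(out);
	if (size != (uint64_t)(uint32_t)size)
		return -1;
	*out = (uint32_t)size;
	return 0;
}
```

Comparing the value with its own round trip through the narrower type avoids hard-coding a limit, so the same check works whatever the width of the target type.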
  8. clean/smudge: allow clean filters to process extremely large files

    The filter system allows for alterations to file contents when they're
    moved between the database and the worktree. We already made sure that
    it is possible for smudge filters to produce contents that are larger
    than `unsigned long` can represent (which matters on systems where
    `unsigned long` is narrower than `size_t`, most notably 64-bit Windows).
    Now we make sure that clean filters can _consume_ contents that are
    larger than that.
    
    Note that this commit only allows clean filters' _input_ to be larger
    than can be represented by `unsigned long`.
    
    This change makes only a very minute dent in the much larger project
    to teach Git to use `size_t` instead of `unsigned long` wherever
    appropriate.
    
    Helped-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Matt Cooper <[email protected]>
    Signed-off-by: Johannes Schindelin <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
    vtbassmatt authored and gitster committed Nov 3, 2021
    596b5e7