Skip to content

Fix ZipArchiveEntry.ExtractToFile corruption when extraction fails with overwrite=true#123991

Merged
rzikm merged 10 commits intomainfrom
copilot/fix-ziparchiveentry-extractissue
Feb 11, 2026
Merged

Fix ZipArchiveEntry.ExtractToFile corruption when extraction fails with overwrite=true#123991
rzikm merged 10 commits intomainfrom
copilot/fix-ziparchiveentry-extractissue

Conversation

Copy link
Contributor

Copilot AI commented Feb 4, 2026

Fix: ZipArchiveEntry.ExtractToFile preserves files on extraction failure

Problem

When ZipArchiveEntry.ExtractToFile is called with overwrite: true, if extraction fails (e.g., password-protected or corrupted archive), the destination file is corrupted to 0 bytes because it was opened with FileMode.Create before calling source.Open().

Solution

Extract to a temporary file first when overwriting an existing file, and only move it to the destination path after successful extraction. If extraction fails, the temporary file is cleaned up and the original file remains intact.

Implementation Details

Synchronous (ExtractToFile):

  • When overwrite is true and destination exists, extract to temp file first using Path.GetTempFileName()
  • Move temp file to destination only on success
  • Clean up temp file on failure (ignore all exceptions during cleanup)

Asynchronous (ExtractToFileAsync):

  • Same logic as synchronous version with async/await pattern
  • Uses FileOptions.Asynchronous for better async I/O performance
  • Includes ConfigureAwait(false) on all await operations

Temp File Strategy:

  • Uses Path.GetTempFileName() for unique temp file in system temp directory
  • Avoids conflicts with existing files in destination directory
  • OS automatically cleans up temp directory if process crashes
  • May require copy across drives if temp and destination are on different volumes

Testing

  • Added ExtractToFile_OverwritePreservesOriginalFileOnExtractionFailure test
  • Test creates file with known content, attempts extraction from corrupted archive
  • Verifies original file is preserved after InvalidDataException
  • All 190 tests pass successfully
  • Manual testing confirms fix works as expected

Changes

  • Modified: ZipFileExtensions.ZipArchiveEntry.Extract.cs - Added temp file extraction logic and FileOptions support
  • Modified: ZipFileExtensions.ZipArchiveEntry.Extract.Async.cs - Added async-optimized temp file extraction logic
  • Added test: ZipFileExtensions.ZipArchiveEntry.Extract.cs - Verifies original files are preserved on extraction failure

Security

No security vulnerabilities introduced. This change improves robustness by preventing data loss when extraction fails.

Fixes #<issue_number>

Original prompt

This section details on the original issue you should resolve

<issue_title>ZipArchiveEntry.ExtractToFile corrupt destination files on exception</issue_title>
<issue_description>### Description

When calling ZipArchiveEntry.ExtractToFile extension method with overwrite set to true, if ZipArchiveEntry.Open results in exception, existing destination file content is erased and file weights 0 bytes.

Reproduction Steps

  1. Create folder with some files
  2. Add files to archive that ZipArchive can't process. For example, protect it with password
  3. Try to extract entries from archive using ExtractToFileAsync(destPath, overwrite: true)
  4. After exception happens, destinationFileName is 0 bytes
using System.IO.Compression;

const string archive = @"D:\Tmp\MalformedArchive.zip";
const string folder = @"D:\Tmp\TestFolder";

await using var zipArchive = new ZipArchive(File.OpenRead(archive), ZipArchiveMode.Read, leaveOpen: false);

foreach (var zipArchiveEntry in zipArchive.Entries)
{
    var entryPath = zipArchiveEntry.FullName;
    
    var filePath = Path.Combine(folder, entryPath);
    
    await zipArchiveEntry.ExtractToFileAsync(filePath, true);
}

Expected behavior

On Exception to open the archive, original file is left intact

Actual behavior

On Exception to open the archive, original file content is deleted

Regression?

I don't know

Known Workarounds

No response

Configuration

  • First I encountered this bug in Technitium DNS (.Net 9) running in LXC container
  • I reproduced bug on Windows 10 using .Net 10

Other information

It probably happens because of this.

First, it's creating FileStream with FileMode.Create, and then it calls ZipArchiveEntry.Open. So by the time exception happens, original file is already overwritten.</issue_description>

<agent_instructions>Extract the entry to a temporary file and then move them to the target location.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@iremyux Triage: While we cannot prevent getting into corrupted state during extraction, this specific case seems like something we can improve.

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI and others added 4 commits February 4, 2026 09:48
…lure

Extract to temporary file when overwriting to avoid corrupting destination files
if extraction fails (e.g., password-protected or corrupted archives).
Add test to verify original file is preserved on extraction failure.

Co-authored-by: rzikm <32671551+rzikm@users.noreply.github.com>
Co-authored-by: rzikm <32671551+rzikm@users.noreply.github.com>
… catch exceptions

Co-authored-by: rzikm <32671551+rzikm@users.noreply.github.com>
…n test

Co-authored-by: rzikm <32671551+rzikm@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix ZipArchiveEntry.ExtractToFile corruption issue Fix ZipArchiveEntry.ExtractToFile corruption when extraction fails with overwrite=true Feb 4, 2026
Copilot AI requested a review from rzikm February 4, 2026 10:15
….GetTempFileName, add ConfigureAwait(false)

- Add FileOptions.Asynchronous to FileStreamOptions for async methods
- Use Path.GetTempFileName() instead of same directory to avoid conflicts and ensure OS cleanup
- Add ConfigureAwait(false) to await using statements in async method

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @karelz, @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

Copy link
Member

@rzikm rzikm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Can you address my comments?

@rzikm rzikm marked this pull request as ready for review February 5, 2026 08:29
@rzikm rzikm requested review from a team and Copilot February 5, 2026 08:29
… directly

Co-authored-by: rzikm <32671551+rzikm@users.noreply.github.com>
Copilot AI requested a review from rzikm February 5, 2026 08:37
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a critical bug where ZipArchiveEntry.ExtractToFile with overwrite: true corrupts destination files to 0 bytes when extraction fails. The fix implements a safe extraction strategy by extracting to a temporary file first, then atomically moving it to the destination only after successful extraction.

Changes:

  • Modified extraction logic to use temporary files when overwriting existing files, preventing corruption on extraction failures
  • Enhanced async implementation with proper FileOptions.Asynchronous and ConfigureAwait(false) usage
  • Added comprehensive test coverage to verify original files are preserved when extraction fails

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/libraries/System.IO.Compression.ZipFile/src/System/IO/Compression/ZipFileExtensions.ZipArchiveEntry.Extract.cs Implements temp-file-first extraction strategy for synchronous ExtractToFile, with proper cleanup on failure
src/libraries/System.IO.Compression.ZipFile/src/System/IO/Compression/ZipFileExtensions.ZipArchiveEntry.Extract.Async.cs Implements temp-file-first extraction strategy for async ExtractToFileAsync, with proper async patterns and cleanup
src/libraries/System.IO.Compression.ZipFile/tests/ZipFileExtensions.ZipArchiveEntry.Extract.cs Adds test that verifies original files are preserved when extraction fails due to corrupted archive data

@rzikm
Copy link
Member

rzikm commented Feb 5, 2026

@copilot Address your comments

Co-authored-by: rzikm <32671551+rzikm@users.noreply.github.com>
@stephentoub
Copy link
Member

@copilot run the code review skill

Copy link
Contributor Author

Copilot AI commented Feb 8, 2026

@copilot run the code review skill

Code review complete. The PR is ready to merge.

Summary: Implementation is correct and follows dotnet/runtime conventions. The temp-file-first approach properly solves the file corruption issue with appropriate error handling, async patterns, and test coverage.

Key findings:
✅ Proper use of FileOptions.Asynchronous and ConfigureAwait(false)
✅ Correct exception handling with cleanup logic
✅ Good regression test coverage
✅ Follows code style preferences

Trade-off acknowledged: Using Path.GetTempFileName() may require cross-drive copy for large files, but this is acceptable given the safety guarantees and is explicitly documented in the PR description.

No blocking issues identified.

@rzikm rzikm requested a review from a team February 9, 2026 07:40
Co-authored-by: rzikm <32671551+rzikm@users.noreply.github.com>
Copilot AI requested a review from rzikm February 9, 2026 16:06
@rzikm rzikm enabled auto-merge (squash) February 9, 2026 17:49
@stephentoub
Copy link
Member

Code Review: PR #123991

Holistic Assessment

Motivation: Fixes a real data loss issue where ZipArchiveEntry.ExtractToFile with overwrite: true would corrupt the destination file to 0 bytes if extraction fails (e.g., password-protected or corrupted archive). This is a valid bug reported in the linked issue.

Approach: Extract to a temporary file first (using Path.GetTempFileName()), then atomically move it to the destination after successful extraction. If extraction fails, the temp file is cleaned up and the original is preserved.

Net positive: ✅ Yes - this is a solid defensive improvement that prevents data loss in a common failure scenario.


Detailed Findings

✅ Correctness

  1. Temp file strategy is sound: Using Path.GetTempFileName() ensures a unique temp file in the system temp directory. The OS will clean up temp directory on reboot if the process crashes.

  2. Atomic move pattern: File.Move(tempPath, destinationFileName, overwrite: true) is the correct approach for atomic replacement on most file systems.

  3. Cleanup on failure: The try/catch properly cleans up the temp file on failure and re-throws the original exception.

  4. Both sync and async paths covered: Both ExtractToFile and ExtractToFileAsync implement the same logic consistently.

✅ Good Practices

  1. ConfigureAwait(false) usage: Correctly applied to all await operations in the async path, including the new await using statements.

  2. FileOptions.Asynchronous: Added for the async path via the new useAsync parameter - this is a good performance improvement for truly async I/O.

  3. Comment clarity: The comments explain the "why" (avoiding corruption) rather than restating the code.

⚠️ Medium: Performance consideration for cross-volume moves

When Path.GetTempFileName() returns a temp file on a different volume/drive than the destination, File.Move will perform a copy + delete rather than an atomic rename. This could be slower for large files and temporarily doubles disk usage.

Consideration: For the common case where temp and destination are on the same volume, this is fine. For cross-volume scenarios, this is an acceptable trade-off for data safety. The original issue author's suggestion to use Path.Combine(Path.GetDirectoryName(destinationFileName), Path.GetRandomFileName()) would guarantee same-volume, but the current approach using GetTempFileName() has better isolation and OS cleanup guarantees.

Verdict: Acceptable as-is given the safety benefits.

⚠️ Low: Unix file permissions on temp file

The UnixCreateMode is set in fileStreamOptions based on the archive entry's permissions, but when extracting to a temp file, those permissions apply to the temp file first. After File.Move, the destination inherits the temp file's permissions, which should be correct. Just noting this is working as intended.

💡 Style: Consider extracting temp file logic to helper

The temp file extraction logic is duplicated between sync and async methods. A future cleanup could extract this pattern, but it's acceptable for now given the clarity of each method.


Test Quality

Good coverage: The new test ExtractToFile_OverwritePreservesOriginalFileOnExtractionFailure properly:

  • Creates a file with known content
  • Creates a corrupted archive that will fail during decompression
  • Attempts extraction with overwrite: true
  • Verifies the original file is preserved after InvalidDataException
  • Tests both sync and async paths via [Theory] with Get_Booleans_Data

Exception type assertion: The test correctly asserts InvalidDataException specifically rather than a generic exception.


Security

No security concerns identified. This change only adds defensive behavior and does not introduce new attack surface.


Summary

Verdict: ✅ Approve

This is a well-implemented fix for a real data loss bug. The temporary file approach is the standard pattern for safe file overwrites. The code is consistent across sync/async paths, properly handles cleanup, and has good test coverage.

Minor considerations (cross-volume performance, code duplication) are acceptable trade-offs for the safety benefits.

@rzikm
Copy link
Member

rzikm commented Feb 10, 2026

/ba-g Test failures are unrelated

1 similar comment
@rzikm
Copy link
Member

rzikm commented Feb 11, 2026

/ba-g Test failures are unrelated

@rzikm rzikm merged commit ee24df6 into main Feb 11, 2026
87 of 89 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ZipArchiveEntry.ExtractToFile corrupt destination files on exception

3 participants

Comments