I recently encountered a zip file that can be extracted recursively forever, which got me curious about how one could build something like that. I therefore dedicated my time this hackathon to learning about data compression and quines, in the hope of creating a compressed archive file (e.g. .zip, .tar.gz) of my own that does the same.
I learnt about the use of:
- Lempel-Ziv algorithms
- Huffman coding
- DEFLATE compression specification
- Checksums
- Quirks in file binary data (little-endian, headers and footers, etc.)
Together, these are the pieces needed to understand how popular archive formats are built and stored.
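To make a few of these pieces concrete, here is a minimal sketch using Python's standard `gzip`, `zlib` and `struct` modules. It is not code from the project; the `payload` value is just an illustrative input. It shows where a gzip file keeps its magic bytes and how the CRC-32 checksum and uncompressed size sit in the footer as little-endian integers.

```python
import gzip
import struct
import zlib

# Illustrative input only; any bytes would do.
payload = b"hello, recursive archives" * 4

# gzip.compress wraps a DEFLATE stream in a gzip header and an 8-byte footer.
blob = gzip.compress(payload)

# Header: magic bytes 0x1f 0x8b, then the compression method (8 = DEFLATE).
magic, method = blob[:2], blob[2]

# Footer: CRC-32 of the uncompressed data, then its length mod 2**32,
# both stored as little-endian unsigned 32-bit integers ("<II").
stored_crc, stored_size = struct.unpack("<II", blob[-8:])

print(magic.hex(), method)
print(hex(stored_crc), hex(zlib.crc32(payload)))
print(stored_size, len(payload))
```

The footer is the awkward part for a self-reproducing archive: it is computed over the uncompressed data, so any change to that data changes the last eight bytes of the file.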
In the end I got close, but the checksums are self-referential: the CRC stored in the archive covers data that itself contains that CRC. That meant I needed to do some fancy algebra to find a CRC value that stays consistent in both the header and the data section of the archive file. I ran out of time and couldn't get it working completely :(
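One property that makes this kind of algebra tractable, shown here as a rough sketch rather than the approach I actually got working, is that CRC-32 is affine over XOR: for three messages of equal length, the CRC of their XOR equals the XOR of their CRCs. The helper `xor_bytes` is a hypothetical name introduced for this example; `zlib.crc32` is the standard library function.

```python
import os
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR equal-length byte strings together (hypothetical helper)."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)


# Three random messages of the same length.
a, b, c = (os.urandom(64) for _ in range(3))

# CRC of the combined message versus the combined CRCs.
lhs = zlib.crc32(xor_bytes(a, b, c))
rhs = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)

print(lhs == rhs)
```

Because of this, the effect of flipping any single bit of the file on the final CRC can be worked out ahead of time, and the search for a CRC field that matches its own file can in principle be phrased as a system of linear equations over bits instead of a brute-force search.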
I am, however, very happy with the progress I made and will attempt to create a streamlined solution, such as a Python package that adds a layer of abstraction over the "recursive archive problem" and can then be applied to the specifics of file types like zip and rar.
Built With
- deflate