bpo-25711: Rewrite zipimport in pure Python. #6809

serhiy-storchaka · 2018-05-14T16:20:08Z

https://bugs.python.org/issue25711

brettcannon · 2018-08-17T17:49:31Z

Lib/zipimport.py

+
+        path = _get_module_path(self, fullname)
+        if mi:
+            fullpath = path + path_sep + '__init__.py'


_bootstrap_external._path_join()

What is the benefit of using _bootstrap_external._path_join()?

Readability. When I read path + path_sep + '__init__.py' I have to mentally piece together that this is constructing a file path, while _path_join(path, '__init__.py') gives me that context implicitly.
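To make the readability argument concrete, here is a minimal sketch contrasting the two spellings. Note that `path_join` below is a hypothetical stand-in written for this illustration. It is not the actual `_bootstrap_external._path_join()` implementation, whose details may differ between Python versions, and `os.sep` is assumed as a stand-in for zipimport's `path_sep`.

```python
import os

# Assumption: the platform separator stands in for zipimport's path_sep.
path_sep = os.sep


def path_join(*parts):
    """Hypothetical path-joining helper (illustration only)."""
    # Skip empty parts and strip trailing separators to avoid doubled separators.
    return path_sep.join(part.rstrip(path_sep) for part in parts if part)


path = "pkg"

# Spelling under review: the reader has to infer that this builds a file path.
by_concat = path + path_sep + "__init__.py"

# Suggested spelling: the helper's name states the intent.
by_helper = path_join(path, "__init__.py")
```

Both spellings produce the same string here; the difference is only in how quickly the intent can be read.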

brettcannon · 2018-08-17T17:51:45Z

Lib/zipimport.py

+        if mi:
+            fullpath = path + path_sep + '__init__.py'
+        else:
+            fullpath = path + '.py'


Might as well use an f-string.

What is the benefit of using an f-string instead of + for concatenating two strings?

I find it more readable.
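For reference, a small sketch of the two equivalent spellings being compared. The values of `archive`, `path`, and `path_sep` are hypothetical and chosen only for illustration.

```python
# Hypothetical values, for illustration only.
archive = "app.zip"
path = "pkg"
path_sep = "/"

# Concatenation with +, as in the original diff.
by_concat = path + ".py"

# The f-string spelling suggested in the review.
by_fstring = f"{path}.py"

# With three or more pieces the template shape of the f-string becomes more visible.
full_concat = archive + path_sep + path
full_fstring = f"{archive}{path_sep}{path}"
```

Which form reads better is a matter of taste, as the exchange above shows; the results are identical.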

brettcannon · 2018-08-17T17:54:17Z

Lib/zipimport.py

+                # add __path__ to the module *before* the code gets
+                # executed
+                path = _get_module_path(self, fullname)
+                fullpath = f'{self.archive}{path_sep}{path}'


_bootstrap_external._path_join()

brettcannon · 2018-08-17T17:55:41Z

Lib/zipimport.py

+
+# implementation
+
+def _unpack_uint32(data):


_bootstrap_external._r_long()

What should be done with _unpack_uint16? To me, _unpack_uint32 looks more descriptive than _r_long.

You can keep _unpack_uint16. I just don't think we need two implementations that do exactly the same thing: reading the bytes of a little-endian 4-byte int. If you don't like the name, feel free to rename it in importlib.
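As a rough sketch of what both helpers compute, little-endian unpacking can be expressed with `int.from_bytes`. The function names below are local to this example and are not claimed to match how either `zipimport` or `importlib` implements them.

```python
def unpack_uint16(data):
    """Convert 2 bytes in little-endian order to an integer."""
    return int.from_bytes(data, "little")


def unpack_uint32(data):
    """Convert 4 bytes in little-endian order to an integer."""
    return int.from_bytes(data, "little")


# The 4-byte signature that starts a ZIP end-of-central-directory record.
signature = b"PK\x05\x06"
value = unpack_uint32(signature)
```

Because the two bodies are identical apart from the expected length, the reviewer's point about not duplicating the 4-byte reader is easy to see.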

brettcannon · 2018-08-17T17:56:37Z

Lib/zipimport.py

+                    name = name.decode('latin1').translate(cp437_table)
+
+            name = name.replace('/', path_sep)
+            path = f'{archive}{path_sep}{name}'


_bootstrap_external._path_join()

berkerpeksag · 2018-08-19T11:16:44Z

Lib/zipimport.py

+
+def _unpack_uint16(data):
+    """Convert 2 bytes in little-endian to an integer."""
+    assert len(data) == 2


Nit: Wouldn't it be better if we raised ValueError here and in _unpack_uint32()?

In all cases the length of the data is checked before calling _unpack_uint*(). The assertion condition is always true.
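For context on the trade-off: `assert` statements are removed when Python runs with `-O`, whereas an explicit check always executes. The sketch below shows the `ValueError` variant the reviewer asked about; `unpack_uint16_checked` is a hypothetical name used only here, and the PR itself kept the assertion.

```python
def unpack_uint16_checked(data):
    """Convert 2 bytes in little-endian order to an integer, validating the length."""
    # Unlike an assert, this check is still performed under `python -O`.
    if len(data) != 2:
        raise ValueError(f"expected 2 bytes, got {len(data)}")
    return int.from_bytes(data, "little")
```

When every caller already validates the length, as the author states, the assertion documents an invariant rather than guarding against bad input.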

warsaw

Comments just based on reading the code. I'm going to test this branch with importlib_resources (standalone) next.

Lib/test/test_bdb.py

Lib/importlib/_bootstrap_external.py

Lib/test/test_zipimport.py

warsaw · 2018-09-11T01:02:50Z

Lib/zipimport.py

@@ -0,0 +1,640 @@
+'''zipimport provides support for importing Python modules from Zip archives.


Style nit: this should be a """ string.

warsaw · 2018-09-11T17:04:06Z

Lib/zipimport.py

+        try:
+            raw_data = fp.read(data_size)
+        except OSError:
+            raise OSError("zipimport: can't read data")


Is this just here to change the error message? I guess that's for compatibility with the message raised by the C implementation? I'm not sure how useful that is, and it does kind of hide the original exception (which might be useful for debugging). My inclination is to preserve the original OSError rather than produce one with a different message, but if you feel otherwise, why not use a raise from here too?

Removed the try block.

warsaw · 2018-09-11T17:04:23Z

Lib/zipimport.py

+    try:
+        decompress = _get_decompress_func()
+    except:
+        raise ZipImportError("can't decompress data; zlib not available")


except Exception and raise from

Added except Exception.

warsaw

The branch does work with importlib_resources AFAICT.

bedevere-bot · 2018-09-11T17:51:12Z

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

serhiy-storchaka

Thank you for your review @warsaw. I have addressed most of your comments. But AFAIK "from error" is not needed in Python code. Exceptions are chained by default. You need to add "from None" to suppress chaining.
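The chaining behavior described here can be checked with a short sketch. The function names and messages are made up for the example; only the exception attributes (`__context__`, `__cause__`, `__suppress_context__`) are standard Python.

```python
def implicit():
    try:
        raise OSError("low-level failure")
    except OSError:
        # No "from": the original is still attached as __context__.
        raise RuntimeError("can't read data")


def explicit():
    try:
        raise OSError("low-level failure")
    except OSError as exc:
        # "from exc" additionally records the original as __cause__.
        raise RuntimeError("can't read data") from exc


def suppressed():
    try:
        raise OSError("low-level failure")
    except OSError:
        # "from None" hides the original from the displayed traceback.
        raise RuntimeError("can't read data") from None


def capture(func):
    """Run func and return the RuntimeError it raises."""
    try:
        func()
    except RuntimeError as exc:
        return exc
```

In the implicit case the traceback shows "During handling of the above exception, another exception occurred"; with `from exc` it shows "The above exception was the direct cause of the following exception". Either way the original `OSError` stays visible unless `from None` is used.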

Lib/importlib/_bootstrap_external.py

Lib/test/test_bdb.py

Lib/test/test_zipimport.py

serhiy-storchaka · 2018-09-11T20:36:08Z

Lib/zipimport.py

+    'PQRSTUVWXYZ[\\]^_'
+    '`abcdefghijklmno'
+    'pqrstuvwxyz{|}~\x7f'
+


Just for readability. The part above is ASCII, with 16 characters per row; the part below is non-ASCII, with 8 characters per row. What comment should I add?
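To illustrate how such a table is used with `name.decode('latin1').translate(cp437_table)`, here is a minimal sketch. It derives the 256-character table from Python's `cp437` codec purely for demonstration, whereas the PR spells the table out as string literals; the helper name and the sample file name are assumptions of this example.

```python
# Index i of this 256-character string holds the character for CP437 byte i.
cp437_table = bytes(range(256)).decode("cp437")


def decode_cp437_name(raw):
    """Decode a CP437-encoded name via a latin1 round trip and a lookup table."""
    # latin1 maps each byte to the code point with the same value, so every
    # character's ordinal is a valid index into the 256-entry table.
    return raw.decode("latin1").translate(cp437_table)


# Hypothetical archive member name containing the CP437 byte 0x82.
name = decode_cp437_name(b"caf\x82.py")
```

The first 128 entries of the table are plain ASCII, which matches the author's description of an ASCII half and a non-ASCII half.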

serhiy-storchaka · 2018-09-11T20:36:46Z

Lib/zipimport.py

+        from zlib import decompress
+    except:
+        _bootstrap._verbose_message('zipimport: zlib UNAVAILABLE')
+        raise ZipImportError("can't decompress data; zlib not available")


The C implementation catches all exceptions. I agree that this is evil and will change it.
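The practical difference between a bare `except:` and `except Exception:` is that the former also catches `BaseException` subclasses such as `KeyboardInterrupt` and `SystemExit`. A small sketch, with helper names invented for the example:

```python
def run_with_bare_except(action):
    try:
        action()
    except:  # deliberately bare, to demonstrate the problem
        return "swallowed"
    return "ok"


def run_with_except_exception(action):
    try:
        action()
    except Exception:
        return "swallowed"
    return "ok"


def interrupt():
    # Simulates the user pressing Ctrl-C while the import is in progress.
    raise KeyboardInterrupt


def broken_import():
    raise ImportError("zlib not available")
```

With `except Exception`, an `ImportError` is still handled, but a `KeyboardInterrupt` propagates to the caller instead of being reported as a missing zlib.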

serhiy-storchaka · 2018-09-11T20:39:47Z

Lib/zipimport.py

+        try:
+            raw_data = fp.read(data_size)
+        except OSError:
+            raise OSError("zipimport: can't read data")


Removed the try block.

serhiy-storchaka · 2018-09-11T20:43:12Z

Lib/zipimport.py

+    try:
+        decompress = _get_decompress_func()
+    except:
+        raise ZipImportError("can't decompress data; zlib not available")


Added except Exception.

serhiy-storchaka

I prefer to make behavior-changing cleanups and remove unneeded tests in a separate PR.

serhiy-storchaka · 2018-09-17T11:15:17Z

Lib/zipimport.py

+        try:
+            toc_entry = self._files[key]
+        except KeyError:
+            raise OSError(0, '', key)


For raising a proper FileNotFoundError you need to specify the corresponding errno.

    import errno
    raise FileNotFoundError(errno.ENOENT, 'No such file or directory', key)

(and maybe using pathname is more proper than key).

I prefer to defer this change to later. It is nontrivial and may need additional discussion and tests.
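The point about errno can be seen directly: constructing `OSError` with a recognized errno yields the matching subclass, while errno 0 yields a plain `OSError`. The `key` value below is hypothetical.

```python
import errno

# Hypothetical archive member name, for illustration only.
key = "pkg/missing.py"

# What the diff raises: errno 0 maps to no specific subclass.
generic = OSError(0, "", key)

# With ENOENT, the OSError constructor returns a FileNotFoundError instance.
specific = OSError(errno.ENOENT, "No such file or directory", key)
```

Callers that write `except FileNotFoundError:` would catch `specific` but not `generic`, which is why the change affects behavior and was deferred.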

warsaw

I think we're down to one issue/question for this branch, a merge conflict, and deferring cleanups to a follow-on branch.

warsaw

Thanks for doing this - I think it will lead to great improvements in zipimport in the future. My only suggestion would be to open a bpo issue for the subsequent clean up branch, since there are things that we know can be improved on the next pass. I think this is a good change to land.

serhiy-storchaka · 2018-09-18T20:26:01Z

Thank you for your review @warsaw. I have created two follow-up PRs (#9404 and #9406) and one issue (bpo-34726). I will open more. I still don't know what to do with exceptions.

serhiy-storchaka added the type-feature and DO-NOT-MERGE labels May 14, 2018

serhiy-storchaka requested a review from a team May 14, 2018 16:20

the-knights-who-say-ni added the CLA signed label May 14, 2018

bedevere-bot added the awaiting merge label May 14, 2018

gpshead requested review from Yhg1s and brettcannon May 14, 2018 16:26

serhiy-storchaka and others added 2 commits July 8, 2018 10:33

bpo-25711: Rewrite zipimport in pure Python.

c090211

typo

fb93cdd

serhiy-storchaka force-pushed the zipimport branch from fcf8f37 to fb93cdd Compare July 8, 2018 07:38

serhiy-storchaka added 4 commits July 8, 2018 13:08

Update Python/importlib_zipimport.h.

561e14a

Fix importlib.resources.

8d3e66d

Fix tests.

c8a2cde

Remove zipimport.c from VC project.

0265548

serhiy-storchaka requested a review from a team as a code owner July 8, 2018 17:21

Remove references to PyInit_zipimport().

9269d84

brettcannon reviewed Aug 17, 2018

brettcannon requested a review from warsaw August 17, 2018 18:03

serhiy-storchaka added 3 commits August 17, 2018 22:29

Merge branch 'master' into zipimport

d34d754

Try to fix a build on Windows.

a683768

Fix tests on Windows.

3928725

berkerpeksag reviewed Aug 19, 2018

serhiy-storchaka added 2 commits August 25, 2018 09:21

Merge branch 'master' into zipimport

3ea8986

Address Brett's comments.

6a78b4c

brettcannon approved these changes Aug 31, 2018

warsaw reviewed Sep 11, 2018

warsaw requested changes Sep 11, 2018

bedevere-bot added awaiting changes and removed awaiting merge labels Sep 11, 2018

Merge branch 'master' into zipimport

14c8601

python deleted a comment from warsaw Sep 11, 2018

Addressed commit reviews.

5fed013

serhiy-storchaka commented Sep 11, 2018

serhiy-storchaka added 2 commits September 17, 2018 13:53

Merge branch 'master' into zipimport

3984d25

Add additional comments for the cp437 table.

33fa5a4

serhiy-storchaka commented Sep 17, 2018

warsaw requested changes Sep 17, 2018

Merge branch 'master' into zipimport

ec110a0

warsaw approved these changes Sep 18, 2018

bedevere-bot added awaiting merge and removed awaiting changes labels Sep 18, 2018

serhiy-storchaka merged commit 79d1c2e into python:master Sep 18, 2018

bedevere-bot removed the awaiting merge label Sep 18, 2018

serhiy-storchaka deleted the zipimport branch September 18, 2018 19:22

tirkarthi mentioned this pull request May 14, 2019

Reference zipimport source code from docs #13310

Merged

marcelotduarte mentioned this pull request Mar 1, 2020

correctly set __package__ in a module in a zip file marcelotduarte/cx_Freeze#609

Closed
