Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions Lib/test/test_zipimport.py
Original file line number Diff line number Diff line change
Expand Up @@ -933,6 +933,20 @@ def assertDataEntry(name):

self.doTestWithPreBuiltZip(".py", "module")

def testImportSubmodulesInZip(self):
with ZipFile(TEMP_ZIP, "w") as z:
z.mkdir("a")
z.writestr("a/__init__.py", b'')
z.mkdir("a/b")
z.mkdir("a/b/c")
z.writestr("a/b/c/__init__.py", b'def foo(): return "foo"')

importer = zipimport.zipimporter(TEMP_ZIP)
spec = importer.find_spec("a.b.c")
mod = importlib.util.module_from_spec(spec)
importer.exec_module(mod)
self.assertEqual(mod.foo(), "foo")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If here test the namespace module in zip, mkdir() must be removed, this bug is triggered by no directory entry in zip.

And don't do test via zipimport, that always fail/succeed, because find functons depend the loader's path prefix. Direct import namespace module can cover all cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused on what you want here - the test passes, why should mkdir() be removed?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this bug is triggered by no directory entry in zip. If the directory entry is already exists, everything is OK.


@support.requires_zlib()
class CompressedZipImportTestCase(UncompressedZipImportTestCase):
Expand Down
13 changes: 12 additions & 1 deletion Lib/zipimport.py
Original file line number Diff line number Diff line change
Expand Up @@ -305,7 +305,18 @@ def __repr__(self):
# Given a module name, return the potential file path in the
# archive (without extension).
def _get_module_path(self, fullname):
return self.prefix + fullname.rpartition('.')[2]
path = self.prefix + fullname.rpartition(".")[2]

# Here be dragons.
# Originally, this *only* used rpartition('.')[2] to find
# the last item in a namespace package, but that made it basically
# impossible to use nested submodules.
for suffix, _, _ in [*_zip_searchorder, (path_sep, False, False)]:
if (path + suffix) in self._get_files():
# OK, we found it the old way. Let's return it.
return path

return self.prefix + fullname.replace(".", path_sep)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As https://github.com/python/cpython/pull/121161/files#r1660062444 described, this patch has no effect, shoud patch _is_dir or _read_directory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As https://github.com/python/cpython/pull/121161/files#r1660062444 described, this patch has no effect, shoud patch _is_dir or _read_directory.

From what I found, there's no problem with _read_directory, and _is_dir is only relevant to find_spec (the problem still occurs in get_code, for example). This patch seems to fix the issue for both. Have you tried this fix on this issue?

Copy link

@SeaHOH SeaHOH Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For namespace package's dealing and operation, there has already clearly explained in the source code comment. You can see them in find_loader (<=py311), find_spec and _is_dir.

And, why your test with get_code (and get_filename, get_source, is_package also) was fail? Because find functons depend the loader's path prefix, different level modules have different level loaders, skip levels will be failed. Yes, this is _get_module_path which you patched, then it broken, return the full path at once instead of stepwise lookup.

If you like to test into zipimport functions, it should be get_data:

    def testImportSubmodulesInZip(self):
        with ZipFile(TEMP_ZIP, "w") as z:
            z.writestr("a/__init__.py", b'')
            z.writestr("a/b/c/__init__.py", b'def foo(): return "foo"')

        importer = zipimport.zipimporter(TEMP_ZIP + "/a/")
        spec = importer.find_spec("a.b")
        self.assertEqual(spec.loader, None)
        self.assertEqual(spec.submodule_search_locations,
                         [(TEMP_ZIP + "/a/b").replace("/", zipimport.path_sep)])
        self.assertRaises(zipimport.ZipImportError, importer.get_code, "a.b")
        self.assertEqual(importer.get_data("a/b/"), b"")

        sys.path.insert(0, TEMP_ZIP)
        try:
            import a.b.c
            self.assertIn('namespace', str(a.b).lower())
            self.assertEqual(a.b.c.foo(), "foo")
        finally:
            del sys.path[0]
            sys.modules.pop('a.b.c', None)
            sys.modules.pop('a.b', None)
            sys.modules.pop('a', None)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @SeaHOH that the problem is rather in _read_directory(). I created #121233 for this approach.


# Does this path represent a directory?
def _is_dir(self, path):
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Fixed ``zipimport.zipimporter`` improperly handling nested submodules inside
of a zip file.