Skip to content

Conversation

@osamakader
Copy link
Contributor

@osamakader osamakader commented Oct 25, 2025

When iterparse() opens a file by filename and is not explicitly closed, emit a ResourceWarning to alert developers of the resource leak.

This implements the TODO comment at line 1270 of ElementTree.py which has been requesting this feature since the close() method was added.

  • Add _closed flag to IterParseIterator to track state
  • Emit ResourceWarning in del if not closed
  • Add comprehensive test cases
  • Update existing tests to properly close iterators

@python-cla-bot
Copy link

python-cla-bot bot commented Oct 25, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Oct 25, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

When iterparse() opens a file by filename and is not explicitly closed,
emit a ResourceWarning to alert developers of the resource leak.

This implements the TODO comment at line 1270 of ElementTree.py which
has been requesting this feature since the close() method was added.

- Add _closed flag to IterParseIterator to track state
- Emit ResourceWarning in __del__ if not closed
- Add comprehensive test cases
- Update existing tests to properly close iterators

Signed-off-by: Osama Abdelkader <[email protected]>
@osamakader osamakader force-pushed the issue-140601-iterparse-warning branch from 54742a1 to bf3f888 Compare October 25, 2025 20:50
@cmaloney
Copy link
Contributor

Please don't force push to CPython PRs; it's preferred to keep commit history intact for easier review. See: https://devguide.python.org/getting-started/pull-request-lifecycle/#quick-guide

@cmaloney
Copy link
Contributor

cc: @serhiy-storchaka who added the close() ResourceWarning comment

ashm-dev

This comment was marked as duplicate.

it = iterparse(SIMPLE_XMLFILE)
del it
import gc
gc.collect() # Ensure previous iterator is cleaned up
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to use support.gc_collect() instead of directly calling gc.collect() was specifically added for use in tests to ensure consistent and deterministic garbage collection behavior across platforms and configurations.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests were intentionally added to check that the resource is closed without explicit close(). Don't add close(). Just replace check_no_resource_warning() with assertWarns(ResourceWarning) and add gc_collect(). There are other tests that explicitly call close().

- Use nonlocal close_source instead of _closed flag
- Bind warnings.warn as default parameter in __del__
- Use assertWarns(ResourceWarning) instead of adding close() calls
- Use support.gc_collect() for consistent test behavior
- Remove broad exception handling

Signed-off-by: Osama Abdelkader <[email protected]>
Signed-off-by: Osama Abdelkader <[email protected]>
Signed-off-by: Osama Abdelkader <[email protected]>
F-strings can fail during cleanup. Use % formatting like
subprocess.Popen does for safer __del__ behavior.

Signed-off-by: Osama Abdelkader <[email protected]>
Check that warning message contains 'unclosed file' and the
filename to ensure it's from iterparse, not the file destructor.

Addresses review feedback from serhiy-storchaka.

Signed-off-by: Osama Abdelkader <[email protected]>
The ResourceWarning tests are already covered in test_iterparse()
as noted by serhiy-storchaka in review.

Signed-off-by: Osama Abdelkader <[email protected]>
Removing source=self prevents unraisable exception in C extension.
The warning message still contains the filename via %r formatting.

Signed-off-by: Osama Abdelkader <[email protected]>
@osamakader
Copy link
Contributor Author

Hi @serhiy-storchaka

I've addressed all the review feedback:
All tests pass locally. However, CI shows "Unraisable exception"
warnings in the C extension tests (test_xml_etree_c). The ResourceWarning
is still being emitted correctly, but it's reported as "unraisable".

Is this expected behavior for warnings in __del__ with C extensions,
or do I need to adjust the implementation?

Thanks!

@serhiy-storchaka
Copy link
Member

No, it is not an expected behavior.

I have fixed the warnings. The underlying file is closed when the iterator has been exhausted or raised an exception, so no explicit close() is needed in this case. And many tests depended on this. I have removed a ResourceWarning in such case.

Please add a versionchanged directive in the module documentation. I am not sure about the What's New document. This is a minor change, so it may not need a What's New entry.

Document that iterparse now emits ResourceWarning in Python 3.15
when not explicitly closed.

Signed-off-by: Osama Abdelkader <[email protected]>
@osamakader
Copy link
Contributor Author

Thanks @serhiy-storchaka for the support.

@serhiy-storchaka
Copy link
Member

I did not merge this PR yet, because there is an important question -- should the warning be emitted if the generator was automatically closed after iterating to the end (or after parsing error)? I silenced them, otherwise many existing tests would need an explicit close().

But now I think that the warning should be emitted even if the file was closed. Yes, this will force users to add an explicit close() after iteration (or use with closing(iterparse(...))) in the working code, but this will save them in exceptional cases -- when the iteration was interrupted and never finished.

@cmaloney
Copy link
Contributor

👍 to always emitting the warning. I think more critical in potential data loss cases (ex. there's a write buffer) but it's also worthwhile when it helps prevent hard to track down easy to resolve bugs

Comment on lines +1274 to +1277
try:
_warn(f"unclosed iterparse iterator {source.name!r}", ResourceWarning, stacklevel=2)
finally:
source.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is better?

Suggested change
try:
_warn(f"unclosed iterparse iterator {source.name!r}", ResourceWarning, stacklevel=2)
finally:
source.close()
name = getattr(source, 'name', None)
if name:
_warn("unclosed iterparse iterator %r" % (name,),
ResourceWarning, stacklevel=2)
source.close()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source.name always exist and not empty. This code is only executed when iterparse() opened a file by name.

BTW, if the iterparse() argument was file descriptor 0, your variant would not work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - I missed that close_source=True guarantees source.name exists, and if name: breaks on fd 0. Thanks for catching that!

The original code is correct. (Though % formatting might be slightly safer than f-string in __del__, but not a blocker.)

nonlocal close_source
if close_source:
source.close()
close_source = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this option be better here?

try:
  if close_source:
    close_source = False
    source.close()
finally:
  gen.close()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder. It seems that only exception can be raised here is a KeyboardInterrupt. The difference between calling and not calling gen.close() is whether you can continue iterating after iterparse().close() (cached elements can still be yielded). But if iterparse().close() failed, you do not have any guarantees. Anyway, the file is closed first, so there will not be leaks. I think there will be no practical difference, so we can keep the simplest code.


# Not exhausting the iterator still closes the resource (bpo-43292)
with warnings_helper.check_no_resource_warning(self):
def test_iterparse_not_close(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to split one huge test into several smaller ones, since this test has already become too big?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. There are 4 different cases here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other big tests here, so it is worth to refactor them before merging this PR. See #141499.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

@serhiy-storchaka serhiy-storchaka merged commit a486d45 into python:main Nov 13, 2025
46 checks passed
CuriousLearner added a commit to CuriousLearner/cpython that referenced this pull request Nov 13, 2025
* main: (463 commits)
  pythongh-140601: Add ResourceWarning to iterparse when not closed (pythonGH-140603)
  pythongh-137969: Fix double evaluation of `ForwardRef`s which rely on globals (python#140974)
  pythongh-139109: A new tracing JIT compiler frontend for CPython (pythonGH-140310)
  pythongh-141004: Document `PyErr_RangedSyntaxLocationObject` (python#141521)
  pythongh-140873: Add support of non-descriptor callables in functools.singledispatchmethod() (pythonGH-140884)
  pythongh-139653: Add PyUnstable_ThreadState_SetStackProtection() (python#139668)
  pythongh-141004: Document `PyCode_Optimize` (pythonGH-141378)
  pythongh-141004: Document C APIs for dictionary keys, values, and items (pythonGH-141009)
  pythongh-137959: Fix `TIER1_TO_TIER2` macro name in JIT InternalDocs (pythonGH-141496)
  pythongh-139871: Add `bytearray.take_bytes([n])` to efficiently extract `bytes` (pythonGH-140128)
  pythongh-140601: Refactor ElementTree.iterparse() tests (pythonGH-141499)
  pythongh-135801: Add the module parameter to compile() etc (pythonGH-139652)
  pythongh-140260: fix data race in `_struct` module initialization with subinterpreters (python#140909)
  pythongh-137109: refactor warning about threads when forking (python#141438)
  pythongh-141004: Document `PyRun_InteractiveOneObject` (pythonGH-141405)
  pythongh-124111: Fix TCL 9 thread detection (pythonGH-128103)
  pythongh-141442: Add escaping to iOS testbed arguments (python#141443)
  pythongh-140936: Fix JIT assertion crash at finalization if some generator is alive (pythonGH-140969)
  Add details about JIT build infrastructure and updating dependencies to `Tools/jit` (python#141167)
  pythongh-141412: Use reliable target URL for urllib example (pythonGH-141428)
  ...
CuriousLearner added a commit to CuriousLearner/cpython that referenced this pull request Nov 14, 2025
* 'main' of github.com:python/cpython: (464 commits)
  pythongh-140601: Add ResourceWarning to iterparse when not closed (pythonGH-140603)
  pythongh-137969: Fix double evaluation of `ForwardRef`s which rely on globals (python#140974)
  pythongh-139109: A new tracing JIT compiler frontend for CPython (pythonGH-140310)
  pythongh-141004: Document `PyErr_RangedSyntaxLocationObject` (python#141521)
  pythongh-140873: Add support of non-descriptor callables in functools.singledispatchmethod() (pythonGH-140884)
  pythongh-139653: Add PyUnstable_ThreadState_SetStackProtection() (python#139668)
  pythongh-141004: Document `PyCode_Optimize` (pythonGH-141378)
  pythongh-141004: Document C APIs for dictionary keys, values, and items (pythonGH-141009)
  pythongh-137959: Fix `TIER1_TO_TIER2` macro name in JIT InternalDocs (pythonGH-141496)
  pythongh-139871: Add `bytearray.take_bytes([n])` to efficiently extract `bytes` (pythonGH-140128)
  pythongh-140601: Refactor ElementTree.iterparse() tests (pythonGH-141499)
  pythongh-135801: Add the module parameter to compile() etc (pythonGH-139652)
  pythongh-140260: fix data race in `_struct` module initialization with subinterpreters (python#140909)
  pythongh-137109: refactor warning about threads when forking (python#141438)
  pythongh-141004: Document `PyRun_InteractiveOneObject` (pythonGH-141405)
  pythongh-124111: Fix TCL 9 thread detection (pythonGH-128103)
  pythongh-141442: Add escaping to iOS testbed arguments (python#141443)
  pythongh-140936: Fix JIT assertion crash at finalization if some generator is alive (pythonGH-140969)
  Add details about JIT build infrastructure and updating dependencies to `Tools/jit` (python#141167)
  pythongh-141412: Use reliable target URL for urllib example (pythonGH-141428)
  ...
CuriousLearner added a commit to CuriousLearner/cpython that referenced this pull request Nov 14, 2025
* 'main' of github.com:python/cpython: (464 commits)
  pythongh-140601: Add ResourceWarning to iterparse when not closed (pythonGH-140603)
  pythongh-137969: Fix double evaluation of `ForwardRef`s which rely on globals (python#140974)
  pythongh-139109: A new tracing JIT compiler frontend for CPython (pythonGH-140310)
  pythongh-141004: Document `PyErr_RangedSyntaxLocationObject` (python#141521)
  pythongh-140873: Add support of non-descriptor callables in functools.singledispatchmethod() (pythonGH-140884)
  pythongh-139653: Add PyUnstable_ThreadState_SetStackProtection() (python#139668)
  pythongh-141004: Document `PyCode_Optimize` (pythonGH-141378)
  pythongh-141004: Document C APIs for dictionary keys, values, and items (pythonGH-141009)
  pythongh-137959: Fix `TIER1_TO_TIER2` macro name in JIT InternalDocs (pythonGH-141496)
  pythongh-139871: Add `bytearray.take_bytes([n])` to efficiently extract `bytes` (pythonGH-140128)
  pythongh-140601: Refactor ElementTree.iterparse() tests (pythonGH-141499)
  pythongh-135801: Add the module parameter to compile() etc (pythonGH-139652)
  pythongh-140260: fix data race in `_struct` module initialization with subinterpreters (python#140909)
  pythongh-137109: refactor warning about threads when forking (python#141438)
  pythongh-141004: Document `PyRun_InteractiveOneObject` (pythonGH-141405)
  pythongh-124111: Fix TCL 9 thread detection (pythonGH-128103)
  pythongh-141442: Add escaping to iOS testbed arguments (python#141443)
  pythongh-140936: Fix JIT assertion crash at finalization if some generator is alive (pythonGH-140969)
  Add details about JIT build infrastructure and updating dependencies to `Tools/jit` (python#141167)
  pythongh-141412: Use reliable target URL for urllib example (pythonGH-141428)
  ...
StanFromIreland pushed a commit to StanFromIreland/cpython that referenced this pull request Dec 6, 2025
…thonGH-140603)

When iterparse() opens a file by filename and is not explicitly closed,
emit a ResourceWarning to alert developers of the resource leak.

Signed-off-by: Osama Abdelkader <[email protected]>
Co-authored-by: Serhiy Storchaka <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants