bpo-33597: Reduce PyGC_Head size#7043
Conversation
Include/objimpl.h
Outdated
There was a problem hiding this comment.
Seems _PyGC_generation0->gc.gc_prev can have the lowest bits set. This will trigger an assertion in _PyGCHead_SET_PREV().
There was a problem hiding this comment.
Only Python objects use lower bits. Link header (PyGC_Head without Python object) must not have flags.
(Surely, I need to add more comments.)
Include/objimpl.h
Outdated
There was a problem hiding this comment.
_PyGCHead_PREV(g) is called twice. Wouldn't be better to cache the result?
Modules/gcmodule.c
Outdated
There was a problem hiding this comment.
Since gc->gc.gc_next->gc.gc_prev is not real prev pointer, we can't call _PyTuple_MaybeUntrack(op) here.
I moved it to split function. But it means this pull request adds one more link traversal.
By the way, current code is not idiomatic. While most tuples can be untracked, there are still many tuples
which arn't be untracked. For example, __mro__ or __bases__.
In final generation, we check all tuples contents. In applications which triggers final generation GC frequently,
and there are many tuples, it can be significant overhead.
Anyway, I need to find "tracked tuple heavy" application to benchmark before optimize here.
There was a problem hiding this comment.
If final generation GC is triggered frequently, it means our heuristic is inadequate. Full collections can be very expensive.
|
|
||
| #if GC_DEBUG | ||
| static void | ||
| validate_list(PyGC_Head *head, uintptr_t expected_mask) |
There was a problem hiding this comment.
@pitrou @serhiy-storchaka
How do you think this function? Should I remove these checks and GC_DEBUG flag?
This validation was useful while hacking, and I think it's good document about when MASK flags are set and cleared.
But I can replace them to just a line of comment, of course.
There was a problem hiding this comment.
I'm ok with this function. Have you tried to always enable it in debug mode, or is it too slow?
There was a problem hiding this comment.
I enabled it while I'm debugging. It made test significantly slower.
vstinner
left a comment
There was a problem hiding this comment.
LGTM. The performance seems acceptable, and reducing the memory footprint can only be a good idea :-)
Include/objimpl.h
Outdated
| if (g->gc.gc_next != NULL) \ | ||
| Py_FatalError("GC object already tracked"); \ | ||
| _PyGCHead_SET_REFS(g, _PyGC_REFS_REACHABLE); \ | ||
| assert((g->gc.gc_prev & 6) == 0); \ |
Include/objimpl.h
Outdated
| g->gc.gc_next->gc.gc_prev = g->gc.gc_prev; \ | ||
| PyGC_Head *prev = _PyGCHead_PREV(g); \ | ||
| assert(g->gc.gc_next != NULL); \ | ||
| prev->gc.gc_next = g->gc.gc_next; \ |
There was a problem hiding this comment.
Hmm... I wonder if the look would less weird if we also had _PyGCHead_NEXT and _PyGCHead_SET_NEXT macros.
There was a problem hiding this comment.
I skip using _PyGCHead_PREV and _PyGCHead_SET_PREV when flags must not be set. (i.e. list header) too.
Include/objimpl.h
Outdated
| prev->gc.gc_next = g->gc.gc_next; \ | ||
| _PyGCHead_SET_PREV(g->gc.gc_next, prev); \ | ||
| g->gc.gc_next = NULL; \ | ||
| g->gc.gc_prev &= _PyGC_PREV_MASK_FINALIZED; \ |
There was a problem hiding this comment.
Is that the same as _PyGCHead_SET_PREV(g, NULL)? The latter seems clearer to me.
There was a problem hiding this comment.
Bit 1 must be kept after untracking.
But bit 2 and 3 must be cleared when untracking, because this macro may be called while GC
(i.e. weakref callback or tp_finalize).
There was a problem hiding this comment.
Why not adding a macro setting the pointer and flags at once? _PyGCHead_SET_PREV_FLAGS(g->gc.gc_next, prev, flags).
There was a problem hiding this comment.
Because it's not useful. gc_prev is used as:
- Set prev pointer, without touching any flags. (_PyTrash_deposit_object and _PyObject_GC_TRACK)
- Set prev pointer to list header node. Flags are not used in list header. (_PyObject_GC_TRACK)
- Clear prev pointer and flags, but keep only
_PyGC_PREV_MASK_FINALIZED. (only here.) - Clear gc_prev entirely.
= (uintptr_t)0.
Include/objimpl.h
Outdated
| union _gc_head *gc_prev; | ||
| Py_ssize_t gc_refs; | ||
| union _gc_head *gc_next; // NULL means the object is not tracked | ||
| uintptr_t gc_prev; |
There was a problem hiding this comment.
Can you add a small comment here explaining this field?
|
When you're done making the requested changes, leave the comment: |
I found Now I think keeping two implementation is not worth enough. |
It's not only about Windows, other OSes can do it too (perhaps Linux does?). |
|
You're right... Linux can use 3GB for user virtual address. And it's default for x86. Then, (a) Use int64_t instead of intpr_t, or (b) keep two implementation... I think I should try (b). |
|
I found UNREACHABLE flag is only needed in move_unreachable() essentially. Now we require 4bytes aligned pointer, and support 1G-1 refcnt on 32bit (4bytes for each pointer) platform. |
Include/objimpl.h
Outdated
| union _gc_head *gc_prev; | ||
| Py_ssize_t gc_refs; | ||
| union _gc_head *gc_next; // NULL means the object is not tracked | ||
| uintptr_t gc_prev; // Pointer to previous object in the list. |
There was a problem hiding this comment.
I changed member name to _gc_prev and _gc_next.
I don't want 3rd party touches them directly.
Include/objimpl.h
Outdated
| uintptr_t gc_prev; // Pointer to previous object in the list. | ||
| // Lowest three bits are used for flags. | ||
| } gc; | ||
| double dummy; /* force worst-case alignment */ |
There was a problem hiding this comment.
I removed dummy, and stop using union too.
| #define _PyGCHead_SET_PREV(g, p) do { \ | ||
| assert(((uintptr_t)p & ~_PyGC_PREV_MASK) == 0); \ | ||
| (g)->gc.gc_prev = ((g)->gc.gc_prev & ~_PyGC_PREV_MASK) \ | ||
| | ((uintptr_t)(p)); \ |
There was a problem hiding this comment.
I don't think it's worth enough...
Include/objimpl.h
Outdated
| #define _PyObject_GC_TRACK(o) do { \ | ||
| PyGC_Head *g = _Py_AS_GC(o); \ | ||
| if (_PyGCHead_REFS(g) != _PyGC_REFS_UNTRACKED) \ | ||
| if (g->gc.gc_next != NULL) \ |
Modules/gcmodule.c
Outdated
| } | ||
|
|
||
| static inline void | ||
| gc_clear_masks(PyGC_Head *g) |
There was a problem hiding this comment.
Since I removed MASK_TENTATIVELY_UNREACHABLE from gc_prev, I renamed this to gc_clear_collecting.
Modules/gcmodule.c
Outdated
| } | ||
|
|
||
| static inline void | ||
| gc_set_refs(PyGC_Head *g, Py_ssize_t v) |
Modules/gcmodule.c
Outdated
| } | ||
|
|
||
| static inline void | ||
| gc_reset_refs(PyGC_Head *g, Py_ssize_t v) |
Modules/gcmodule.c
Outdated
| (visitproc)visit_reachable, | ||
| (void *)young); | ||
| gc_set_prev(gc, prev); | ||
| gc->gc.gc_prev &= ~MASK_COLLECTING; |
Modules/gcmodule.c
Outdated
| } | ||
| gc = next; | ||
| } | ||
| young->gc.gc_prev = (uintptr_t)prev; |
| if (gc_get_refs(gc) != 0) { | ||
| ret = -1; | ||
| } | ||
| _PyGCHead_SET_PREV(gc, prev); |
There was a problem hiding this comment.
doubly-linked list is broken and restored in this function. So no need to add it in function description.
+ // Use gc_refs and break gc_prev again. comment is above in this function.
Include/objimpl.h
Outdated
| #define _PyGCHead_NEXT(g) ((PyGC_Head*)(g)->_gc_next) | ||
| #define _PyGCHead_SET_NEXT(g, p) ((g)->_gc_next = (uintptr_t)(p)) | ||
|
|
||
| // Lowest two bits of _gc_prev is used for flags described below. |
Objects/object.c
Outdated
|
|
||
| _PyRuntime.gc.trash_delete_later = | ||
| (PyObject*) _Py_AS_GC(op)->gc.gc_prev; | ||
| _PyRuntime.gc.trash_delete_later = (PyObject*) _Py_AS_GC(op)->_gc_prev; |
Objects/object.c
Outdated
|
|
||
| tstate->trash_delete_later = | ||
| (PyObject*) _Py_AS_GC(op)->gc.gc_prev; | ||
| tstate->trash_delete_later = (PyObject*) _Py_AS_GC(op)->_gc_prev; |
Modules/gcmodule.c
Outdated
| #define GC_NEXT _PyGCHead_NEXT | ||
| #define GC_PREV _PyGCHead_PREV | ||
|
|
||
| // Bit 0 of _gc_prev is used for _PyGC_PREV_MASK_FINALIZED in objimpl.h |
There was a problem hiding this comment.
The comment doesn't seem to match the line being commented.
Modules/gcmodule.c
Outdated
| // move_legacy_finalizers() removes this flag instead. | ||
| // Between them, unreachable list is not normal list and we can not use | ||
| // most gc_list_* functions for it. We should manually tweaking unreachable | ||
| // list in these two functions. |
There was a problem hiding this comment.
I don't understand "We should manually tweaking unreachable list in these two functions".
There was a problem hiding this comment.
Removed last sentence.
I meant we should manually set _gc_prev and _gc_next, like this:
+ // Manually unlink gc from unreachable list because
+ PyGC_Head *prev = GC_PREV(gc);
+ PyGC_Head *next = (PyGC_Head*)(gc->_gc_next & ~NEXT_MASK_UNREACHABLE);
+ assert(prev->_gc_next & NEXT_MASK_UNREACHABLE);
+ assert(next->_gc_next & NEXT_MASK_UNREACHABLE);
+ prev->_gc_next = gc->_gc_next; // copy NEXT_MASK_UNREACHABLE
|
If there are no more objections, I want to merge this in next week. |
https://bugs.python.org/issue33597