@vstinner vstinner commented Mar 21, 2019

New "object debugger" which checks frequently if all Python objects tracked
by the garbage collector are consistent: gc.enable_object_debugger()
and gc.disable_object_debugger().

  • Add new _PyObject_CheckConsistency() function
  • _PyUnicode_CheckConsistency() and _PyDict_CheckConsistency() are
    now exposed in the internal API. _PyDict_CheckConsistency()
    parameter type becomes PyObject*.
  • Add more checks to _PyType_CheckConsistency().

https://bugs.python.org/issue36389

@vstinner
Member Author

Thanks @csabella, I applied your suggestions. I rebased my PR to fix the commit message.

@vstinner
Member Author

Second version of my change.

The object debugger no longer calls the _PyXXX_CheckConsistency() functions, but rather reuses the tp_traverse mechanism and only implements the most basic consistency checks:

static void
gc_check_object_consistency(PyObject *op)
{
#define ASSERT(expr) _PyObject_ASSERT(op, (expr))

    ASSERT(op != NULL);
    ASSERT(!_PyObject_IsFreed(op));
    ASSERT(Py_REFCNT(op) >= 1);

    PyTypeObject *type = op->ob_type;
    ASSERT(type != NULL);
    ASSERT(!_PyObject_IsFreed((PyObject *)type));

#undef ASSERT
}

The documentation now explains which checks are implemented and says explicitly that the debugger is written to find bugs in C extensions. It now also mentions that the debugger relies on the debug hooks on memory allocators and briefly explains how to enable them.
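
For readers unfamiliar with the debug hooks mentioned above: in CPython they can be enabled with the PYTHONMALLOC=debug environment variable (or with -X dev on Python 3.7 and later). The following is only a minimal sketch of launching a child interpreter with the hooks installed; the helper name run_with_debug_hooks is hypothetical and not part of this PR.

```python
import os
import subprocess
import sys


def run_with_debug_hooks(code):
    """Run `code` in a child interpreter with PYTHONMALLOC=debug set.

    Hypothetical helper for illustration only: it shows one way to get
    the debug hooks on memory allocators installed, nothing more.
    """
    # Copy the current environment and request the debug allocator hooks.
    env = dict(os.environ, PYTHONMALLOC="debug")
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A C extension under test would be imported inside the code string passed to the helper, so that its allocations go through the hooked allocators.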

@vstinner
Member Author

Small update: rebased on top of the new, better _PyObject_IsFreed() implementation, and fixed where gc_check_object_debugger() is called in Modules/gcmodule.c.

New "object debugger" which checks frequently if all Python objects tracked
by the garbage collector are consistent: gc.enable_object_debugger()
and gc.disable_object_debugger().

* Py_FatalError() and _PyObject_AssertFailed() now disable the GC
  object debugger to prevent reentrant calls.
* Fix _PyObject_Dump() for ob_type=NULL

Disable debugger in PyObject_GC_Del() and _PyObject_GC_Resize(): the
debugger is just too slow.

@vstinner
Member Author

vstinner commented Apr 15, 2019

New rebase. I added multiple thresholds to reduce the performance overhead.
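
To illustrate why per-generation thresholds cut down how often older generations are visited, here is a simplified model of the rule documented for gc.set_threshold(): generation 0 is triggered once its counter exceeds threshold0, and an older generation is chosen once it has been passed over more than its own threshold. This is a sketch under stated assumptions, not the code from this PR. It ignores deallocations and any extra heuristics, the function name simulate_generation_calls is hypothetical, and whether the object debugger follows exactly the same rule is an assumption.

```python
def simulate_generation_calls(allocations, thresholds=(5, 10, 10)):
    """Count how often each generation would be visited (simplified model).

    Hypothetical illustration: every allocation bumps the generation 0
    counter; nothing is ever deallocated.
    """
    counts = [0] * len(thresholds)
    calls = [0] * len(thresholds)
    for _ in range(allocations):
        counts[0] += 1
        if counts[0] <= thresholds[0]:
            continue
        # Pick the oldest generation whose counter exceeds its threshold.
        gen = max(i for i in range(len(thresholds))
                  if counts[i] > thresholds[i])
        calls[gen] += 1
        # Visiting a generation counts as one event for the next older one.
        if gen + 1 < len(counts):
            counts[gen + 1] += 1
        # The visited generation and all younger ones start over.
        for i in range(gen + 1):
            counts[i] = 0
    return calls
```

In this model, with thresholds (5, 10, 10), generation 1 is visited once for every 11 visits of generation 0, so raising the older thresholds directly reduces how often the expensive full passes run.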

I compared the number of _PyGC_ObjectDebuggerGeneration(0) calls using gc.enable_object_debugger(5, 10, 10) vs collect(0) calls using gc.set_threshold(5, 10, 10): _PyGC_ObjectDebuggerGeneration(0) is called 10x more often.

My raw stats:

    gc.set_threshold(5)
    # STATS: collect(0) called: 62221
    # STATS: collect(1) called: 5653
    # STATS: collect(2) called: 70

vs

    gc.enable_object_debugger(5)
    # STATS: _PyGC_ObjectDebuggerGeneration(0) called: 387352
    # STATS: _PyGC_ObjectDebuggerGeneration(1) called: 38735
    # STATS: _PyGC_ObjectDebuggerGeneration(2) called: 3683

Ah, and _PyGC_ObjectDebuggerGeneration(2) is called 52x more than collect(2)...
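
As a quick check of the arithmetic, the per-generation ratios can be computed directly from the raw counts quoted above. This sketch uses only those six numbers; the variable and function names are made up for illustration. From these counts alone the ratios come out to roughly 6.2x, 6.9x and 52.6x for generations 0, 1 and 2, so the raw stats shown here confirm the 52x figure but do not by themselves reproduce the 10x figure for generation 0.

```python
# Raw call counts copied from the stats above.
collect_calls = {0: 62221, 1: 5653, 2: 70}
debugger_calls = {0: 387352, 1: 38735, 2: 3683}


def call_ratios(debugger, collector):
    """Return debugger-calls / collect-calls for each generation."""
    return {gen: debugger[gen] / collector[gen] for gen in collector}


ratios = call_ratios(debugger_calls, collect_calls)
for gen, ratio in sorted(ratios.items()):
    print(f"generation {gen}: {ratio:.1f}x")
```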

{
    struct _gc_object_debugger *debugger = &_PyRuntime.gc.object_debugger;
    debugger->enabled = 0;
    for (int i=0; i < NUM_GENERATIONS; i++) {
Member

Suggested change
-    for (int i=0; i < NUM_GENERATIONS; i++) {
+    for (int i = 0; i < NUM_GENERATIONS; i++) {

GC_OBJECT_ASSERT(op, type != NULL);
GC_OBJECT_ASSERT(op, !_PyObject_IsFreed((PyObject *)type));

#undef ASSERT
Member

What is ASSERT?



/*[clinic input]
gc.enable_object_debugger as gc_py_enable_object_debugger -> NoneType
Member

I would not use -> NoneType. It does not add clarity. Actually, returning Py_None without increfing it looks suspicious. It is clearer to use Py_RETURN_NONE.


This debugger aims to debug bugs in C extensions.

.. function:: enable_object_debugger(threshold)
Member

The signature does not match implementation.

    int threshold; /* collection threshold */
    int count; /* count of allocations or collections of younger
                  generations */
} generations[NUM_GENERATIONS];
Member

Sorry, I have not found where multiple generations are used?

@vstinner
Member Author

vstinner commented Jul 8, 2019

Abandoned for reasons explained at: https://bugs.python.org/issue36389#msg347497

@vstinner vstinner closed this Jul 8, 2019
@vstinner vstinner deleted the gc_object_debugger branch July 8, 2019 14:24