As @encukou mentioned, a hypothetical Python implementation might not represent tuples internally as PyObject** (tagged pointers, etc.), and so a hypothetical function
PyObject ** PyTuple_Data(PyObject *)
specific to tuples might not be a sufficiently future-proof solution.
Yes, that is not a bad solution, but I think we can do better. If the underlying implementation has tuples stored as contiguous PyObject* pointers, then this copy is superfluous, and it would be nice to avoid it. In a more exotic implementation, the function could still malloc a dedicated output buffer, mass-Py_INCREF pointers, etc.
So my suggestion would be a slightly reduced version of my previous proposal that would be compatible with both of these requirements:
/**
 * PySequence_Items()
 *
 * Take a sequence 'seq' and return a pointer to a
 * `PyObject *` array representing its contents. The returned
 * object array is immutable: it may neither be changed by the
 * caller, nor will it reflect future changes performed to 'seq'.
 *
 * The 'owner_out' parameter is used to return a Python object
 * owning the returned memory region. It may or may not equal 'seq',
 * and no assumptions should be made about its type. Its only
 * purpose is lifetime management -- in particular, the caller
 * should `Py_DECREF(..)` this object once it no longer needs
 * access to the object array returned by this function.
 *
 * The 'size_out' parameter returns the number of elements in
 * the returned array.
 *
 * On failure, the function returns NULL and sets 'owner_out'
 * to NULL and 'size_out' to -1.
 */
PyObject *const *PySequence_Items(PyObject *seq,
                                  PyObject **owner_out,
                                  Py_ssize_t *size_out) {
    if (PyTuple_Check(seq)) {
        /* Fast path: expose the tuple's own storage, no copy. */
        *size_out = PyTuple_GET_SIZE(seq);
        *owner_out = Py_NewRef(seq);
        return ((PyTupleObject *) seq)->ob_item;
    } else {
        /* Generic path: snapshot the sequence into a new tuple. */
        PyObject *temp = PySequence_Tuple(seq);
        if (!temp) {
            *owner_out = NULL;
            *size_out = -1;
            return NULL;
        }
        *owner_out = temp;
        *size_out = PyTuple_GET_SIZE(temp);
        return ((PyTupleObject *) temp)->ob_item;
    }
}
A more exotic Python implementation simply would not have the special case for tuples. It would always allocate new memory for a pointer array, mass-incref the elements, and then return a Python capsule or similar via owner_out. When the capsule expires, it performs a mass-decref and releases the memory.
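To make that lifetime pattern concrete, here is a toy model of it in plain C. It deliberately does not use Python.h: `ToyObject`, `ToyOwner`, `toy_sequence_items` and `toy_owner_release` are hypothetical stand-ins for `PyObject`, the capsule, the proposed function, and the capsule destructor. Treat it as a sketch of the ownership flow under those assumptions, not as how any real implementation does it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted object (NOT a PyObject). */
typedef struct {
    int refcnt;
    int value;
} ToyObject;

void toy_incref(ToyObject *o) { o->refcnt++; }
void toy_decref(ToyObject *o) { o->refcnt--; } /* toy: never frees */

/* Toy stand-in for the capsule returned via owner_out. */
typedef struct {
    ToyObject **items;
    size_t size;
} ToyOwner;

/* Snapshot 'src' into a newly allocated pointer array, taking one
 * reference per element ("mass-incref"). Returns NULL if the
 * allocation fails. */
ToyObject *const *toy_sequence_items(ToyObject **src, size_t n,
                                     ToyOwner **owner_out,
                                     size_t *size_out) {
    ToyOwner *owner = malloc(sizeof *owner);
    ToyObject **items = malloc((n ? n : 1) * sizeof *items);
    if (!owner || !items) {
        free(owner);
        free(items);
        *owner_out = NULL;
        *size_out = 0;
        return NULL;
    }
    for (size_t i = 0; i < n; ++i) {
        items[i] = src[i];
        toy_incref(items[i]);
    }
    owner->items = items;
    owner->size = n;
    *owner_out = owner;
    *size_out = n;
    return items;
}

/* What the capsule destructor would do: drop one reference per
 * element ("mass-decref"), then release the array. */
void toy_owner_release(ToyOwner *owner) {
    for (size_t i = 0; i < owner->size; ++i)
        toy_decref(owner->items[i]);
    free(owner->items);
    free(owner);
}
```

The point of the model is that the caller's code is identical in both worlds: it only ever sees an array, a size, and an opaque owner to release. In CPython proper, the release step would be the `Py_DECREF(owner)` call.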
The caller would use this function as follows:
PyObject *seq = ...;
PyObject *owner;
Py_ssize_t size;
PyObject *const *items = PySequence_Items(seq, &owner, &size);
if (!items) { .. error handling .. }
for (Py_ssize_t i = 0; i < size; ++i) {
    .. something involving items[i] ..
}
Py_DECREF(owner);
What do you think?