Skip to content

Conversation

@carljm
Copy link
Member

@carljm carljm commented May 19, 2018

@carljm
Copy link
Member Author

carljm commented May 19, 2018

(Failing CI builds are asyncio flakiness, not related.)

Lib/functools.py Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When cls is None? When __get__ is called without cls parameter?
Btw in docs for descriptors, the third argument is called owner IIRC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK every call to __get__ generated by attribute lookup machinery will include the second (not including self) argument, but both the descriptor documentation (https://docs.python.org/3/reference/datamodel.html#descriptors) and the descriptor howto guide (https://docs.python.org/3/howto/descriptor.html) explicitly include an example of manually calling a descriptor's __get__ method providing only one argument. Also, all of the code examples in the HOWTO show the second argument as optional. Given that the method can be called directly and cls is unused in this implementation, it seems reasonable to follow that lead and make it optional.

Regarding the argument name, considering both the Data Model documentation and the HOWTO, we see examples in Python's own documentation of descriptors whose __get__ second argument is named owner, type, or objtype. All three of these are found in the standard library, along with cls, klass, ownerclass, and _type. There doesn't seem to be a standard usage. Personally I find cls or klass or objtype clearer, as owner doesn't clarify the key difference from the other argument, which is that it's a type. Perhaps naming the two arguments obj and objtype does the best job of clearly indicating their relationship. That said, I don't feel strongly and will rename to owner if you prefer that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt if howto is 100% correct.

datamodel doesn't allow None for owner, AFAIK Python aways pass a class type.

Using default for always specified argument confuses me.

datamodel uses owner, please do it as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to change recommended argument name -- let's discuss it in another issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed signature of __get__ to (self, instance, owner) to match Data Model documentation.

Lib/functools.py Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__getitem__ / setdefault are atomic operations.
The code could be modified in the following way:

  1. Get attribute value without locking.
  2. If cached attr is not found -- acquire the lock, get the value again under the lock and calculate it if not present.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will update. Thanks for the review!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@carljm
Copy link
Member Author

carljm commented May 20, 2018

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@asvetlov: please review the changes made to this pull request.

@carljm
Copy link
Member Author

carljm commented May 20, 2018

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@asvetlov: please review the changes made to this pull request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the method is never called

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that’s expected. The point of this class is just to test the error message; cached property on classes with slots is not supported. Would you find it clearer if I used pass as the body? I just wrote it the way someone realistically might, to make it clear that the error is not due to a mistake in the implementation of the method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raise RuntimeError('never executed') is even more clear.

@carljm
Copy link
Member Author

carljm commented May 22, 2018

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@asvetlov: please review the changes made to this pull request.

Copy link
Contributor

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Would be nice to get confirmation from code owners, @rhettinger and @ncoghlan

@asvetlov
Copy link
Contributor

Ooop. @carljm please resolve merging conflicts.

@carljm carljm force-pushed the cached-property branch from 3e53053 to f77ff04 Compare May 30, 2018 15:38
Copy link
Contributor

@sir-sigurd sir-sigurd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work properly in this case:

In [75]: class A:
    ...:     def f(self, x=1):
    ...:         return x
    ...: 
    ...:     cached_f = cached_property(f)

In [76]: a = A()

In [77]: a.f(2)
Out[77]: 2

In [78]: a.cached_f
Out[78]: 1

In [79]: a.f(2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-79-3226073480ca> in <module>()
----> 1 a.f(2)

TypeError: 'int' object is not callable

So __set_name__() should be use to determine attrname.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? It seems to be always shadowed in __init__().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant line?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer _data to imply dataset.data = newdata is not supported.

@carljm
Copy link
Member Author

carljm commented Jun 29, 2018

Thanks for the comments! Will update next week.

@methane
Copy link
Member

methane commented Jul 27, 2018

@carljm ping?

@carljm
Copy link
Member Author

carljm commented Jul 28, 2018

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@asvetlov, @methane: please review the changes made to this pull request.

Lib/functools.py Outdated
class cached_property:
def __init__(self, func):
self.func = func
self.attrname = func.__name__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks redundant since attrname is set in __set_name__()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you suggest to do instead? This is the most reasonable fallback in case __set_name__ is not called in some unusual use of this class, and it avoids the object ever being in an incompletely initialized state.

Lib/functools.py Outdated
self.lock = RLock()

def __set_name__(self, owner, name):
self.attrname = name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might make sense to raise error in this case, since a will never be cached:

class C:
    @cached_property
    def a(self):
        pass
    b = a

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I guess that also answers the above question; we will raise error in case __set_name__ is ever called twice, and also I guess raise error if __get__ is called and __set_name__ was not.

@carljm
Copy link
Member Author

carljm commented Jul 29, 2018

@sir-sigurd How does that look?

@methane
Copy link
Member

methane commented Jul 30, 2018

@rhettinger Would you review this?

Lib/functools.py Outdated
else:
raise TypeError(
"Cannot assign the same cached_property to two names "
f"({self.attrname!r} and {name!r}) on the same class."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message is inaccurate because the exception is also raised when it's assigned on the different classes.
I'm not sure that we should disallow to assign it on the different classes under the same name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I'd like to avoid tracking owners, but it should be sufficient to just allow repeated __set_name__ as long as the name remains the same.

@carljm
Copy link
Member Author

carljm commented Jul 30, 2018

@sir-sigurd Better?

@sir-sigurd
Copy link
Contributor

@carljm 👍

Copy link
Contributor

@ncoghlan ncoghlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments and error checking suggestion related to the technicalities of why __dict__ may not exist, and why it may not support __setitem__, but otherwise this looks good to me.


.. note::

This decorator does not support classes which define ``__slots__``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd word this note slightly differently, as there are multiple ways to end up with instances that don't have a mutable mapping as their __dict__ attribute, and you can also include __dict__ in a __slots__ list if you want the fast C level access for selected fields, but aren't worried about the per-instance dict memory overhead.

Something like:

This decorator requires that the __dict__ attribute on each instance be a mutable mapping. This means it will not work with some types, such as metaclasses (since the __dict__ attributes on type instances are read-only proxies for the class namespace), and those that specify __slots__ without including __dict__ as one of the defined slots (as such classes don't provide a __dict__ attribute at all).

Lib/functools.py Outdated
"Cannot use cached_property instance without calling __set_name__ on it.")
try:
cache = instance.__dict__
except AttributeError: # objects with __slots__ have no __dict__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tweak the comment here along similar lines to the note in the docs

Lib/functools.py Outdated
# check if another thread filled cache while we awaited lock
val = cache.get(self.attrname, _NOT_FOUND)
if val is _NOT_FOUND:
val = cache[self.attrname] = self.func(instance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the AttributeError above, it's likely worth wrapping TypeError here to account for cases like:

>>> str.__dict__["x"] = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'mappingproxy' object does not support item assignment

carljm added 2 commits August 19, 2018 12:40
* master: (107 commits)
  bpo-22057: Clarify eval() documentation (pythonGH-8812)
  bpo-34318: Convert deprecation warnings to errors in assertRaises() etc. (pythonGH-8623)
  bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (pythonGH-8741)
  bpo-34415: Updated logging.Formatter docstring. (pythonGH-8811)
  bpo-34432: doc Mention complex and decimal.Decimal on str.format not about locales (pythonGH-8808)
  bpo-34381: refer to 'Running & Writing Tests' in README.rst (pythonGH-8797)
  Improve error message when mock.assert_has_calls fails (pythonGH-8205)
  Warn not to set SIGPIPE to SIG_DFL (python#6773)
  bpo-34419: selectmodule.c does not compile on HP-UX due to bpo-31938 (pythonGH-8796)
  bpo-34418: Fix HTTPErrorProcessor documentation (pythonGH-8793)
  bpo-34391: Fix ftplib test for TLS 1.3 (pythonGH-8787)
  bpo-34217: Use lowercase for windows headers (pythonGH-8472)
  bpo-34395: Fix memory leaks caused by incautious usage of PyMem_Resize(). (pythonGH-8756)
  bpo-34405: Updated to OpenSSL 1.1.0i for Windows builds. (pythonGH-8775)
  bpo-34384: Fix os.readlink() on Windows (pythonGH-8740)
  closes bpo-34400: Fix undefined behavior in parsetok(). (pythonGH-4439)
  bpo-34399: 2048 bits RSA keys and DH params (python#8762)
  Make regular expressions in test_tasks.py raw strings. (pythonGH-8759)
  smtplib documentation fixes (pythonGH-8708)
  Fix misindented yaml in logging how to example (pythonGH-8604)
  ...
@carljm
Copy link
Member Author

carljm commented Aug 19, 2018

@ncoghlan Thanks for the comments! I think they are all addressed in the latest update.

@ncoghlan ncoghlan merged commit d658dea into python:master Aug 28, 2018
@bedevere-bot
Copy link

@ncoghlan: Please replace # with GH- in the commit message next time. Thanks!

@codingpy
Copy link

codingpy commented Jun 8, 2019

haha, why not remove the underscore or rename it into cacheproperty like abstractproperty decorator, or reify like Pyramid?
If its name is neat and beautiful, I think it will be more Pythonic.
Just an opinion, never mind.

@wimglenn
Copy link
Contributor

wimglenn commented Feb 17, 2020

Thanks @carljm for this one, sensible to get it in stdlib since every man and his dog were writing their own versions. Is there any PyPI backport avail for people that want to use this stdlib implementation on older Python runtimes? Does the code use any datamodel features that may make copy-pasting the implementation unsafe (apart from the obvious f-strings)?

@carljm
Copy link
Member Author

carljm commented Feb 17, 2020

@wimglenn It's not exactly a backport, since (like many other versions) it predated this PR, and it has some additional bells and whistles not found here, but I'd probably use https://pypi.org/project/cached-property/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants