Skip to content

Conversation

@pmeier
Copy link
Contributor

@pmeier pmeier commented Feb 3, 2023

We had an offline discussion about this and the consensus is that right now there is no critical need for the labels to have a categories field. It is nice to have for visualization and if there is user demand for it later we can re-add it. Still, as is it is a liability since we haven't done any perf tests with it and there might be other implications from sharing the same list of categories between all label instances coming from a dataset.

This PR removes the attribute and all functionality that comes with it. Diff is so large because I ported the "old" behavior over to torchvision.prototype.datasets since they use the attribute quite a bit.

cc @bjuncek

Example:
>>> class BatchMultiCrop(transforms.Transform):
... def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.Label]):
... def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.LabelWithCategories]):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, auto refactoring was a little to eager. Need to revert this.

num_categories = len(inpt.categories)
output = one_hot(inpt.as_subclass(torch.Tensor), num_classes=num_categories)
return datapoints.OneHotLabel(output, categories=inpt.categories)
return datapoints.OneHotLabel(output)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does change the functionality, but not by much. Previously, users could use this transform and it would do the right thing by default if the label had some categories attached to it. Now they have to explicitly pass it to the constructor.

Note that the -1 default is somewhat weird. It is a sentinel to let the torch kernel know to "figure it out". This means it checks all values available in the input and infers the number of categories from them. Obviously, if the input doesn't contain the smallest and largest value, this will give false results.

@pmeier
Copy link
Contributor Author

pmeier commented Feb 6, 2023

We (@NicolasHug, @vfdev-5, and I) had a longer offline discussion about this and decided to not include the datapoints.Label and datapoints.OneHotLabel class in the first iteration of the transforms v2 release at all. I'll post a longer explanation for this and all other changes that we made the last couple of weeks compared to the initial announcement in #6753 as soon as we are sure that we can release the rest.

Closing this PR as we don't need to remove the categories any longer, if we don't release the label datapoints in the first place.

@pmeier pmeier closed this Feb 6, 2023
@pmeier pmeier deleted the rip-label-categories branch February 9, 2023 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants