Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Part of implementing StringView #5374
@XiangpengHao implemented gc, which compacts all the strings in a StringView/BinaryView into contiguous storage, in #5513.
However, that functionality does not deduplicate/intern the strings; it just copies them over.
Describe the solution you'd like
We should make it easy to deduplicate the strings in a StringView.
I do not think gc should deduplicate without an explicit ask (as deduplication is expensive).
Describe alternatives you've considered
- Do nothing (users can implement their own version of this code without any additional APIs)
- Add a new function (e.g. GenericBinaryView::dedupe) that deduplicates such arrays (likely not moving any strings, but just updating views)
- Add an argument to GenericBinaryView::gc that controls the behavior (so callers could also request deduplication when doing gc)
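To make the "just updating views" alternative concrete, here is a minimal sketch using only the standard library. It is not the arrow-rs API: `View`, `bytes_of`, and `dedupe_views` are hypothetical names, and the view is simplified to a (buffer, offset, length) triple, leaving out the inlined short strings and prefix bytes of the real layout.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a view (assumption: the real layout also
/// inlines short strings and stores a prefix; both are omitted here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct View {
    buffer: usize,
    offset: usize,
    len: usize,
}

/// Hypothetical helper: resolve a view to the bytes it points at.
fn bytes_of<'a>(buffers: &'a [Vec<u8>], v: &View) -> &'a [u8] {
    &buffers[v.buffer][v.offset..v.offset + v.len]
}

/// Hypothetical `dedupe`: rewrite the views so that equal values share
/// the view of their first occurrence. No string bytes are moved or
/// copied; only the views change.
fn dedupe_views(buffers: &[Vec<u8>], views: &[View]) -> Vec<View> {
    // Maps each distinct value to the first view that referenced it.
    let mut first_seen: HashMap<&[u8], View> = HashMap::new();
    views
        .iter()
        .map(|v| *first_seen.entry(bytes_of(buffers, v)).or_insert(*v))
        .collect()
}
```

Every value is hashed once, which is the cost that argues for keeping deduplication opt-in. After such a pass, buffers that are no longer referenced could presumably be dropped by a later gc.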
Additional context
@alexwilcoxson-rel asked in #5904 (comment)
Can/will this incorporate deduping/interning/implicitly using the gc function that landed recently?