**Reporter**: [Neal Richardson](https://issues.apache.org/jira/browse/ARROW-12633) / @nealrichardson

#### Related issues:

- [[C++][Compute] Implement TopK/BottomK](https://github.com/apache/arrow/issues/17579) (is a parent of)
- [[C++][Compute] Wrap grouped aggregation in an ExecNode](https://github.com/apache/arrow/issues/28501) (is a parent of)
- [[C++][Compute] Provide a registry for ExecNode implementations](https://github.com/apache/arrow/issues/29144) (is a parent of)
- [TPC-H Data Generator Node](https://github.com/apache/arrow/issues/175794) (is a parent of)
- [[C++][Compute] Add support for imperfect grouping for use in radix partitioning](https://github.com/apache/arrow/issues/27873) (is a parent of)
- [[C++][Compute] Use generic hash-aggregate for DictionaryArrays](https://github.com/apache/arrow/issues/28104) (is a parent of)
- [[C++][Compute] GroupBy: add unittests for individual components of hash group by](https://github.com/apache/arrow/issues/28466) (is a parent of)
- [[C++][Compute] GroupBy: add parallelism to hash group by](https://github.com/apache/arrow/issues/28468) (is a parent of)
- [[C++][Compute] GroupBy: support more than 2^32 groups](https://github.com/apache/arrow/issues/28469) (is a parent of)
- [[C++][Compute] Make GroupBy optimizations work on Big Endian architecture](https://github.com/apache/arrow/issues/28564) (is a parent of)
- [[C++][Compute] Support tagging ExecBatches with arbitrary extra information](https://github.com/apache/arrow/issues/18681) (is a parent of)
- [[C++] Add StopToken to ExecNode](https://github.com/apache/arrow/issues/28974) (is a parent of)
- [[C++][Compute] Add Find method to Grouper](https://github.com/apache/arrow/issues/29341) (is a parent of)
- [[C++][Compute] Add residual predicate support to new (Swiss) hash join](https://github.com/apache/arrow/issues/20339) (is a parent of)
- [[C++][Compute] Add dictionary support to new (Swiss) hash join](https://github.com/apache/arrow/issues/32494) (is a parent of)
- [[C++][Acero] Add Window Functions exec node](https://github.com/apache/arrow/issues/32813) (is a parent of)
- [[C++] Take kernel can't handle ChunkedArrays that don't fit in an Array](https://github.com/apache/arrow/issues/25822) (is a parent of)
- [[C++][Compute] Implement many-to-many inner hash join](https://github.com/apache/arrow/issues/29280) (is a parent of)
- [[C++][Compute] Hash Join support for dictionary](https://github.com/apache/arrow/issues/29767) (is a parent of)
- [[C++] Measure microperformance associated with ExecBatchIterator](https://github.com/apache/arrow/issues/25058) (is a parent of)
- [[C++][Compute] Add ExecNode hierarchy](https://github.com/apache/arrow/issues/27765) (is a parent of)
- [[C++][Dataset][Compute] Refactor Dataset scans to use an ExecNode graph](https://github.com/apache/arrow/issues/27767) (is a parent of)
- [[C++][Dataset][Compute] Replace UnionDataset with Union ExecNode](https://github.com/apache/arrow/issues/27813) (is a parent of)
- [[C++][Compute] Improve performance of the hash table used in GroupIdentifier](https://github.com/apache/arrow/issues/27841) (is a parent of)
- [[C++][Compute] GroupBy: improve performance by encoding keys in row format only when they are inserted into hash table](https://github.com/apache/arrow/issues/28467) (is a parent of)
- [[C++][Compute] Implement count_distinct/distinct hash aggregate kernels](https://github.com/apache/arrow/issues/28470) (is a parent of)
- [[C++][Compute] Document ExecNode, ExecPlan](https://github.com/apache/arrow/issues/18728) (is a parent of)
- [[C++][Dataset][Compute] Substitute ExecPlan impl for dataset scans](https://github.com/apache/arrow/issues/28923) (is a parent of)
- [[C++][Compute] Add ExecNode for semi and anti-semi join](https://github.com/apache/arrow/issues/28949) (is a parent of)
- [[C++][Compute] Add ScalarAggregateNode](https://github.com/apache/arrow/issues/28989) (is a parent of)
- [[C++][Compute] Join: add set membership test method to the grouper](https://github.com/apache/arrow/issues/29186) (is a parent of)
- [[C++][Compute] Add OrderByNode for ordering of rows in an ExecPlan](https://github.com/apache/arrow/issues/29192) (is a parent of)
- [[C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk](https://github.com/apache/arrow/issues/29194) (is a parent of)
- [[C++][Compute] Replace ExecNode::InputReceived with ::MakeTask](https://github.com/apache/arrow/issues/29223) (is a parent of)
- [[C++][Compute] Hash Join performance improvement](https://github.com/apache/arrow/issues/29768) (is a parent of)
- [[C++][Compute] Introduce Bloom filters to hash join](https://github.com/apache/arrow/issues/30736) (is a parent of)
- [[C++][Compute] Implement Bloom filter pushdown between hash joins](https://github.com/apache/arrow/issues/30973) (is a parent of)
- [[C++][Compute] Implement outer join with support for residual predicates](https://github.com/apache/arrow/issues/29281) (is a parent of)

<sub>**Note**: *This issue was originally created as [ARROW-12633](https://issues.apache.org/jira/browse/ARROW-12633). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>