Currently KernelExecutor handles preallocation of null bitmaps and other buffers based on simple flags on each Kernel. This is not very flexible, and we leave a lot of performance on the table in cases where we could preallocate but the behavior can't be expressed with the available flags. For example, binary_string_join_element_wise could preallocate all of its buffers (even the character buffer) and write output into slices.
Exposing preallocation as a public function would let us unit test it directly (currently Executors are tested only indirectly, by calling compute::Functions) and reuse it, for example to correctly preallocate a small temporary for pipelined execution.
One way this could be added is as a new method on each Kernel:
// Output preallocated Datums sufficient for execution of the kernel on each ExecBatch.
// The output Datums may not be identically chunked to the input batches, for example
// kernels which support contiguous output preallocation will preallocate a single Datum
// (and can then output into slices of that Datum).
Result<std::vector<Datum>> Kernel::prepare_output(
    const Kernel*,
    KernelContext*,
    const std::vector<ExecBatch>& inputs);
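To make the idea concrete, here is a minimal standalone sketch of contiguous preallocation for an element-wise string join. It does not use Arrow at all: Batch, StringOutput, PrepareOutput, ExecBatchInto and JoinAll are hypothetical names invented for illustration, and plain std::vector / std::string stand in for Arrow buffers. The point is only that the exact output size can be computed from the inputs up front, so both the offsets and the character buffer can be allocated once and each batch can write into its own slice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an input batch: two string columns joined element-wise.
struct Batch {
  std::vector<std::string> left;
  std::vector<std::string> right;
};

// Hypothetical stand-in for a preallocated string array:
// (rows + 1) offsets plus one contiguous character buffer.
struct StringOutput {
  std::vector<int32_t> offsets;
  std::string data;
};

// Plays the role of the proposed prepare_output: compute the exact output size
// across all batches and allocate every buffer once.
StringOutput PrepareOutput(const std::vector<Batch>& batches, const std::string& sep) {
  std::size_t rows = 0;
  std::size_t chars = 0;
  for (const Batch& batch : batches) {
    assert(batch.left.size() == batch.right.size());
    rows += batch.left.size();
    for (std::size_t i = 0; i < batch.left.size(); ++i) {
      chars += batch.left[i].size() + sep.size() + batch.right[i].size();
    }
  }
  StringOutput out;
  out.offsets.assign(rows + 1, 0);
  out.data.assign(chars, '\0');
  return out;
}

// Write one batch into the slice of `out` that starts at `row_offset`.
// Returns the row offset at which the next batch should start.
std::size_t ExecBatchInto(const Batch& batch, const std::string& sep,
                          std::size_t row_offset, StringOutput* out) {
  std::size_t pos = static_cast<std::size_t>(out->offsets[row_offset]);
  auto append = [&](const std::string& piece) {
    // The preallocated buffer must already be large enough; nothing grows here.
    assert(pos + piece.size() <= out->data.size());
    std::copy(piece.begin(), piece.end(),
              out->data.begin() + static_cast<std::ptrdiff_t>(pos));
    pos += piece.size();
  };
  for (std::size_t i = 0; i < batch.left.size(); ++i) {
    append(batch.left[i]);
    append(sep);
    append(batch.right[i]);
    out->offsets[row_offset + i + 1] = static_cast<int32_t>(pos);
  }
  return row_offset + batch.left.size();
}

// Preallocate once, then execute every batch into its slice of the output.
StringOutput JoinAll(const std::vector<Batch>& batches, const std::string& sep) {
  StringOutput out = PrepareOutput(batches, sep);
  std::size_t row = 0;
  for (const Batch& batch : batches) {
    row = ExecBatchInto(batch, sep, row, &out);
  }
  return out;
}
```

Because the sizing step is a separate function in this sketch, it can be tested on its own, and a caller could also pass it a single small batch to size a temporary. A real implementation would additionally have to handle null bitmaps and offset overflow, which are omitted here.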
Reporter: Ben Kietzman / @bkietz
Related issues:
Note: This issue was originally created as ARROW-13121. Please see the migration documentation for further details.