[Proposal]: Add batchBy() to group related records in batches #1900

@norberttech

Describe the Proposal

Introduce a batching strategy that keeps related records together.

The current batchSize() splits data by a fixed row count, which breaks parent-child relationships (e.g., the line items of one order end up split across two batches). This causes referential integrity issues when batch processing involves DELETE+INSERT operations.
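To make the problem concrete, here is a minimal sketch in plain PHP. It uses array_chunk() as a stand-in for fixed-count batching; it does not use Flow itself, and the sample rows and the `sku` field are made up for illustration.

```php
<?php
// Hypothetical sample data: order 1 has three line items, order 2 has two.
$rows = [
    ['order_id' => 1, 'sku' => 'A'],
    ['order_id' => 1, 'sku' => 'B'],
    ['order_id' => 1, 'sku' => 'C'],
    ['order_id' => 2, 'sku' => 'D'],
    ['order_id' => 2, 'sku' => 'E'],
];

// Fixed-count batching, used here as a stand-in for batchSize(2).
$batches = array_chunk($rows, 2);

// Record, for every order, the set of batch indexes its rows landed in.
$batchesPerOrder = [];
foreach ($batches as $index => $batch) {
    foreach ($batch as $row) {
        $batchesPerOrder[$row['order_id']][$index] = true;
    }
}

foreach ($batchesPerOrder as $orderId => $indexes) {
    printf("order %d spans %d batch(es)\n", $orderId, count($indexes));
}
// Prints:
// order 1 spans 2 batch(es)
// order 2 spans 2 batch(es)
```

Both orders straddle a batch boundary, so a per-batch DELETE+INSERT keyed on order_id would see only part of each order at a time.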

API Adjustments

Add a batchBy(string $column, int $maxSize = PHP_INT_MAX) method:

data_frame()
    ->read($orders_with_line_items)
    ->batchBy('order_id', maxSize: 1000)
    ->write($destination)
    ->run();
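The proposal does not spell out the grouping semantics, so the following is only a sketch of one possible interpretation, written as a standalone generator rather than as Flow code. The function name `batch_by` is hypothetical. It assumes that rows sharing the same column value arrive contiguously (for example, input sorted by `order_id`), and that a single group larger than `$maxSize` is emitted whole as its own oversized batch rather than split.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not Flow's implementation.
 * Yields batches in which rows sharing the same $column value stay together.
 * Assumption: rows with equal $column values are contiguous in $rows.
 * Assumption: a group larger than $maxSize is emitted whole, never split.
 *
 * @param iterable<array<string, mixed>> $rows
 */
function batch_by(iterable $rows, string $column, int $maxSize = PHP_INT_MAX): \Generator
{
    $batch = [];        // rows of completed groups waiting to be emitted
    $group = [];        // rows of the group currently being read
    $hasKey = false;
    $currentKey = null;

    foreach ($rows as $row) {
        if ($hasKey && $row[$column] !== $currentKey) {
            // The previous group is complete. Emit the pending batch first
            // if appending the group would push it past $maxSize.
            if ($batch !== [] && count($batch) + count($group) > $maxSize) {
                yield $batch;
                $batch = [];
            }
            $batch = array_merge($batch, $group);
            $group = [];
        }
        $currentKey = $row[$column];
        $hasKey = true;
        $group[] = $row;
    }

    // Handle the last group, which no key change ever closed.
    if ($group !== []) {
        if ($batch !== [] && count($batch) + count($group) > $maxSize) {
            yield $batch;
            $batch = [];
        }
        $batch = array_merge($batch, $group);
    }
    if ($batch !== []) {
        yield $batch;
    }
}

// Usage with made-up data: orders of 3, 2 and 1 rows, at most 4 rows per batch.
$rows = [];
foreach ([1 => 3, 2 => 2, 3 => 1] as $orderId => $lineItems) {
    for ($i = 0; $i < $lineItems; $i++) {
        $rows[] = ['order_id' => $orderId, 'line' => $i];
    }
}

$sizes = [];
foreach (batch_by($rows, 'order_id', 4) as $batch) {
    $sizes[] = count($batch);
}
// $sizes is [3, 3]: order 1 alone, then orders 2 and 3 together.
```

Under these assumptions `$maxSize` acts as a soft upper bound: batches are filled with whole groups up to the limit, and only a group that is itself larger than the limit can exceed it. Whether an oversized group should instead raise an error or be split is a design decision the proposal leaves open.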

Are you intending to also work on proposed change?

Yes

Are you interested in sponsoring this change?

None

Integration & Dependencies

No response
