@norberttech (Member) commented Oct 31, 2023

Change Log

Added

  • Allow writing rows in batches to files and streams

Fixed

Changed

Removed

Deprecated

Security


Description

Closes: #689

@github-actions bot (Contributor) commented Oct 31, 2023

Flow PHP - Benchmarks

Benchmark results from this PR are compared with the results from the 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+------------------+-------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode              | rstdev          |
+-----------------------+-------------------+------+-----+------------------+-------------------+-----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 44.121mb +0.00%  | 482.268ms -9.40%  | ±1.39% -8.24%   |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 14.005mb +0.00%  | 390.136ms -10.41% | ±2.59% +31.83%  |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 18.667mb +0.00%  | 770.451ms -6.10%  | ±0.67% -78.38%  |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 237.768mb +0.00% | 1.106s -10.56%    | ±1.14% +513.28% |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 7.275mb +0.01%   | 20.698ms -13.70%  | ±3.34% +6.69%   |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 7.622mb +0.01%   | 717.945ms -6.64%  | ±3.27% +90.83%  |
+-----------------------+-------------------+------+-----+------------------+-------------------+-----------------+
Transformers
+-----------------------------+--------------------------+------+-----+-----------------+------------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak        | mode             | rstdev         |
+-----------------------------+--------------------------+------+-----+-----------------+------------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 87.061mb +0.00% | 77.919ms -12.38% | ±2.47% -29.39% |
+-----------------------------+--------------------------+------+-----+-----------------+------------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+-------------------+----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode              | rstdev         |
+--------------------+----------------+------+-----+------------------+-------------------+----------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 93.222mb +0.00%  | 762.701ms -15.24% | ±1.23% +79.60% |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 47.138mb +0.00%  | 77.768ms -20.80%  | ±0.57% -64.47% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 88.572mb +0.00%  | 80.780ms -9.70%   | ±0.66% -36.68% |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 281.204mb +0.00% | 1.106s -7.81%     | ±0.78% -26.56% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 16.560mb +0.00%  | 40.640ms -12.45%  | ±1.63% +81.90% |
+--------------------+----------------+------+-----+------------------+-------------------+----------------+
Building Blocks
+-------------------------+----------------------------+------+-----+-----------------+-------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak        | mode              | rstdev          |
+-------------------------+----------------------------+------+-----+-----------------+-------------------+-----------------+
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 60.685mb +0.00% | 4.842ms +0.22%    | ±0.71% -14.38%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 80.475mb +0.00% | 203.364ms -8.47%  | ±0.95% -64.72%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 59.001mb +0.00% | 21.080ms -5.95%   | ±2.90% +152.55% |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 59.824mb +0.00% | 3.283ms -6.59%    | ±0.67% +50.04%  |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 59.824mb +0.00% | 3.302ms -8.08%    | ±2.73% -11.40%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 59.037mb +0.00% | 4.695ms +4.31%    | ±0.89% -22.13%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 59.566mb +0.00% | 25.413ms -1.98%   | ±1.41% +129.17% |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 59.565mb +0.00% | 25.930ms -6.43%   | ±1.86% -22.29%  |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 57.637mb +0.00% | 2.400μs -4.01%    | ±3.40% +0.00%   |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 57.637mb +0.00% | 0.500μs +25.00%   | ±0.00% -100.00% |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 65.870mb +0.00% | 15.131ms -12.06%  | ±1.48% -33.22%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 91.390mb +0.00% | 70.853ms -10.04%  | ±1.06% +12.44%  |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 60.087mb +0.00% | 3.500ms -9.14%    | ±2.05% +309.29% |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 62.355mb +0.00% | 51.985ms -1.32%   | ±2.86% +35.92%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 62.187mb +0.00% | 9.036ms -4.50%    | ±2.40% -1.07%   |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 57.637mb +0.00% | 56.996ms -5.80%   | ±0.22% -73.75%  |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 57.637mb +0.00% | 56.843ms -6.28%   | ±1.53% +30.34%  |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 57.637mb +0.00% | 58.660ms -4.59%   | ±1.32% -58.52%  |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 59.911mb +0.00% | 10.721ms -5.19%   | ±1.78% +17.28%  |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 57.636mb +0.00% | 43.145ms -4.15%   | ±2.47% +53.43%  |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 57.637mb +0.00% | 25.059μs -21.28%  | ±3.68% +25.22%  |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 57.637mb +0.00% | 30.557μs -18.76%  | ±3.13% +77.22%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 80.476mb +0.00% | 209.884ms -3.94%  | ±0.86% -43.10%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 91.745mb +0.02% | 168.969ms -15.18% | ±2.01% +253.19% |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 47.625mb +0.00% | 81.143ms -20.24%  | ±0.95% -56.90%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 12.412mb +0.05% | 20.831ms -10.22%  | ±1.18% +16.02%  |
+-------------------------+----------------------------+------+-----+-----------------+-------------------+-----------------+

@norberttech merged commit a8b51b3 into flow-php:1.x on Oct 31, 2023

Successfully merging this pull request may close these issues.

Parquet - writing in batches
