Add support for LZ4 compression #1107

Merged
norberttech merged 1 commit into flow-php:1.x from flavioheleno:feat/lz4 on Jul 4, 2024

@flavioheleno flavioheleno commented Jul 3, 2024

Change Log

Added

  • Added support for the LZ4 compression algorithm to Parquet

Fixed

Changed

Removed

Deprecated

Security


Description

Add support for LZ4 compression.

Closes #783.

github-actions bot commented Jul 3, 2024

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from the 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+------------------+------------------+------------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev           |
+-----------------------+-------------------+------+-----+------------------+------------------+------------------+
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 3.912mb +0.04%   | 511.742ms +0.71% | ±3.08% +1129.74% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 3.944mb +0.04%   | 1.063s -1.11%    | ±0.95% -64.43%   |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 135.378mb +0.00% | 735.216ms -0.49% | ±0.80% +156.88%  |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 3.671mb +0.04%   | 33.514ms -0.94%  | ±0.36% -78.04%   |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 3.618mb +0.04%   | 433.853ms +0.74% | ±0.39% -75.53%   |
+-----------------------+-------------------+------+-----+------------------+------------------+------------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 115.964mb +0.00% | 60.352ms +0.79% | ±2.47% +81.71% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode            | rstdev          |
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 54.067mb +0.00%  | 84.376ms -0.67% | ±0.58% -38.68%  |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 106.499mb +0.00% | 52.001ms -2.75% | ±0.63% -29.58%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 225.835mb +0.00% | 1.396s +0.61%   | ±0.44% +150.69% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 16.860mb +0.01%  | 43.636ms -1.58% | ±0.47% -9.09%   |
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
Building Blocks
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 116.514mb +0.00% | 495.351ms +1.69% | ±3.29% +25.51%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 59.992mb +0.00%  | 247.766ms -1.16% | ±3.26% +80.56%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.926mb +0.01%  | 52.707ms -1.63%  | ±1.18% +164.02% |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.693mb +0.00%  | 433.731ms +1.07% | ±3.59% +573.34% |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.232mb +0.01%  | 94.167ms +2.86%  | ±1.93% -46.04%  |
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 86.784mb +0.00%  | 3.387ms +5.13%   | ±1.03% +53.99%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 102.382mb +0.00% | 187.891ms -1.95% | ±1.01% -40.27%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 85.102mb +0.00%  | 18.625ms -1.38%  | ±0.80% +89.01%  |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 88.024mb +0.00%  | 1.705ms +0.85%   | ±0.44% -79.50%  |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 88.024mb +0.00%  | 1.719ms +0.40%   | ±1.64% -18.41%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 85.136mb +0.00%  | 2.555ms -0.69%   | ±1.13% -32.38%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 85.665mb +0.00%  | 16.999ms +14.24% | ±0.75% -46.85%  |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 85.665mb +0.00%  | 16.778ms +13.64% | ±1.61% +130.23% |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 83.569mb +0.00%  | 1.600μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 83.569mb +0.00%  | 0.300μs -25.00%  | ±0.00% -100.00% |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 92.919mb +0.00%  | 12.041ms -0.71%  | ±0.73% -49.62%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 122.290mb +0.00% | 61.088ms -0.85%  | ±2.70% +87.87%  |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 86.185mb +0.00%  | 1.198ms -3.98%   | ±0.47% -82.33%  |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 89.531mb +0.00%  | 62.147ms -0.07%  | ±1.92% -13.84%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 88.286mb +0.00%  | 3.880ms -1.37%   | ±0.61% -45.09%  |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 83.712mb +0.00%  | 39.381ms +1.88%  | ±0.97% +250.92% |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 83.713mb +0.00%  | 39.367ms +0.63%  | ±0.71% +85.77%  |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 83.712mb +0.00%  | 39.072ms +0.18%  | ±1.02% +142.87% |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 86.010mb +0.00%  | 7.244ms -1.23%   | ±0.23% -72.69%  |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 83.569mb +0.00%  | 29.382ms +2.84%  | ±1.32% +80.92%  |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 83.569mb +0.00%  | 13.420μs 0.00%   | ±1.27% 0.00%    |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 83.569mb +0.00%  | 15.888μs +1.88%  | ±0.60% +96.63%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 102.383mb +0.00% | 192.733ms +0.36% | ±1.09% +189.19% |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+

@norberttech norberttech merged commit 9436986 into flow-php:1.x Jul 4, 2024
@flavioheleno flavioheleno deleted the feat/lz4 branch July 4, 2024 11:55