norberttech commented May 23, 2025

Change Log

Added

Fixed

  • parquet normalizer for nullable entries

Changed

Removed

Deprecated

Security


Description

Reference: #1628

norberttech added this to the 0.17.0 milestone May 23, 2025
norberttech moved this from Todo to In Progress in Roadmap May 23, 2025
norberttech self-assigned this May 23, 2025

github-actions bot commented May 23, 2025

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from the 1.x branch.

Extractors
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
| benchmark             | subject                | revs | its | mem_peak        | mode             | rstdev          |
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
| CSVExtractorBench     | bench_extract_10k      | 1    | 3   | 4.775mb +0.01%  | 419.667ms +0.95% | ±1.34% +162.67% |
| ExcelExtractorBench   | bench_extract_10k_ods  | 1    | 3   | 65.487mb +0.00% | 1.038s -1.18%    | ±0.17% -64.00%  |
| ExcelExtractorBench   | bench_extract_10k_xlsx | 1    | 3   | 67.533mb +0.00% | 1.669s -3.00%    | ±0.64% -55.51%  |
| JsonExtractorBench    | bench_extract_10k      | 1    | 3   | 5.019mb +0.01%  | 1.289s +1.73%    | ±0.81% +16.83%  |
| ParquetExtractorBench | bench_extract_10k      | 1    | 3   | 86.321mb +0.00% | 936.559ms +2.30% | ±3.00% +214.84% |
| TextExtractorBench    | bench_extract_10k      | 1    | 3   | 4.499mb +0.01%  | 38.860ms +1.83%  | ±1.28% -11.62%  |
| XmlExtractorBench     | bench_extract_10k      | 1    | 3   | 4.494mb +0.01%  | 595.528ms -1.59% | ±0.35% +46.45%  |
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
Transformers
+---------------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                       | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+---------------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench     | bench_transform_10k_rows | 1    | 3   | 123.236mb +0.00% | 65.788ms +0.36% | ±0.40% -27.34% |
| RenameEachEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 18.498mb +0.00%  | 72.721ms +1.35% | ±0.04% -86.68% |
+---------------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode            | rstdev          |
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 62.435mb +0.00%  | 86.489ms +1.32% | ±2.38% +193.32% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 79.706mb +0.00%  | 94.873ms -0.37% | ±2.25% +120.93% |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 165.388mb +0.50% | 20.668s +0.73%  | ±0.50% +132.24% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.805mb +0.00%  | 31.237ms +1.40% | ±0.99% +18.08%  |
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
Building Blocks
+-------------------+----------------------------+------+-----+------------------+------------------+------------------+
| benchmark         | subject                    | revs | its | mem_peak         | mode             | rstdev           |
+-------------------+----------------------------+------+-----+------------------+------------------+------------------+
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 101.784mb +0.00% | 643.481ms -0.17% | ±1.75% +6.58%    |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 53.135mb +0.00%  | 323.427ms -1.85% | ±1.29% +15.18%   |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.385mb +0.00%  | 69.302ms +0.03%  | ±1.08% -46.68%   |
| RowsBench         | bench_chunk_10_on_10k      | 2    | 3   | 93.389mb +0.00%  | 3.516ms +6.90%   | ±1.62% -16.30%   |
| RowsBench         | bench_diff_left_1k_on_10k  | 2    | 3   | 110.759mb +0.00% | 237.788ms +0.69% | ±0.49% -3.78%    |
| RowsBench         | bench_diff_right_1k_on_10k | 2    | 3   | 93.479mb +0.00%  | 23.856ms -0.99%  | ±0.77% -52.80%   |
| RowsBench         | bench_drop_1k_on_10k       | 2    | 3   | 94.264mb +0.00%  | 1.452ms -13.14%  | ±3.83% +13.91%   |
| RowsBench         | bench_drop_right_1k_on_10k | 2    | 3   | 94.264mb +0.00%  | 1.468ms +9.66%   | ±1.22% +60.23%   |
| RowsBench         | bench_entries_on_10k       | 2    | 3   | 92.425mb +0.00%  | 3.319ms -2.36%   | ±0.98% -19.43%   |
| RowsBench         | bench_filter_on_10k        | 2    | 3   | 92.954mb +0.00%  | 15.640ms +0.12%  | ±0.79% -64.45%   |
| RowsBench         | bench_find_on_10k          | 2    | 3   | 92.954mb +0.00%  | 15.641ms -0.13%  | ±0.30% -51.38%   |
| RowsBench         | bench_find_one_on_10k      | 10   | 3   | 91.643mb +0.00%  | 1.906μs -4.41%   | ±2.44% +1.72%    |
| RowsBench         | bench_first_on_10k         | 10   | 3   | 91.643mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%     |
| RowsBench         | bench_flat_map_on_1k       | 2    | 3   | 100.703mb +0.00% | 14.504ms +0.98%  | ±1.85% +2.48%    |
| RowsBench         | bench_map_on_10k           | 2    | 3   | 130.130mb +0.00% | 66.371ms -0.16%  | ±0.60% -64.24%   |
| RowsBench         | bench_merge_1k_on_10k      | 2    | 3   | 93.473mb +0.00%  | 1.298ms +2.17%   | ±3.38% -0.51%    |
| RowsBench         | bench_partition_by_on_10k  | 2    | 3   | 96.841mb +0.00%  | 63.068ms +2.62%  | ±0.28% -58.74%   |
| RowsBench         | bench_remove_on_10k        | 2    | 3   | 94.526mb +0.00%  | 3.809ms +13.40%  | ±3.70% +1495.80% |
| RowsBench         | bench_sort_asc_on_1k       | 2    | 3   | 92.004mb +0.00%  | 40.965ms +1.01%  | ±0.98% -24.94%   |
| RowsBench         | bench_sort_by_on_1k        | 2    | 3   | 92.004mb +0.00%  | 40.057ms +0.20%  | ±1.01% +158.24%  |
| RowsBench         | bench_sort_desc_on_1k      | 2    | 3   | 92.004mb +0.00%  | 40.264ms +0.82%  | ±1.03% +112.55%  |
| RowsBench         | bench_sort_entries_on_1k   | 2    | 3   | 94.085mb +0.00%  | 8.280ms +1.96%   | ±0.31% -41.58%   |
| RowsBench         | bench_sort_on_1k           | 2    | 3   | 91.835mb +0.00%  | 30.051ms +1.92%  | ±0.56% +34.58%   |
| RowsBench         | bench_take_1k_on_10k       | 10   | 3   | 91.643mb +0.00%  | 14.812μs +8.11%  | ±1.79% +200.27%  |
| RowsBench         | bench_take_right_1k_on_10k | 10   | 3   | 91.643mb +0.00%  | 17.125μs +4.43%  | ±2.85% +14.50%   |
| RowsBench         | bench_unique_on_1k         | 2    | 3   | 110.759mb +0.00% | 241.820ms +1.13% | ±0.46% +4.07%    |
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 42.070mb +0.00%  | 425.492ms +0.98% | ±1.57% +393.64%  |
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 11.448mb +0.00%  | 85.269ms +1.00%  | ±0.77% +37.01%   |
+-------------------+----------------------------+------+-----+------------------+------------------+------------------+


codecov bot commented May 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.11%. Comparing base (1db7609) to head (e8f289f).
Report is 2 commits behind head on 1.x.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##              1.x    #1677   +/-   ##
=======================================
  Coverage   82.11%   82.11%           
=======================================
  Files         703      703           
  Lines       19062    19065    +3     
=======================================
+ Hits        15652    15655    +3     
  Misses       3410     3410           
+-----------------------------+------------------------------+
| component                   | coverage Δ                   |
+-----------------------------+------------------------------+
| etl                         | 88.35% <ø> (+0.01%) ⬆️       |
| cli                         | 84.42% <ø> (ø)               |
| lib-array-dot               | 94.53% <ø> (ø)               |
| lib-azure-sdk               | 62.56% <ø> (ø)               |
| lib-doctrine-dbal-bulk      | 90.11% <ø> (ø)               |
| lib-filesystem              | 78.02% <ø> (ø)               |
| lib-parquet                 | 84.37% <ø> (ø)               |
| lib-parquet-viewer          | 82.02% <ø> (ø)               |
| lib-snappy                  | 90.69% <ø> (-0.47%) ⬇️       |
| bridge-filesystem-async-aws | 90.38% <ø> (ø)               |
| bridge-filesystem-azure     | 89.92% <ø> (ø)               |
| bridge-monolog-http         | 96.38% <ø> (ø)               |
| symfony-http-foundation     | 74.41% <ø> (ø)               |
| adapter-chartjs             | 86.45% <ø> (ø)               |
| adapter-csv                 | 90.18% <ø> (ø)               |
| adapter-doctrine            | 89.69% <ø> (ø)               |
| adapter-elasticsearch       | 97.19% <ø> (ø)               |
| adapter-google-sheet        | 83.87% <ø> (ø)               |
| adapter-http                | 59.15% <ø> (ø)               |
| adapter-json                | 90.62% <ø> (ø)               |
| adapter-logger              | 53.84% <ø> (ø)               |
| adapter-meilisearch         | 97.75% <ø> (ø)               |
| adapter-parquet             | 78.64% <100.00%> (+0.21%) ⬆️ |
| adapter-text                | 84.44% <ø> (ø)               |
| adapter-xml                 | 83.15% <ø> (ø)               |
+-----------------------------+------------------------------+

norberttech force-pushed the bug/parquet-entry-normalization branch from 8923a53 to e8f289f May 23, 2025 11:48
norberttech merged commit 8df3e3c into 1.x May 23, 2025
23 checks passed
norberttech deleted the bug/parquet-entry-normalization branch May 23, 2025 11:57
github-project-automation bot moved this from In Progress to Done in Roadmap May 23, 2025