@broncha
Contributor

@broncha broncha commented May 27, 2025

Resolves: #1669

Change Log


Added

  • Add withBomRemoval() method to enable/disable BOM removal (enabled by default)
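
For readers unfamiliar with the problem, here is a minimal, language-neutral sketch of what BOM removal means for a CSV reader. It is written in Python purely for illustration and is not the Flow PHP implementation: the `strip_bom` and `read_header` helpers and the `remove_bom` flag are hypothetical names that only mirror the enabled-by-default toggle described above.

```python
import codecs
import csv
import io

# UTF-8 byte order mark: the three bytes EF BB BF at the very start of a file.
UTF8_BOM = codecs.BOM_UTF8


def strip_bom(data: bytes, remove_bom: bool = True) -> bytes:
    """Drop a leading UTF-8 BOM when remove_bom is on (hypothetical flag)."""
    if remove_bom and data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def read_header(data: bytes, remove_bom: bool = True) -> list:
    """Return the first CSV row, optionally stripping the BOM beforehand."""
    text = strip_bom(data, remove_bom).decode("utf-8")
    return next(csv.reader(io.StringIO(text)))


# A CSV payload as some spreadsheet tools export it, with a leading BOM.
raw = codecs.BOM_UTF8 + b"id,name\n1,Alice\n"

clean_header = read_header(raw)                   # BOM stripped (default)
dirty_header = read_header(raw, remove_bom=False) # BOM leaks into first column
```

With removal disabled, the first header name carries an invisible U+FEFF prefix, so a lookup by the plain column name `id` would miss it. That is presumably why removal is the default here.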

@broncha broncha requested a review from norberttech as a code owner May 27, 2025 10:06
@norberttech
Member

I see that cs fixer is complaining. You can find an explanation of how to prepare a local development environment here: https://flow-php.com/documentation/contributing/environment/ (it also explains which commands to run locally before committing).

@github-actions
Contributor

github-actions bot commented May 27, 2025

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
| benchmark             | subject                | revs | its | mem_peak        | mode             | rstdev          |
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
| CSVExtractorBench     | bench_extract_10k      | 1    | 3   | 4.788mb +0.02%  | 415.593ms -0.66% | ±0.49% -73.43%  |
| ExcelExtractorBench   | bench_extract_10k_ods  | 1    | 3   | 65.499mb +0.00% | 1.057s +1.00%    | ±0.19% -73.34%  |
| ExcelExtractorBench   | bench_extract_10k_xlsx | 1    | 3   | 67.545mb +0.00% | 1.679s -1.94%    | ±1.58% -1.32%   |
| JsonExtractorBench    | bench_extract_10k      | 1    | 3   | 5.032mb +0.00%  | 1.294s +1.50%    | ±1.62% +180.93% |
| ParquetExtractorBench | bench_extract_10k      | 1    | 3   | 86.334mb +0.00% | 924.641ms -0.29% | ±0.62% +93.62%  |
| TextExtractorBench    | bench_extract_10k      | 1    | 3   | 4.512mb +0.01%  | 38.804ms -0.38%  | ±1.18% -34.16%  |
| XmlExtractorBench     | bench_extract_10k      | 1    | 3   | 4.507mb +0.01%  | 602.836ms +0.32% | ±0.86% +19.86%  |
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
Transformers
+---------------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                       | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+---------------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench     | bench_transform_10k_rows | 1    | 3   | 123.248mb +0.00% | 66.405ms +0.64% | ±1.02% -44.93% |
| RenameEachEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 18.510mb +0.00%  | 73.342ms -0.26% | ±0.92% +51.62% |
+---------------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+-----------------+----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode            | rstdev         |
+--------------------+----------------+------+-----+------------------+-----------------+----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 62.448mb +0.00%  | 84.697ms -3.14% | ±0.43% -76.25% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 79.719mb +0.00%  | 96.860ms -1.07% | ±1.26% -18.06% |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 166.224mb +0.00% | 21.117s +0.67%  | ±0.94% +73.70% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.818mb +0.00%  | 30.900ms -1.72% | ±0.78% +16.21% |
+--------------------+----------------+------+-----+------------------+-----------------+----------------+
Building Blocks
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark         | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 101.797mb +0.00% | 650.005ms -0.96% | ±0.67% -77.75%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 53.148mb +0.00%  | 333.406ms +2.33% | ±0.85% +44.33%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.398mb +0.00%  | 69.937ms -1.03%  | ±1.11% +23.54%  |
| RowsBench         | bench_chunk_10_on_10k      | 2    | 3   | 93.402mb +0.00%  | 4.176ms +3.54%   | ±2.23% +34.56%  |
| RowsBench         | bench_diff_left_1k_on_10k  | 2    | 3   | 110.771mb +0.00% | 241.615ms +2.25% | ±0.75% +52.28%  |
| RowsBench         | bench_diff_right_1k_on_10k | 2    | 3   | 93.491mb +0.00%  | 24.743ms +2.33%  | ±0.81% -1.85%   |
| RowsBench         | bench_drop_1k_on_10k       | 2    | 3   | 94.276mb +0.00%  | 1.928ms +2.10%   | ±2.71% +186.70% |
| RowsBench         | bench_drop_right_1k_on_10k | 2    | 3   | 94.276mb +0.00%  | 1.851ms +3.86%   | ±2.29% -34.12%  |
| RowsBench         | bench_entries_on_10k       | 2    | 3   | 92.437mb +0.00%  | 3.730ms +3.13%   | ±0.87% +10.54%  |
| RowsBench         | bench_filter_on_10k        | 2    | 3   | 92.966mb +0.00%  | 15.971ms -9.21%  | ±1.23% +26.69%  |
| RowsBench         | bench_find_on_10k          | 2    | 3   | 92.966mb +0.00%  | 16.070ms -7.67%  | ±1.84% +212.06% |
| RowsBench         | bench_find_one_on_10k      | 10   | 3   | 91.655mb +0.00%  | 2.000μs 0.00%    | ±0.00% 0.00%    |
| RowsBench         | bench_first_on_10k         | 10   | 3   | 91.655mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench         | bench_flat_map_on_1k       | 2    | 3   | 100.715mb +0.00% | 15.604ms +3.36%  | ±2.72% +171.14% |
| RowsBench         | bench_map_on_10k           | 2    | 3   | 130.143mb +0.00% | 67.764ms -3.38%  | ±0.49% -62.54%  |
| RowsBench         | bench_merge_1k_on_10k      | 2    | 3   | 93.486mb +0.00%  | 1.630ms -8.13%   | ±3.46% +74.01%  |
| RowsBench         | bench_partition_by_on_10k  | 2    | 3   | 96.854mb +0.00%  | 63.157ms -1.95%  | ±0.50% +53.51%  |
| RowsBench         | bench_remove_on_10k        | 2    | 3   | 94.539mb +0.00%  | 3.961ms +1.26%   | ±2.07% +197.32% |
| RowsBench         | bench_sort_asc_on_1k       | 2    | 3   | 92.016mb +0.00%  | 40.702ms +2.36%  | ±1.41% -9.65%   |
| RowsBench         | bench_sort_by_on_1k        | 2    | 3   | 92.016mb +0.00%  | 40.883ms -1.04%  | ±0.65% -54.38%  |
| RowsBench         | bench_sort_desc_on_1k      | 2    | 3   | 92.016mb +0.00%  | 40.372ms -2.55%  | ±0.81% -77.88%  |
| RowsBench         | bench_sort_entries_on_1k   | 2    | 3   | 94.098mb +0.00%  | 8.378ms -3.48%   | ±1.71% -34.66%  |
| RowsBench         | bench_sort_on_1k           | 2    | 3   | 91.848mb +0.00%  | 30.006ms -1.05%  | ±0.30% -51.96%  |
| RowsBench         | bench_take_1k_on_10k       | 10   | 3   | 91.655mb +0.00%  | 14.594μs -0.76%  | ±0.32% +1.14%   |
| RowsBench         | bench_take_right_1k_on_10k | 10   | 3   | 91.655mb +0.00%  | 16.944μs +1.17%  | ±2.18% -1.79%   |
| RowsBench         | bench_unique_on_1k         | 2    | 3   | 110.772mb +0.00% | 243.892ms +1.86% | ±1.86% +104.92% |
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 42.083mb +0.00%  | 421.222ms -1.32% | ±0.68% +70.69%  |
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 11.461mb +0.00%  | 85.378ms -0.39%  | ±1.37% +355.09% |
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+

@codecov

codecov bot commented May 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.93%. Comparing base (02ff323) to head (d63b026).
Report is 2 commits behind head on 1.x.

Additional details and impacted files
@@            Coverage Diff             @@
##              1.x    #1685      +/-   ##
==========================================
- Coverage   81.93%   81.93%   -0.01%     
==========================================
  Files         705      705              
  Lines       19204    19209       +5     
==========================================
+ Hits        15734    15738       +4     
- Misses       3470     3471       +1     
Components Coverage Δ
etl 88.35% <ø> (ø)
cli 84.42% <ø> (ø)
lib-array-dot 94.53% <ø> (ø)
lib-azure-sdk 62.56% <ø> (ø)
lib-doctrine-dbal-bulk 93.49% <ø> (ø)
lib-filesystem 78.02% <ø> (ø)
lib-types 55.76% <ø> (ø)
lib-parquet 84.37% <ø> (ø)
lib-parquet-viewer 82.02% <ø> (ø)
lib-snappy 90.69% <ø> (-0.47%) ⬇️
bridge-filesystem-async-aws 90.38% <ø> (ø)
bridge-filesystem-azure 89.92% <ø> (ø)
bridge-monolog-http 96.38% <ø> (ø)
symfony-http-foundation 74.41% <ø> (ø)
adapter-chartjs 86.45% <ø> (ø)
adapter-csv 90.37% <100.00%> (+0.18%) ⬆️
adapter-doctrine 89.95% <ø> (ø)
adapter-elasticsearch 97.19% <ø> (ø)
adapter-google-sheet 83.87% <ø> (ø)
adapter-http 59.15% <ø> (ø)
adapter-json 90.62% <ø> (ø)
adapter-logger 53.84% <ø> (ø)
adapter-meilisearch 97.75% <ø> (ø)
adapter-parquet 78.64% <ø> (ø)
adapter-text 84.44% <ø> (ø)
adapter-xml 83.15% <ø> (ø)

@norberttech norberttech enabled auto-merge (squash) May 29, 2025 14:59
Member

@norberttech norberttech left a comment

Thank you!!! 🙏

@norberttech norberttech merged commit 298c656 into flow-php:1.x May 29, 2025
20 checks passed

Development

Successfully merging this pull request may close these issues.

CSV - BOM detection

2 participants