
Added RemoteFileListExtractor #883

Merged
norberttech merged 1 commit into flow-php:1.x from norberttech:feature/remote-file-list-extractor
Dec 20, 2023

@norberttech

Change Log

Added

  • RemoteFileListExtractor

Fixed

Changed

Removed

Deprecated

Security


Description

Closes: #881

Due to technical limitations, remote_files() can't return as many file/directory properties as local_files(), but it still exposes quite a few.

@github-actions

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev          |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 35.140mb +0.03%  | 718.031ms -0.18% | ±3.04% +149.84% |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 4.839mb -0.77%   | 304.645ms -0.24% | ±0.30% -51.49%  |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 4.937mb -0.76%   | 932.955ms -0.09% | ±2.22% +139.92% |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 239.716mb -0.04% | 1.115s +0.90%    | ±0.75% -64.14%  |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.714mb -0.80%   | 27.511ms +0.36%  | ±1.36% +46.24%  |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.715mb -0.80%   | 412.544ms +0.65% | ±0.24% -42.44%  |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 110.410mb -0.00% | 63.506ms +1.88% | ±0.55% -76.49% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev          |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 94.808mb -0.04%  | 462.489ms -0.96% | ±2.69% +104.43% |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 54.887mb -0.07%  | 70.227ms -0.37%  | ±0.45% -30.46%  |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 105.469mb -0.04% | 55.593ms -2.22%  | ±0.63% +17.22%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 320.678mb -0.01% | 1.266s -0.44%    | ±2.89% +173.05% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.763mb -0.21%  | 39.641ms -1.67%  | ±1.69% +80.24%  |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
Building Blocks
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 116.046mb +0.00% | 383.256ms -1.37% | ±2.09% +5.02%   |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 59.764mb +0.01%  | 195.587ms +3.17% | ±1.79% +174.94% |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.839mb +0.02%  | 40.356ms -0.52%  | ±0.78% -74.57%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.395mb +0.01%  | 329.070ms -0.49% | ±2.24% +745.48% |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.319mb +0.02%  | 65.447ms +0.35%  | ±2.03% +183.63% |
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 76.461mb +0.00%  | 4.016ms +4.21%   | ±2.41% +15.00%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 96.253mb +0.00%  | 177.923ms -0.69% | ±0.39% -48.32%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 74.779mb +0.00%  | 17.776ms -1.80%  | ±0.59% -5.55%   |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 77.701mb +0.00%  | 1.896ms +14.70%  | ±0.41% -24.37%  |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 77.701mb +0.00%  | 1.647ms -0.66%   | ±1.92% +838.02% |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 74.812mb +0.00%  | 2.459ms -0.46%   | ±0.80% -5.63%   |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 75.342mb +0.00%  | 14.118ms -2.89%  | ±1.30% -57.17%  |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 75.342mb +0.00%  | 14.138ms -7.37%  | ±1.52% -45.60%  |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 73.244mb +0.00%  | 1.706μs +7.03%   | ±2.72% -9.62%   |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 73.244mb +0.00%  | 0.300μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 86.867mb +0.00%  | 12.569ms +0.07%  | ±2.32% +352.83% |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 116.161mb +0.00% | 63.590ms +1.96%  | ±2.35% +27.05%  |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 75.861mb +0.00%  | 1.374ms +11.85%  | ±2.88% +261.14% |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 78.135mb +0.00%  | 36.028ms +0.83%  | ±0.76% -55.67%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 77.963mb +0.00%  | 3.813ms -0.58%   | ±2.99% +865.97% |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 73.390mb +0.00%  | 39.486ms +1.61%  | ±1.14% +759.89% |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 73.390mb +0.00%  | 39.066ms -1.38%  | ±0.59% -52.76%  |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 73.390mb +0.00%  | 39.609ms +0.65%  | ±2.08% +129.40% |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 75.687mb +0.00%  | 7.411ms +1.21%   | ±0.63% +45.79%  |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 73.244mb +0.00%  | 28.944ms -0.52%  | ±1.09% +54.38%  |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 73.244mb +0.00%  | 13.488μs -2.96%  | ±1.93% +0.00%   |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 73.244mb +0.00%  | 16.470μs +0.32%  | ±1.44% +8.89%   |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 96.255mb +0.00%  | 182.424ms -0.03% | ±1.14% +111.91% |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+

@norberttech norberttech merged commit 98dd4f6 into flow-php:1.x Dec 20, 2023
