@norberttech
Change Log

Added

  • RDSL library

Fixed

Changed

Removed

Deprecated

Security


Description

RDSL is a simple library for defining a chain of execution of DSL functions.
It comes with access control lists (ACLs) that include or exclude specific functions from given namespaces, which determines what is part of the DSL.
ACLs also define which functions are available as entry points.
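
To make the ACL idea concrete, here is a minimal sketch of a namespace-based allow list with per-function exclusions. It is not RDSL's actual implementation: the `SketchAcl` class, its constructor arguments, and the `Acme\DSL` namespace are all hypothetical names chosen for illustration.

```php
<?php

// Hypothetical ACL: allows every function from the listed namespaces,
// except the fully qualified function names listed in $except.
final class SketchAcl
{
    /**
     * @param list<string> $namespaces namespaces whose functions are allowed
     * @param list<string> $except     fully qualified function names to exclude
     */
    public function __construct(private array $namespaces, private array $except = [])
    {
    }

    public function isAllowed(string $function) : bool
    {
        // Normalize a leading backslash so "\Acme\DSL\lit" equals "Acme\DSL\lit".
        $function = ltrim($function, '\\');

        if (in_array($function, $this->except, true)) {
            return false;
        }

        foreach ($this->namespaces as $namespace) {
            // Prefix match, so functions in nested namespaces are allowed too.
            if (str_starts_with($function, trim($namespace, '\\') . '\\')) {
                return true;
            }
        }

        return false;
    }
}

$acl = new SketchAcl(['Acme\\DSL'], ['Acme\\DSL\\from_array']);

var_dump($acl->isAllowed('Acme\\DSL\\lit'));        // bool(true)
var_dump($acl->isAllowed('Acme\\DSL\\from_array')); // bool(false)
var_dump($acl->isAllowed('exec'));                  // bool(false)
```

A second ACL instance of the same shape could be used for entry points, so that only a few functions are accepted at the top level of a definition.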

Usage:

<?php

$executables = $builder->parse(
    [
        [
            'function' => 'int',
            'args' => [0],
            'call' => [
                'method' => 'add',
                'args' => [
                    [
                        'function' => 'lit',
                        'args' => [5],
                    ],
                ],
            ],
        ],
    ]
);

$results = (new Executor())->execute($executables);
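
The definition above reads as "call int(0), then call add(lit(5)) on the result". As a rough illustration of how such an array could be evaluated, here is a self-contained sketch. It does not use RDSL's Builder or Executor, whose internals are not shown in this PR; `SketchInt`, `sketch_int`, `sketch_lit`, `sketch_evaluate`, and `sketch_arguments` are hypothetical stand-ins, and the allow list is a plain array rather than a real ACL.

```php
<?php

// Hypothetical value object standing in for whatever int()/lit() return.
final class SketchInt
{
    public function __construct(public readonly int $value)
    {
    }

    public function add(SketchInt $other) : SketchInt
    {
        return new SketchInt($this->value + $other->value);
    }
}

function sketch_int(int $value) : SketchInt
{
    return new SketchInt($value);
}

function sketch_lit(int $value) : SketchInt
{
    return new SketchInt($value);
}

/**
 * Resolves arguments: nested definitions are evaluated, scalars pass through.
 *
 * @param array<int, mixed>       $args
 * @param array<string, callable> $allowed
 */
function sketch_arguments(array $args, array $allowed) : array
{
    return array_map(
        static fn (mixed $arg) : mixed => is_array($arg) && isset($arg['function'])
            ? sketch_evaluate($arg, $allowed)
            : $arg,
        $args
    );
}

/**
 * Evaluates one definition: calls the allowed function, then the optional
 * 'call' method on its result.
 *
 * @param array<string, mixed>    $definition
 * @param array<string, callable> $allowed DSL name => callable (assumed allow list)
 */
function sketch_evaluate(array $definition, array $allowed) : mixed
{
    $name = $definition['function'];

    if (!isset($allowed[$name])) {
        throw new InvalidArgumentException("Function \"{$name}\" is not allowed");
    }

    $result = $allowed[$name](...sketch_arguments($definition['args'] ?? [], $allowed));

    if (isset($definition['call'])) {
        $call = $definition['call'];
        $result = $result->{$call['method']}(...sketch_arguments($call['args'] ?? [], $allowed));
    }

    return $result;
}

$allowed = ['int' => 'sketch_int', 'lit' => 'sketch_lit'];

$result = sketch_evaluate(
    [
        'function' => 'int',
        'args' => [0],
        'call' => [
            'method' => 'add',
            'args' => [['function' => 'lit', 'args' => [5]]],
        ],
    ],
    $allowed
);

echo $result->value, PHP_EOL; // 5
```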

The goal is to expose the data_frame/df functions as entry points from which other functions can be called.
ETL will parse pipelines from YAML/JSON/XML, turn them into associative arrays, and build a DataFrame from them.

All of this will go through a static factory, DataFrame::parse(array $definition) : DataFrame.
Format-specific parsers would then just use that static factory, for example Flow\ETL\Parser\JsonParser::parse(string $json)
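
The planned layering could look roughly like the sketch below: a format-specific parser only decodes its input into an associative array and delegates to the static factory. Neither class exists in this PR; `SketchDataFrame` and `SketchJsonParser` are hypothetical stand-ins for the proposed DataFrame::parse and JsonParser::parse, and the factory here only stores the definition instead of building a pipeline.

```php
<?php

// Hypothetical stand-in for the proposed DataFrame::parse() static factory.
final class SketchDataFrame
{
    /** @param array<int|string, mixed> $definition */
    private function __construct(public readonly array $definition)
    {
    }

    /** @param array<int|string, mixed> $definition */
    public static function parse(array $definition) : self
    {
        // A real factory would hand the definition to RDSL and execute it;
        // this sketch only keeps it so the delegation is visible.
        return new self($definition);
    }
}

// Hypothetical stand-in for the proposed JsonParser.
final class SketchJsonParser
{
    public static function parse(string $json) : SketchDataFrame
    {
        // Decode into associative arrays; invalid JSON raises JsonException.
        $definition = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

        if (!is_array($definition)) {
            throw new InvalidArgumentException('Pipeline definition must decode to an array');
        }

        return SketchDataFrame::parse($definition);
    }
}

$df = SketchJsonParser::parse('[{"function": "df", "args": []}]');

echo $df->definition[0]['function'], PHP_EOL; // df
```

With this split, YAML and XML parsers would differ only in the decoding step, while validation and execution stay in one place behind the factory.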

github-actions bot commented Dec 7, 2023

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev          |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 35.123mb +0.02%  | 710.695ms +0.26% | ±0.99% -52.34%  |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 4.822mb +0.16%   | 302.266ms +0.74% | ±0.77% +263.28% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 4.920mb +0.15%   | 927.157ms -0.52% | ±0.48% +192.82% |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 239.626mb +0.00% | 1.129s +1.30%    | ±1.23% +14.49%  |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.699mb +0.16%   | 24.866ms +0.27%  | ±0.76% -6.12%   |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.701mb +0.16%   | 420.463ms +4.14% | ±1.82% +264.07% |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 110.386mb +0.01% | 65.443ms +4.32% | ±0.22% -82.83% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+------------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev           |
+--------------------+----------------+------+-----+------------------+------------------+------------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 94.784mb +0.01%  | 456.957ms +1.84% | ±1.54% +33.98%   |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 54.862mb +0.01%  | 71.562ms +0.10%  | ±3.13% +6263.68% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 105.445mb +0.01% | 55.634ms +3.82%  | ±0.94% +242.23%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 320.654mb +0.00% | 1.256s -1.29%    | ±0.04% -96.85%   |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.739mb +0.04%  | 41.823ms +3.53%  | ±0.81% +101.61%  |
+--------------------+----------------+------+-----+------------------+------------------+------------------+
Building Blocks
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 76.437mb +0.01%  | 2.891ms +1.76%   | ±3.23% +14.94%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 96.230mb +0.01%  | 179.555ms -3.64% | ±0.68% -23.98%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 74.755mb +0.01%  | 18.281ms -4.53%  | ±0.50% -4.91%   |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 77.677mb +0.01%  | 1.988ms +18.40%  | ±1.03% +139.75% |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 77.677mb +0.01%  | 1.933ms +15.03%  | ±2.64% +33.94%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 74.789mb +0.01%  | 3.107ms +25.67%  | ±1.39% +32.34%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 75.318mb +0.01%  | 16.701ms +15.72% | ±2.52% +361.53% |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 75.318mb +0.01%  | 16.808ms +18.53% | ±1.68% -6.51%   |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 73.220mb +0.01%  | 1.900μs +0.32%   | ±0.00% -100.00% |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 73.220mb +0.01%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 86.777mb +0.01%  | 12.294ms -5.48%  | ±2.61% +10.13%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 116.137mb +0.01% | 63.439ms +1.44%  | ±1.80% +139.60% |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 75.838mb +0.01%  | 1.936ms +7.02%   | ±2.00% +122.11% |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 78.111mb +0.01%  | 35.929ms +3.62%  | ±1.75% +94.02%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 77.939mb +0.01%  | 3.931ms +3.28%   | ±2.34% +720.71% |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 73.366mb +0.01%  | 40.294ms -1.87%  | ±0.95% -56.28%  |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 73.366mb +0.01%  | 40.370ms +1.23%  | ±2.64% +121.71% |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 73.366mb +0.01%  | 39.662ms -2.95%  | ±1.98% -24.86%  |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 75.663mb +0.01%  | 7.391ms +1.53%   | ±0.81% +87.59%  |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 73.220mb +0.01%  | 28.867ms -0.28%  | ±0.98% +179.27% |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 73.220mb +0.01%  | 13.638μs +2.49%  | ±2.87% +712.44% |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 73.220mb +0.01%  | 15.800μs +0.42%  | ±0.52% -60.60%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 96.231mb +0.01%  | 178.932ms -4.11% | ±0.86% +2.58%   |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.381mb +0.01%  | 339.109ms -0.80% | ±0.92% -69.89%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.304mb +0.05%  | 65.653ms -0.32%  | ±0.58% -42.37%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 115.979mb +0.01% | 378.800ms +1.28% | ±0.84% +533.71% |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 59.698mb +0.01%  | 195.783ms +3.58% | ±1.26% +218.15% |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.822mb +0.04%  | 41.684ms +0.87%  | ±1.41% -30.73%  |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+

github-actions bot added the core label Dec 7, 2023
tools: composer:v2
php-version: "${{ matrix.php-version }}"
ini-values: memory_limit=-1
extensions: :psr
I'm honestly not sure why the psr extension suddenly started being installed, but it was breaking the test suite, so I disabled it according to the docs.

@norberttech norberttech merged commit 290ea18 into flow-php:1.x Dec 7, 2023