Description
Currently, freshly extracted rows have to be unpacked by hand before they can be used. This is how it looks now:
    (new Flow())
        ->read(
            From::array(
                [['id' => 1, 'array' => ['a' => 1, 'b' => 2, 'c' => 3]]],
            )
        )
        ->withEntry('row', ref('row')->unpack())
        ->renameAll('row.', '')
        ->drop('row')
        ->withEntry('array', ref('array')->arrayMerge(lit(['d' => 4])))
        ->write(To::memory($memory = new ArrayMemory()))
        ->run();

The following lines are repeated almost always:
    ->withEntry('row', ref('row')->unpack())
    ->renameAll('row.', '')
    ->drop('row')

We should look into this and introduce an expression that does all three of those things, something like:
ref('row')->rowUnpack()
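
To make the intended result concrete, here is a minimal sketch in plain PHP of what the three repeated steps amount to on a single row. The `unpackRow()` helper is hypothetical and is not part of Flow. It works on a plain array rather than on Flow's `Row`/`Entry` objects, and the collision rule (nested values win) is an assumption made for this sketch only.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, NOT a Flow API: lifts the entries nested under $key
 * to the top level and removes $key itself. This mirrors the combined effect
 * of unpack() + renameAll('row.', '') + drop('row') on a plain array.
 */
function unpackRow(array $entries, string $key = 'row'): array
{
    // Nothing to unpack when the key is missing or is not an array.
    if (!isset($entries[$key]) || !is_array($entries[$key])) {
        return $entries;
    }

    $nested = $entries[$key];
    unset($entries[$key]);

    // Assumption for this sketch: on string-key collisions the nested value wins.
    return array_merge($entries, $nested);
}

$extracted = ['row' => ['id' => 1, 'array' => ['a' => 1, 'b' => 2, 'c' => 3]]];
$unpacked = unpackRow($extracted);
// $unpacked has top-level 'id' and 'array' keys and no 'row' key.
```

A proposed `rowUnpack()` expression would presumably have to decide the same questions: what happens on name collisions, and whether sibling entries such as `input_file_uri` are kept.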
One might ask why extractors expose the entire row under an ArrayEntry.
The reason is the Config::shouldPutInputIntoRows() option.
When this option is set to true, extractors also provide additional data about the input. For example,
file-based extractors with that option enabled return something like this:
[
'input_file_uri' => 'string',
'row' => [...]
]
Similarly, the HTTP extractor adds headers, the request URL, etc.
This is very handy when, for example, our datasets are stored like this:
- /datasets/sales/2023/january.json
- /datasets/sales/2023/february.json
- /datasets/reports/report_type/123456.xml
- /datasets/reports/report_type/789010.xml
Thanks to input_file_uri, we can also parse the input file path and extract the month name or report ID from it.
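
As an illustration of that last point, the sketch below pulls the file name and its parent directory out of a path using only PHP's built-in `pathinfo()` and `basename()`. The `describeInputFile()` helper and the shape of its return value are hypothetical and are not part of Flow. It assumes the URI is a plain path like the ones listed above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, NOT a Flow API: splits an input file path into the
 * parts that usually carry meaning (file name, extension, parent directory).
 *
 * @return array{name: string, extension: string, parent: string}
 */
function describeInputFile(string $uri): array
{
    $info = pathinfo($uri);

    return [
        // File name without extension, e.g. the month name or the report ID.
        'name' => $info['filename'],
        'extension' => $info['extension'] ?? '',
        // Last segment of the directory, e.g. the year or the report type.
        'parent' => basename($info['dirname'] ?? ''),
    ];
}

$sales = describeInputFile('/datasets/sales/2023/january.json');
$report = describeInputFile('/datasets/reports/report_type/123456.xml');
// $sales['name'] is the month name, $report['name'] is the report ID.
```

In a pipeline, the same parsing would be applied to the `input_file_uri` entry of each row, which is why that entry has to survive the unpacking step.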