Closed
Description
Hello!
When reading a large Parquet file (1.3 GB) with flow-php/parquet ^0.24.0, PHP aborts with a fatal error because the memory limit (512 MB) is exhausted.
The same code works with flow-php/parquet version 0.7.4.
Steps to Reproduce
- Implement the following getReader function:
public function getReader(FileModel $fileModel): mixed
{
    $reader = new Reader();
    try {
        return $reader->read($fileModel->getTmpFileName());
    } catch (Exception $e) {
        throw new FileException(
            sprintf(
                'Can\'t access the temporary file %s %s %s',
                $fileModel->getTmpFileName(),
                $fileModel->getOriginalFileName(),
                $e->getMessage()
            )
        );
    }
}

- Try to iterate over the file content:
$fileResource = $this->getReader($fileModel);
foreach ($fileResource->values(["col1", "col2"]) as $row) {
    dump($row);
    exit;
}

- Run with a file of size 1.3 GB.
Expected Behavior
The file should be read row by row without exceeding the PHP memory limit.
Actual Behavior
Execution fails after some time with a PHP fatal error (512 MB memory limit exhausted) before the body of the foreach loop is ever entered.
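To narrow down which step allocates the memory, each step can be measured on its own. The following is a minimal sketch in plain PHP; `measureMemory` is a hypothetical helper, not part of flow-php, and the `range()` call is only a stand-in workload so the snippet runs without any library.

```php
<?php

declare(strict_types=1);

/**
 * Runs $step once and reports how much memory it allocated.
 * Hypothetical helper for debugging, not part of flow-php.
 *
 * @return array{result: mixed, delta_mb: float|int, peak_mb: float|int}
 */
function measureMemory(callable $step): array
{
    // "true" reports memory actually reserved from the system.
    $before = memory_get_usage(true);
    $result = $step();
    $after = memory_get_usage(true);

    return [
        'result' => $result,
        'delta_mb' => ($after - $before) / 1024 / 1024,
        'peak_mb' => memory_get_peak_usage(true) / 1024 / 1024,
    ];
}

// Stand-in workload: building a large array in one go.
$opened = measureMemory(static fn (): array => range(1, 100_000));

printf(
    "step allocated %.1f MB (peak so far %.1f MB)\n",
    $opened['delta_mb'],
    $opened['peak_mb']
);
```

In the reproduction above, the closure would wrap `$reader->read($fileModel->getTmpFileName())` first and the first iteration of `values(["col1", "col2"])` second. With a temporarily raised memory limit, the two measurements would show which call grows memory.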
Additional Attempts
I also tried using readStream:
return $reader->readStream(
    NativeLocalSourceStream::open(
        new Path($fileModel->getTmpFileName())
    )
);

But this does not work with:

"flow-php/etl": "^0.24.0",
"flow-php/parquet": "^0.24.0"

It only works with:

"flow-php/parquet": "0.7.4"

Environment
- PHP version: 8.4
- OS: bookworm (Dockerised)
- Memory limit: 512MB
- flow-php/etl: ^0.24.0
- flow-php/parquet: ^0.24.0