Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This is part of the larger project to implement StringViewArray -- see #5374
In #5481 we added support for StringViewArray and ByteViewArray.
The parquet crate has a reader and a writer for converting between parquet data and arrow arrays.
Describe the solution you'd like
I would like to be able to read and write StringViewArray and BinaryViewArray directly with the parquet reader and writer, with no data copies (so the raw byte values are not copied).
- Add functionality
- Add tests
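To make the "no data copies" goal concrete, here is a minimal sketch of the view layout described in the Arrow format specification: each value is a 16-byte view that either holds up to 12 bytes inline or points into a shared data buffer. The `View` enum and the `make_view` and `resolve` helpers are hypothetical names for illustration only. They are not arrow-rs APIs, and the real encoding packs each view into a single 128-bit integer.

```rust
/// Values up to this many bytes are stored inside the 16-byte view itself.
const MAX_INLINE_LEN: usize = 12;

/// Simplified, hypothetical model of one view.
#[derive(Debug, PartialEq)]
enum View {
    /// Short value: the bytes live in the view, zero padded.
    Inline { len: u32, data: [u8; MAX_INLINE_LEN] },
    /// Long value: the view only points into a shared data buffer.
    Ref { len: u32, prefix: [u8; 4], buffer_index: u32, offset: u32 },
}

/// Build a view for the value stored at `buffer[offset..offset + len]`.
/// For a long value only the 4-byte prefix is copied, never the value itself.
fn make_view(buffer: &[u8], buffer_index: u32, offset: usize, len: usize) -> View {
    let value = &buffer[offset..offset + len];
    if len <= MAX_INLINE_LEN {
        let mut data = [0u8; MAX_INLINE_LEN];
        data[..len].copy_from_slice(value);
        View::Inline { len: len as u32, data }
    } else {
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&value[..4]);
        View::Ref { len: len as u32, prefix, buffer_index, offset: offset as u32 }
    }
}

/// Resolve a view back to its bytes, borrowing from the view or the buffers.
fn resolve<'a>(view: &'a View, buffers: &'a [Vec<u8>]) -> &'a [u8] {
    match view {
        View::Inline { len, data } => &data[..*len as usize],
        View::Ref { len, buffer_index, offset, .. } => {
            let start = *offset as usize;
            &buffers[*buffer_index as usize][start..start + *len as usize]
        }
    }
}
```

Under this model, a reader that keeps decoded bytes in one buffer could in principle emit one view per value that points into that buffer, rather than copying each value into a new offsets-plus-values layout. Whether the parquet reader can do this for every encoding is an open question for the implementation.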
Describe alternatives you've considered
For example, I think we need to add the support to the writer here
    ArrowDataType::Dictionary(_, value_type) => match value_type.as_ref() {
        ArrowDataType::Utf8 | ArrowDataType::LargeUtf8 | ArrowDataType::Binary | ArrowDataType::LargeBinary => {
            out.push(bytes(leaves.next().unwrap()))
        }
        _ => {
            out.push(col(leaves.next().unwrap()))
        }
    }
    _ => return Err(ParquetError::NYI(
        format!(
            "Attempting to write an Arrow type {data_type:?} to parquet that is not yet implemented"
        )
    ))
}
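As a rough illustration of the kind of change this implies, the sketch below routes the view types to the same byte-array path as the existing string and binary types. It is a self-contained toy, not the actual writer code: the `DataType` and `WriterKind` enums and the `writer_kind` function are hypothetical stand-ins for `ArrowDataType` and the `bytes(..)` / `col(..)` choices in the snippet, and the real change may need more than a new match arm.

```rust
/// Hypothetical stand-in for a few `ArrowDataType` variants.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    LargeUtf8,
    Binary,
    LargeBinary,
    Utf8View,
    BinaryView,
    Dictionary(Box<DataType>),
    /// Models any type the writer does not handle yet.
    Unsupported,
}

/// Hypothetical stand-in for the choice between `bytes(..)` and `col(..)`.
#[derive(Debug, PartialEq)]
enum WriterKind {
    Bytes,
    Column,
}

/// Pick a writer kind for a leaf column (assumption: view types can share
/// the byte-array path used by the other string and binary types).
fn writer_kind(data_type: &DataType) -> Result<WriterKind, String> {
    match data_type {
        DataType::Utf8
        | DataType::LargeUtf8
        | DataType::Binary
        | DataType::LargeBinary
        | DataType::Utf8View
        | DataType::BinaryView => Ok(WriterKind::Bytes),
        // Mirrors the dictionary arm quoted in the issue.
        DataType::Dictionary(value_type) => match value_type.as_ref() {
            DataType::Utf8 | DataType::LargeUtf8 | DataType::Binary | DataType::LargeBinary => {
                Ok(WriterKind::Bytes)
            }
            _ => Ok(WriterKind::Column),
        },
        DataType::Int32 => Ok(WriterKind::Column),
        DataType::Unsupported => Err(format!(
            "Attempting to write an Arrow type {data_type:?} to parquet that is not yet implemented"
        )),
    }
}
```

With this toy dispatch, `writer_kind(&DataType::Utf8View)` selects the byte-array path instead of falling through to the "not yet implemented" error.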
Additional context
The reader/writer already handles DictionaryArrays which I think could serve as a model for the view arrays.
@ariesdevil reports they are working on this feature #5374 (comment)
The writer snippet quoted above is from arrow-rs/parquet/src/arrow/arrow_writer/mod.rs, lines 719 to 732 in f41c2a4.