Releases: apache/arrow-rs
arrow 58.3.0
Changelog
58.3.0 (2026-05-07)
Implemented enhancements:
- Add `DatePart::from_str` API #9930 [arrow]
- should use DictionaryArray::with_values instead of try_new on the dictionary fast path #9889 [arrow]
- [arrow-string] add concat_elements for BinaryViewArray and FixedSizeBinary #9875 [arrow]
- Expose eq ignore ascii case from arrow-string #9870 [arrow]
- Configurable data page v2 compression threshold #9827 [parquet]
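As a rough illustration of what an element-wise, ASCII-case-insensitive equality kernel computes, here is a minimal sketch using only the Rust standard library. The function name `eq_ignore_ascii_case_elements` and the slice-of-`Option` representation are assumptions made for this example; this is not the arrow-string API, and `None` merely stands in for a null slot.

```rust
/// Hypothetical helper (not the arrow-string API): compares two "columns"
/// element-wise, ignoring ASCII case. `None` models a null slot, and a
/// null on either side yields a null result. Returns `None` overall when
/// the inputs have different lengths.
fn eq_ignore_ascii_case_elements(
    left: &[Option<&str>],
    right: &[Option<&str>],
) -> Option<Vec<Option<bool>>> {
    if left.len() != right.len() {
        return None;
    }
    Some(
        left.iter()
            .zip(right)
            .map(|(l, r)| match (l, r) {
                // `str::eq_ignore_ascii_case` only folds ASCII letters;
                // non-ASCII characters must match exactly.
                (Some(l), Some(r)) => Some(l.eq_ignore_ascii_case(r)),
                _ => None,
            })
            .collect(),
    )
}
```

Comparing `[Some("Arrow"), None]` with `[Some("ARROW"), Some("x")]` gives `[Some(true), None]` under these assumptions.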
Fixed bugs:
- [arrow-cast] incorrect Time32 -> Time64 conversion #9851 [arrow]
- Panic when reading malformed compact-Thrift bool fields in Parquet page metadata #9839 [parquet]
- Parquet `DeltaBitPackDecoder::skip` could panic on "non-standard" miniblocks #9793 [parquet]
Documentation updates:
- docs: Add guidance for AI assisted submissions to CONTRIBUTING.md #9892 (etseidl)
- Update release schedule on README #9881 (alamb)
- Add more documentation for FixedSizeBinary arrays #9866 [arrow] (alamb)
- Minor: document why FixedSizeBinary offset is always 0 #9861 [arrow] (alamb)
- docs: Update contributing guidelines with benchmark results #9782 (alamb)
Closed issues:
- GenericByteDictionaryBuilder::with_capacity does not pre-size dedup HashTable #9907 [arrow]
- [arrow-buffer] Integer overflow in repeat_slice_n_times leads to undefined behavior #9904 [arrow]
- [arrow-buffer] Integer overflow in BitChunks::new leads to undefined behavior #9903 [arrow]
- [arrow-row] Integer overflow in Rows::row index handling leads to undefined behavior #9901 [arrow]
- [arrow-data] Integer overflow in ArrayData validation leads to undefined behavior #9900 [arrow]
- [arrow-data] Integer overflow in ArrayData::slice leads to undefined behavior #9899 [arrow]
- [arrow-array] Integer overflow in FixedSizeBinaryArray::value leads to undefined behavior #9898 [arrow]
- [arrow-buffer] Integer overflow in BufferBuilder::reserve leads to undefined behavior #9897 [arrow]
- arrow-csv: integer overflow panic in Reader::records::flush #9885 [arrow]
- Make an API to help with the pattern of 'replaces the values of the REE array' #9854 [arrow]
- Parquet reader rejects canonical UNKNOWN logical type on BOOLEAN physical columns #9844 [parquet]
- ColumnIndex length mismatch can cause panic during decoding in Parquet #9832 [parquet]
- Bug converting json to fixed list of zero size #9780 [arrow]
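Several of the issues above share one pattern: an index or length computation overflows before it is bounds-checked. The sketch below shows the general checked-arithmetic technique with the standard library only. The function `fixed_width_range` is hypothetical and is not taken from arrow-rs; it only illustrates how `checked_mul` and `checked_add` turn a silent wrap-around into a recoverable `None`.

```rust
use std::ops::Range;

/// Hypothetical bounds check (not arrow-rs code): returns the byte range of
/// element `index` in a buffer of `buffer_len` bytes that stores values of
/// `width` bytes each. Returns `None` if an intermediate computation would
/// overflow `usize` or if the range falls outside the buffer.
fn fixed_width_range(index: usize, width: usize, buffer_len: usize) -> Option<Range<usize>> {
    // `index * width` can wrap for large inputs; `checked_mul` reports that.
    let start = index.checked_mul(width)?;
    // `start + width` can wrap as well, so check it before comparing.
    let end = start.checked_add(width)?;
    if end <= buffer_len {
        Some(start..end)
    } else {
        None
    }
}
```

With a 16-byte buffer of 4-byte values, index 2 maps to bytes `8..12`, index 4 is rejected as out of bounds, and an index of `usize::MAX` is rejected rather than wrapping to a small offset.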
Merged pull requests:
- impl `FromStr` for `DatePart` #9931 [arrow] (sdf-jkl)
- Pre-size dedup HashTable in GenericByteDictionaryBuilder::with_capacity #9908 [arrow] (rabenhorst)
- [arrow-array] Use consistent `value_length` name in FixedSizeBinaryArray #9905 [arrow] (alamb)
- replace Dictionary::try_new() calls with with_values. #9894 [arrow] (Rich-T-kid)
- API to help with the pattern of 'replaces the values of the REE array #9891 [arrow] (Rich-T-kid)
- fix(arrow-csv): bound RecordDecoder::flush offset accumulation #9886 [arrow] (masumi-ryugo)
- fix(parquet): bound schema num_children before Vec::with_capacity #9884 [parquet] (masumi-ryugo)
- feat(arrow-string): concat_elements for view, fixed binary #9876 [arrow] (theirix)
- Prevent `FixedSizeBinaryArray` `i32` offset overflows (try 2) #9872 [arrow] (alamb)
- [arrow-string]: add `like::eq_ascii_ignore_case` kernel #9871 [arrow] (albertlockett)
- fix(parquet): Prevent negative list sizes in Thrift compact protocol parser #9868 [parquet] (masumi-ryugo)
- [PARQUET] Allow `UNKNOWN` logical type annotation on any physical type #9855 [parquet] (etseidl)
- [arrow-ipc]: dictionary builders for delta - doc fix and integration tests for nested types #9853 [arrow] (albertlockett)
- fix(arrow-cast): fix incorrect conversion #9852 [arrow] (bboissin)
- chore[benches]: add REE interleave benchmarks #9849 [arrow] (asubiotto)
- test(parquet): replace `InMemoryArrayReader` with `PrimitiveArrayReader` in tests #9847 [parquet] (HippoBaro)
- REE row conversion speed up #9845 [arrow] (Rich-T-kid)
- fix(parquet): Avoid panic on malformed thrift bool fields in parquet metadata #9840 [parquet] ([BoazC-MSFT](http...
arrow 57.3.1
Changelog
57.3.1 (2026-05-07)
Fixed bugs:
- [arrow-buffer] Integer overflow in BufferBuilder::reserve leads to undefined behavior #9897 [arrow]
- [arrow-array] Integer overflow in FixedSizeBinaryArray::value leads to undefined behavior #9898 [arrow]
- [arrow-data] Integer overflow in ArrayData::slice leads to undefined behavior #9899 [arrow]
- [arrow-data] Integer overflow in ArrayData validation leads to undefined behavior #9900 [arrow]
- [arrow-row] Integer overflow in Rows::row index handling leads to undefined behavior #9901 [arrow]
- [arrow-buffer] Integer overflow in BitChunks::new leads to undefined behavior #9903 [arrow]
- [arrow-buffer] Integer overflow in repeat_slice_n_times leads to undefined behavior #9904 [arrow]
Merged pull requests:
- [57_maintenance] Prevent ArrayData::slice length overflow (#9813) #9927 [arrow] (alamb)
- [57_maintenance] Prevent repeat slice length overflow (#9819) #9920 [arrow] (alamb)
- [57_maintenance] Prevent buffer builder length overflow in MutableBuffer::extend_zeros (#9820) #9926 [arrow] (alamb)
- [57_maintenance] Prevent FixedSizeBinaryArray i32 offset overflows (#9872) #9928 [arrow] (alamb)
- [57_maintenance] Prevent ArrayData validation length overflow (#9816) #9925 [arrow] (alamb)
- [57_maintenance] Prevent Rows row index overflow (#9817) #9922 [arrow] (alamb)
- [57_maintenance] Prevent BitChunks length overflow (#9818) #9918 [arrow] (alamb)
arrow 56.2.1
Changelog
56.2.1 (2026-05-07)
Fixed bugs:
- [arrow-buffer] Integer overflow in BufferBuilder::reserve leads to undefined behavior #9897 [arrow]
- [arrow-array] Integer overflow in FixedSizeBinaryArray::value leads to undefined behavior #9898 [arrow]
- [arrow-data] Integer overflow in ArrayData::slice leads to undefined behavior #9899 [arrow]
- [arrow-data] Integer overflow in ArrayData validation leads to undefined behavior #9900 [arrow]
- [arrow-row] Integer overflow in Rows::row index handling leads to undefined behavior #9901 [arrow]
- [arrow-buffer] Integer overflow in BitChunks::new leads to undefined behavior #9903 [arrow]
Merged pull requests:
- [56_maintenance] Prevent ArrayData::slice length overflow (#9813) #9916 [arrow] (alamb)
- [56_maintenance] Prevent FixedSizeBinaryArray i32 offset overflows (#9872) #9917 [arrow] (alamb)
- [56_maintenance] Prevent buffer builder length overflow in MutableBuffer::extend_zeros (#9820) #9915 [arrow] (alamb)
- [56_maintenance] Prevent ArrayData validation length overflow (#9816) #9914 [arrow] (alamb)
- [56_maintenance] Prevent Rows row index overflow (#9817) #9913 [arrow] (alamb)
- [56_maintenance] Prevent BitChunks length overflow (#9818) #9896 [arrow] (alamb)
- [56_maintenance] Fix cargo_audit: Pin cargo-msrv and Cargo.lock in CI #9902 (alamb)
* This Changelog was automatically generated by github_changelog_generator
arrow 58.2.0
Changelog
58.2.0 (2026-04-28)
Implemented enhancements:
- Expose ColumnCloseResult on ArrowColumnChunk #9774 [parquet]
- Expose FFI data structures fields #9771 [arrow]
- short-circuit last predicate in `RowFilter` when `with_limit(N)` is set #9765 [parquet]
- vectorise dict-index bounds check #9747 [parquet]
- Refactor `RleEncoder::flush_bit_packed_run` #9734 [parquet]
- Add benchmark for cast from/to decimals #9728 [arrow]
- Add a security policy for arrow-rs #9727 [parquet] [arrow] [arrow-flight]
- Support `FixedSizeList` in arrow-json reader #9714 [arrow]
- [Variant] Add `VariantArrayBuilder::append_nulls` API #9684
- [Json] RunEndEncoded decoder optimization #9645 [arrow]
- [Variant] `variant_get(..., List<_>)` non-Struct types support #9615
- [Variant] Add unshredded `Struct` fast-path for `variant_get(..., Struct)` #9596
- Allow setting custom line terminator for CSV writer #9571 [arrow]
- [Variant] Align cast logic for `variant_get` to cast kernel for numeric/bool types #9564 [arrow]
- ci: use ubuntu-slim where applicable #9536
- Publicly export `arrow_string::Predicate` and its methods? #9480
- Don't create CompressionContext when no compression is selected [IPC] #9463 [arrow]
- Parquet: Raw level buffering causes unbounded memory growth for sparse columns #9446 [parquet]
- Parallel Parquet Reading #9381 [parquet]
Fixed bugs:
- [Variant] `unshred_variant` panics on malformed bytes despite returning `Result` #9740
- RecordBatch::normalize() does not propagate top level null bitmap into the results #9732 [arrow]
- Incorrect accounting in `DictEncoder::estimated_memory_size` #9719 [parquet]
- arrow-ipc writer does not comply with spec for empty variable-size arrays #9716 [arrow]
- Panic when reading corrupt parquet file with truncated data instead of ParquetError #9705 [parquet]
- NOTICE.txt is inaccurate #9703 [arrow]
- Unnecessary dependency on regex crate #9672
- [arrow-avro] Avro reader produces incorrect results when reader schema and writer schema differ #9655 [arrow]
- parquet docs are broken on docs.rs #9649
- [Parquet] ArrowWriter with CDC panics on nested ListArrays #9637 [parquet] [arrow] [arrow-flight]
- Use release KEYS file for verification instead of dev KEYS #9603
- IPC reader: handling of dictionaries with only null values #9595 [arrow]
- Parquet RleDecoder::get_batch_with_dict panics on oob dictionary indices #9434 [parquet]
Documentation updates:
- docs(variant): link VariantArray doc to official Parquet Variant extension type #9779 (mcharrel)
- Document Security Policy #9730 [parquet] [arrow] [arrow-flight] (alamb)
- Docs: add example of how to read parquet row groups in parallel #9396 [parquet] (alamb)
Performance improvements:
- parquet: avoid decode and heap allocation on terminal skip in DeltaBitPackDecoder #9784 [parquet]
- parquet: O(1) skip for bw=0 miniblocks in DeltaBitPackDecoder #9783 [parquet]
- Remove per-message flush overhead in Arrow IPC writer #9762 [arrow]
- Support `GenericListViewArray::new_unchecked` and refactor ListView json decoder #9646 [arrow]
- Support nested REE in arrow-ord `partition` function #9640 [arrow]
- [Parquet] Remove the BIT_PACKED encoder #9635 [parquet]
- Pre-reserve output capacity in ByteView/ByteArray dictionary decoding #9587 [parquet]
- Fuse RLE decoding and view gathering for StringView dictionary decoding #9582 [parquet]
- Use branchless index clamping and add get_batch_direct to RleDecoder #9581 [parquet]
- Reduce per-byte overhead in VLQ integer decoding #9580 [parquet]
- feat(parquet): batch RLE runs in level encoder via scan-ahead #9830 [parquet] (HippoBaro)
- fix: lazy-init zstd compression contexts to avoid unnecessary FFI calls #9808 [arrow] (mbutrovich)
- parquet: O(1) skip for bw=0 miniblocks in DeltaBitPackDecoder #9786 [parquet] (sahuagin)
- chore: add benchmark for row filters with LIMIT short-circuit #9767 [parquet] (haohuaijin)
- Push `LIMIT/OFFSET` into the last `RowFilter` predicat...
arrow 58.1.0
Changelog
58.1.0 (2026-03-20)
Implemented enhancements:
- Reuse compression dict lz4_block #9566
- [Variant] Add `variant_to_arrow` `Struct` type support #9529
- [Variant] Add `unshred_variant` support for `Binary` and `LargeBinary` types #9526
- [Variant] Add `shred_variant` support for `LargeUtf8` and `LargeBinary` types #9525
- [Variant] `variant_get` tests clean up #9517
- parquet_variant: Support LargeUtf8 typed value in `unshred_variant` #9513
- parquet-variant: Support string view typed value in `unshred_variant` #9512
- Deprecate ArrowTimestampType::make_value in favor of from_naive_datetime #9490 [arrow]
- Followup for support ['fieldName'] in VariantPath #9478
- Speedup DELTA_BINARY_PACKED decoding when bitwidth is 0 #9476 [parquet]
- Support CSV files encoded with charsets other than UTF-8 #9465 [arrow]
- Expose Avro writer schema when building the reader #9460 [arrow]
- Python: avoid importing pyarrow classes every time #9438
- Add `append_nulls` to `MapBuilder` #9431 [arrow]
- Add `append_non_nulls` to `StructBuilder` #9429 [arrow]
- Add `append_value_n` to GenericByteBuilder #9425 [arrow]
- Optimize `from_bitwise_binary_op` #9378 [arrow]
- Configurable Arrow representation of UTC timestamps for Avro reader #9279 [arrow]
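The DELTA_BINARY_PACKED item above rests on a simple observation: when a miniblock has bit width 0, every packed delta is zero, so each value is just the previous value plus the block's minimum delta. Skipping `n` such values can then be done with one multiplication instead of `n` additions. The sketch below illustrates that arithmetic only. Both function names are hypothetical, and this is not the parquet crate's decoder.

```rust
/// Hypothetical sketch (not the parquet crate's decoder): advance the running
/// value past `n` entries of a miniblock whose bit width is 0. Every packed
/// delta is zero, so each entry adds exactly `min_delta`. Wrapping arithmetic
/// is used so the shortcut agrees with repeated wrapping addition.
fn skip_zero_width(last_value: i64, min_delta: i64, n: usize) -> i64 {
    last_value.wrapping_add(min_delta.wrapping_mul(n as i64))
}

/// Reference version that "decodes" each entry one at a time, for comparison.
fn skip_by_decoding(mut last_value: i64, min_delta: i64, n: usize) -> i64 {
    for _ in 0..n {
        last_value = last_value.wrapping_add(min_delta);
    }
    last_value
}
```

Both functions return the same result for any input; the first does constant work regardless of `n`.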
Fixed bugs:
- MutableArrayData::extend does not copy child values for ListView arrays #9561 [arrow]
- ListView interleave bug #9559 [arrow]
- Flight encoding panics with "no dict id for field" with nested dict arrays #9555 [arrow] [arrow-flight]
- "DeltaBitPackDecoder only supports Int32Type and Int64Type" but unsigned types are supported too #9551 [parquet]
- Potential overflow when calling `util::bit_mask::set_bits` (soundness issue) #9543 [arrow]
- handle Null type in try_merge for Struct, List, LargeList, and Union #9523 [arrow]
- Invalid offset in sparse column chunk data for multiple predicates #9516 [parquet]
- debug_assert_eq! in BatchCoalescer panics in debug mode when batch_size < 4 #9506 [arrow]
- Parquet Statistics::null_count_opt wrongly returns Some(0) when stats are missing #9451 [parquet]
- Error "Not all children array length are the same!" when decoding rows spanning across page boundaries in parquet file when using `RowSelection` #9370 [parquet]
- Avro schema resolution not properly supported for complex types #9336 [arrow]
Documentation updates:
Performance improvements:
- Introduce `NullBuffer::try_from_unsliced` to simplify array construction #9385 [parquet] [arrow]
- perf: Coalesce page fetches when RowSelection selects all rows #9578 [parquet] (Dandandan)
- Use chunks_exact for has_true/has_false to enable compiler unrolling #9570 [arrow] (adriangb)
- pyarrow: Cache the imported classes to avoid importing them each time #9439 (Tpt)
Closed issues:
- Duplicate macro definition: `partially_shredded_variant_array_gen` #9492
- Enable `LargeList`/`ListView`/`LargeListView` for `VariantArray::try_new` #9455
- Support variables/expressions in record_batch! macro #9245 [arrow]
Merged pull requests:
- [Variant] Add unshred_variant support for Binary and LargeBinary types #9576 (kunalsinghdadhwal)
- [Variant] Add `variant_to_arrow` `Struct` type support #9572 (sdf-jkl)
- Make Sbbf Constructors Public #9569 [parquet] (cetra3)
- fix: Used `checked_add` for bounds checks to avoid UB #9568 [arrow] (etseidl)
- Add mutable operations to BooleanBuffer (Bit*Assign) #9567 [arrow] (Dandandan)
- chore(deps): update lz4_flex requirement from 0.12 to 0.13 #9565 [parquet] [arrow] (dependabot[bot])
- arrow-select: fix MutableArrayData interleave for ListView #9560 [arrow] (asubiotto)
- Move `ValueIter` into own module, and add public `record_count` function #9557 [arrow] (Rafferty97)
- arrow-flight: generate dict_ids for dicts nested inside complex types #9556 [arrow] [arrow-flight] (asubiotto)
- add `shred_variant` support for `LargeUtf8` and `LargeBinary` #9554 (sdf-jkl)
- [minor] Download clickbench file when missing #9553 [parquet] (Dandandan)
- DeltaBitPackEncoderConversion: Fix panic message on invalid type #9552 [parquet] ([progval](https://github.c...
arrow 58.0.0
Changelog
58.0.0 (2026-02-19)
Breaking changes:
- Remove support for List types in bit_length kernel #9350 [arrow] (codephage2020)
- Optimize `from_bitwise_unary_op` #9297 [arrow] (Dandandan)
- Mark `BufferBuilder::new_from_buffer` as unsafe #9292 [arrow] (Jefffrey)
- [Variant] Support `['fieldName']` in VariantPath parser #9276 (klion26)
- Remove parquet arrow_cast dependency #9077 [parquet] (tustvold)
- feat: change default behavior for Parquet `PageEncodingStats` to bitmask #9051 [parquet] (WaterWhisperer)
- [arrow] Minimize allocation in GenericViewArray::slice() #9016 [arrow] (maxburke)
Implemented enhancements:
- Avoid allocating a `Vec` in `StructBuilder` #9427
- Zstd context reuse #9401
- Optimize `from_bitwise_unary_op` #9364
- Support `RunEndEncoded` in ord comparator #9360
- Support `RunEndEncoded` arrays in `arrow-json` #9359
- Support `BinaryView` in `bit_length` kernel #9351
- Remove support for `List` types in `bit_length` kernel #9349
- Support roundtrip `ListView` in parquet arrow writer #9344
- Support `ListView` in `length` kernel #9343
- Support `ListView` in sort kernel #9341
- Add some way to create a Timestamp from a `DateTime` #9337
- Introduce `DataType::is_list` and `DataType::IsBinary` #9326
- Performance of creating all null dictionary array can be improved #9321
- [arrow-avro] Add missing Arrow DataType support with `avro_custom_types` round-trip + non-custom fallbacks #9290
Fixed bugs:
- ArrowArrayStreamReader errors on zero-column record batches #9394
- Regression on main (58): Parquet argument error: Parquet error: Required field type_ is missing #9315 [parquet]
Documentation updates:
- Improve safety documentation of the `Array` trait #9314 [arrow] (alamb)
- Improve docs and add build() method to `{Null,Boolean,}BufferBuilder` #9155 [arrow] (alamb)
- Improve `ArrowReaderBuilder::with_row_filter` documentation #9153 [parquet] (alamb)
- docs: Improve main README.md and highlight community #9119 (alamb)
- Docs: Add additional documentation and example for `make_array` #9112 [arrow] (alamb)
- doc: fix link on FixedSizeListArray doc #9033 [arrow] (Jefffrey)
Performance improvements:
- Replace `ArrayData` with direct Array construction #9338 [arrow] (liamzwbao)
- Remove some `unsafe` and allocations when creating PrimitiveArrays from Vec and `from_trusted_len_iter` #9299 [arrow] (alamb)
- parquet: rle skip decode loop when batch contains all max levels (aka no nulls) #9258 [parquet] (lyang24)
- Improve parquet BinaryView / StringView decoder performance (up to -35%) #9236 [parquet] (Dandandan)
- Avoid a clone when creating `BooleanArray` from ArrayData #9159 [arrow] (alamb)
- Avoid overallocating arrays in coalesce primitives / views #9132 [arrow] (Dandandan)
- perf: Avoid ArrayData allocation in PrimitiveArray::reinterpret_cast #9129 [arrow] (alamb)
- [Parquet] perf: Create StructArrays directly rather than via `ArrayData` (1% improvement) #9120 [parquet] [arrow] (alamb)
- Avoid clones in `make_array` for `StructArray` and `GenericByteViewArray` #9114 [arrow] (alamb)
- perf: optimize hex decoding in json (1.8x faster in binary-heavy) #9091 [arrow] (Weijun-H)
- Speed up binary kernels (30% faster `and` and `or`), add `BooleanBuffer::from_bitwise_binary_op` #9090 [arrow] (alamb)
- perf: improve field indexing in JSON StructArrayDecoder (1.7x speed up) #9086 [arrow] (Weijun-H)
- bench: added to row_format benchmark conversion of 53 non-nested columns #9081 [arrow] (rluvaton)
- perf: improve calculating length performance for view byte array in row conversion #9080 [arrow] (rluvaton)
- perf: improve calculating length performance for nested arrays in row conversion #9079 [arrow] (rluvaton)
- perf: improve calculating length performance for `GenericByteArray` in row conversion #9078 [arrow] (rluvaton)
Closed issues:
- BatchCoalescer::push_batch panics on schema mismatch instead of returning error #9389
- Release arrow-rs / parquet Minor version `57.3.0` (January 2026) #9240
- [Variant] support `..` and `['fieldName']` syntax in the VariantPath parser #9050
- Support Float16 for create_random_array #9028
Merged pull requests:
- Avoid allocating a `Vec` in `StructBuilder` [#94...
arrow 57.3.0
Changelog
57.3.0 (2026-02-02)
Breaking changes:
- Revert "Seal Array trait", mark `Array` as `unsafe` #9313 (alamb, gabotechs)
- Mark `BufferBuilder::new_from_buffer` as unsafe #9312 (alamb, Jefffrey)
Fixed bugs:
- Fix string array equality when the values buffer is the same and only the offsets to access it differ #9330 (alamb, jhorstmann)
- Ensure `BufferBuilder::truncate` doesn't overset length #9311 (alamb, Jefffrey)
- [parquet] Provide only encrypted column stats in plaintext footer #9310 (alamb, rok, adamreeve)
- [regression] Error with adaptive predicate pushdown: "Invalid offset …" #9309 (alamb, erratic-pattern, sdf-jkl)
arrow 57.2.0
Changelog
57.2.0 (2026-01-07)
Breaking changes:
- Seal Array trait #9092 [arrow] (tustvold)
- [Variant] Unify the CastOptions usage in parquet-variant-compute #8984 (klion26)
Implemented enhancements:
- [parquet] further relax `LevelInfoBuilder::types_compatible` for `ArrowWriter` #9098
- Update arrow-row documentation with Union encoding #9084
- Add code examples for min and max compute functions #9055
- Add `append_n` to bytes view builder API #9034 [arrow]
- Move `RunArray::get_physical_indices` to `RunEndBuffer` #9025 [arrow]
- Allow quote style in csv writer #9003 [arrow]
- IPC support for ListView #9002 [arrow]
- Implement `BinaryArrayType` for `&FixedSizeBinaryArray`s #8992 [arrow]
- arrow-buffer: implement num-traits for i256 #8976 [arrow]
- Support for `Arc<str>` in `ParquetRecordWriter` derive macro #8972
- [arrow-avro] suggest switching from xz to liblzma #8970 [arrow]
- arrow-buffer: add i256::trailing_zeros #8968 [arrow]
- arrow-buffer: make i256::leading_zeros public #8965 [arrow]
- Add spark like `ignoreLeadingWhiteSpace` and `ignoreTrailingWhiteSpace` options to the csv writer #8961 [arrow]
- Add round trip benchmark for Parquet writer/reader #8955 [parquet]
- Support performant `interleave` for List/LargeList #8952 [arrow]
- [Variant] Support array access when parsing `VariantPath` #8946
- Some panic!s could be represented as unimplemented!s #8932 [arrow]
- [Variant] easier way to construct a shredded schema #8922
- Support `DataType::ListView` and `DataType::LargeListView` in `ArrayData::new_null` #8908 [arrow]
- Add `GenericListViewArray::from_iter_primitive` #8906 [arrow]
- [Variant] Unify the cast option usage in ParquentVariant #8873
- Blog post about efficient filter representation in Parquet filter pushdown #8843 [parquet]
- Add comparison support for Union arrays in the `cmp` kernel #8837 [arrow]
- [Variant] Support array shredding into `List/LargeList/ListView/LargeListView` #8830
- Support `Union` data types for row format #8828 [arrow]
- FFI support for ListView #8819 [arrow]
- [Variant] Support more Arrow Datatypes from Variant primitive types #8805
- `FixedSizeBinaryBuilder` supports `append_array` #8750 [arrow]
- Implement special case `zip` with scalar for Utf8View #8724 [arrow]
- [geometry] Wire up arrow reader/writer for `GEOMETRY` and `GEOGRAPHY` #8717 [parquet]
Fixed bugs:
- Soundness Bug in `try_binary` when `Array` is implemented incorrectly in external crate #9106
- casting `Dict(_, LargeUtf8)` to `Utf8View` (StringViewArray) panics #9101
- wrong results for null count of `nullif` kernel #9085 [parquet] [arrow]
- Empty first line in some code examples #9063
- GenericByteViewArray::slice is not zero-copy but ought to be #9014
- Regression in struct casting in 57.2.0 (not yet released) #9005 [arrow]
- Fix panic when decoding multiple Union columns in RowConverter #8999 [arrow]
- `take_fixed_size_binary` Does Not Consider NULL Indices #8947 [arrow]
- [arrow-avro] RecordEncoder Bugs #8934 [arrow]
- `FixedSizeBinaryArray::try_new(...)` Panics with Item Length of Zero #8926 [arrow]
- `cargo test -p arrow-cast` fails on main #8910 [arrow]
- `GenericListViewArray::new_null` ignores `len` and returns an empty array #8904 [arrow]
- `FixedSizeBinaryArray::new_null` Does Not Properly Set the Length of the Values Buffer #8900 [arrow]
- Struct casting requires same order of fields #8870 [arrow]
- Cannot cast string dictionary to binary view #8841 [arrow]
Documentation updates:
- Add Union encoding documentation #9102 [arrow] (EduardAkhmetshin)
- docs: fix misleading reserve documentation #9076 (WaterWhisperer)
- Fix headers and empty lines in code examples #9064 (EduardAkhmetshin)
- Add examples for min and max functions #9062 (EduardAkhmetshin)
- Improve arrow-buffer documentation #9020 [arrow] (alamb)
- Move examples in arrow-csv to docstrings, polish up docs #9001 [arrow] (alamb)
- Add example of parsing field names as VariantPath #8945 (alamb)
- Improve documentation for `prep_null...
arrow 57.1.0
Changelog
57.1.0 (2025-11-20)
Implemented enhancements:
- Eliminate bound checks in filter kernels #8865 [arrow]
- Respect page index policy option for ParquetObjectReader when it's not skip #8856 [parquet]
- Speed up collect_bool and remove `unsafe` #8848 [arrow]
- Error reading parquet FileMetaData with empty lists encoded as element-type=0 #8826 [parquet]
- ValueStatistics methods can't be used from generic context in external crate #8823 [parquet]
- Custom Pretty-Printing Implementation for Column when Formatting Record Batches #8821 [arrow]
- Parquet-concat: supports bloom filter and page index #8804 [parquet]
- [Parquet] virtual row group number support #8800
- [Variant] Enforce shredded-type validation in `shred_variant` #8795 [arrow]
- Simplify decision logic to call `FilterBuilder::optimize` or not #8781 [arrow]
- [Variant] Add variant to arrow for DataType::{Binary, LargeBinary, BinaryView} #8767 [arrow]
- Provide algorithm that allows zipping arrays whose values are not prealigned #8752 [arrow]
- [Parquet] ParquetMetadataReader decodes too much metadata under point-get scenario #8751 [parquet]
- `arrow-json` supports encoding binary arrays, but not decoding #8736 [arrow]
- Allow `FilterPredicate` instances to be reused for RecordBatches #8692 [arrow]
- ArrowJsonBatch::from_batch is incomplete #8684 [arrow]
- parquet-layout: More info about layout including footer size, page index, bloom filter? #8682 [parquet]
- Rewrite `ParquetRecordBatchStream` (async API) in terms of the PushDecoder #8677 [parquet]
- [JSON] Add encoding for binary view #8674 [arrow]
- Refactor arrow-cast decimal casting to unify the rescale logic used in Parquet variant casts #8670 [arrow]
- [Variant] Support Uuid/`FixedSizeBinary(16)` shredding #8665
- [Parquet] There should be an encoding counter to know how many encodings the repo supports in total #8662 [parquet]
- Improve `parse_data_type` for `List`, `ListView`, `LargeList`, `LargeListView`, `FixedSizeList`, `Union`, `Map`, `RunEndCoded`. #8648 [arrow]
- [Variant] Support variant to arrow primitive support null/time/decimal_* #8637
- Return error from `RleDecoder::reset` rather than panic #8632 [parquet]
- Add bitwise ops on `BooleanBufferBuilder` and `MutableBuffer` that mutate directly the buffer #8618 [arrow]
- [Variant] Add variant_to_arrow Utf-8, LargeUtf8, Utf8View types support #8567 [arrow]
Fixed bugs:
- Regression: Parsing `List(Int64)` results in nullable list in 57.0.0 and a non-nullable list in 57.1.0 #8883
- Regression: FixedSizeList data type parsing fails on 57.1.0 #8880
- (dyn ArrayFormatterFactory + 'static) can't be safely shared between threads #8875
- RowNumber reader has wrong row group ordering #8864 [parquet]
- `ThriftMetadataWriter::write_column_indexes` cannot handle a `ColumnIndexMetaData::NONE` #8815 [parquet]
- "Archery test With other arrows" Integration test failing on main: #8813 [arrow]
- [Parquet] Writing in 57.0.0 seems 10% slower than 56.0.0 #8783 [parquet]
- Parquet reader cannot handle files with unknown logical types #8776 [parquet]
- zip now treats nulls as false in provided mask regardless of the underlying bit value #8721 [arrow]
- [avro] Incorrect version in crate.io landing page #8691 [arrow]
- Array: ViewType gc() has bug when array sum length exceed i32::MAX #8681 [arrow]
- Parquet 56: encounter `error: item_reader def levels are None` when reading nested field with row filter #8657 [parquet]
- Degenerate and non-nullable `FixedSizeListArray`s are not handled #8623 [arrow]
- [Parquet] Performance Degradation with RowFilter on Unsorted Columns due to Fragmented ReadPlan #8565 [parquet]
Documentation updates:
- docs: Add example for creating a `MutableBuffer` from `Buffer` #8853 [arrow] (alamb)
- docs: Add examples for creating MutableBuffer from Vec #8852 [arrow] (alamb)
- Improve ParquetDecoder docs #8802 [parquet] (alamb)
- Update docs for zero copy conversion of ScalarBuffer #8772 [arrow] (alamb)
- Add example to convert `PrimitiveArray` to a `Vec` #8771 [arrow] (alamb)
- docs: Add links for arrow-avro #8770 [arrow] (alamb)
- [Parquet] Minor: Update comments in page decompressor #8764 [parquet] (alamb)
- Document limitations of the `arrow_integratio...
arrow 57.0.0
Changelog
57.0.0 (2025-10-19)
Breaking changes:
- Use `Arc<FileEncryptionProperties>` everywhere to be consistent with `FileDecryptionProperties` #8626 [parquet] (alamb)
- feat: Improve DataType display for `RunEndEncoded` #8596 [arrow] (Weijun-H)
- Add `ArrowError::AvroError`, remaining types and roundtrip tests to `arrow-avro`, #8595 [arrow] (jecsand838)
- [thrift-remodel] Refactor Thrift encryption and store encodings as bitmask #8587 [parquet] (etseidl)
- feat: Enhance `Map` display formatting in DataType #8570 [arrow] (Weijun-H)
- feat: Enhance DataType display formatting for `ListView` and `LargeListView` variants #8569 [arrow] (Weijun-H)
- Use custom thrift parser for parquet metadata (phase 1 of Thrift remodel) #8530 [parquet] (etseidl)
- refactor: improve display formatting for Union #8529 [arrow] (Weijun-H)
- Use `Arc<FileDecryptionProperties>` to reduce size of ParquetMetadata and avoid copying when `encryption` is enabled #8470 [parquet] (alamb)
- Fix for column name based projection mask creation #8447 [parquet] (etseidl)
- Improve Display formatting of DataType::Timestamp #8425 [parquet] [arrow] (emilk)
- Use more compact Debug formatting of Field #8424 [arrow] (emilk)
- Reuse zstd compression context when writing IPC #8405 [arrow] [arrow-flight] (albertlockett)
- [Decimal] Add scale argument to validation functions to ensure accurate error logging #8396 [arrow] (Weijun-H)
- Quote `DataType::Struct` field names in `Display` formatting #8291 [parquet] [arrow] (emilk)
- Improve `Display` for `DataType` and `Field` #8290 [parquet] [arrow] (emilk)
- Bump pyo3 to 0.26.0 #8286 (mbrobbel)
Implemented enhancements:
- Added Avro support (new `arrow-avro` crate) #4886
- parquet-rewrite: supports compression level and write batch size #8639
- Error not panic when int96 statistics aren't size 12 #8614 [parquet]
- [Variant] Make `VariantArray` iterable #8612
- [Variant] impl `PartialEq` for `VariantArray` #8610
- [Variant] Remove potential panics when probing `VariantArray` #8609
- [Variant] Remove ceremony of going from list of `Variant` to `VariantArray` #8606
- Eliminate redundant validation in `RecordBatch::project` #8591 [arrow]
- [PARQUET][BENCH] Arrow writer bench with compression and/or page v2 #8559 [parquet]
- [Variant] casting functions are confusingly named #8531 [parquet]
- Support writing GeospatialStatistics in Parquet writer #8523 [parquet]
- [thrift-remodel] Optimize `convert_row_groups` #8517 [parquet]
- [Variant] Add variant to arrow primitive support for boolean/timestamp/time #8515
- Test `thrift-remodel` branch with DataFusion #8513 [parquet]
- Make `UnionArray::is_dense` Method Public #8503 [arrow]
- Add `append_n` method to `FixedSizeBinaryDictionaryBuilder` #8497 [arrow]
- [Parquet] Reduce size of ParquetMetadata when encryption feature is enabled #8469 [parquet]
- [Parquet] Remove useless mut requirements in getting bloom filter function #8461 [parquet]
- Change `serde` dependency to `serde_core` where applicable #8451 [arrow]
- [Parquet] Split `ParquetMetadataReader` into IO/decoder state machine and thrift parsing #8439 [parquet]
- Remove compiler warning for redundant config enablement #8412 [arrow]
- Add geospatial statistics creation support for GEOMETRY/GEOGRAPHY Parquet logical types #8411 [arrow]
- `arrow_json` lacks `with_timestamp_format` functions like `arrow_csv` had offered #8398 [arrow]
- Unify API for writing column chunks / row groups in parallel #8389 [parquet]
- Reuse zstd context in arrow IPC writer #8386 [arrow] [arrow-flight]
- [Variant] Support reading/writing Parquet Variant LogicalType #8370 [parquet]
- [Variant] Implement a `shred_variant` function #8361
- [Parquet] Expose ReadPlan and ReadPlanBuilder #8347 [parquet]
- [Variant] [Shredding] Support typed_access for `List` #8337 [parquet]
- [Variant] [Shredding] Support typed_access for `Struct` #8336 [[parquet](https://github.com/apache/arrow-rs/labels/parque...