[Python] pyarrow.concat_arrays segfaults if a resulting StringArray's capacity overflows #26180

@asfimport

Description

I'm sorry if this was already reported, but there's an overflow issue when concatenating large arrays:

In [1]: import pyarrow as pa

In [2]: str_array = pa.array(['a' * 128] * 10**8)

In [3]: large_array = pa.concat_arrays([str_array] * 50)
Segmentation fault (core dumped)

I suppose this should be handled by upcasting the result to large_string.
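For a rough sense of scale, here is a minimal sketch of the arithmetic behind the reproduction above, assuming (per the Arrow columnar format) that a string array stores signed 32-bit offsets while large_string stores 64-bit ones. The helper names `total_string_bytes`, `fits_int32_offsets` and `concat_as_large_string` are hypothetical and not part of pyarrow. The last helper only illustrates the upcast idea as a possible workaround; it is untested at this data size and is not what pa.concat_arrays does itself.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit offset can hold


def total_string_bytes(value_len, values_per_array, num_arrays):
    """Bytes of character data in the concatenated result."""
    return value_len * values_per_array * num_arrays


def fits_int32_offsets(total_bytes):
    """True if the final offset still fits in a signed 32-bit integer."""
    return total_bytes <= INT32_MAX


def concat_as_large_string(arrays):
    """Hypothetical workaround: cast every input to large_string, then concatenate."""
    import pyarrow as pa  # imported lazily so the arithmetic runs without pyarrow

    return pa.concat_arrays([arr.cast(pa.large_string()) for arr in arrays])


# Sizes taken from the reproduction: 128-byte strings, 10**8 per array, 50 arrays.
total = total_string_bytes(128, 10**8, 50)
print(total, fits_int32_offsets(total))
```

With these sizes the concatenated character data is far larger than a 32-bit offset can address, which is consistent with the overflow described above.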

Reporter: Artem KOZHEVNIKOV / @artemru

Note: This issue was originally created as ARROW-10172. Please see the migration documentation for further details.
