table.scan queries failing sometimes when result is empty #992

@jurossiar

Description

Apache Iceberg version

0.7.0 (latest release)

Please describe the bug 🐞

While testing the new functionality in pyiceberg 0.7.0, I found that some table.scan queries sometimes throw an error when the result is empty:

ResolveError: Field is required, and could not be found in the file: 1: table_id: required string

This doesn't happen with pyiceberg 0.6.0 (the version that we are currently using).

I was able to reproduce the issue consistently with the attached example. I suspect it affects tables with identifier fields, since those are the only cases where I have seen it.
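
For readers without the attachment, here is a minimal sketch of the kind of check involved. It is not taken from the attached notebook: the helper name `scan_row_count`, the catalog name, the table identifier, and the filter value are all hypothetical. It only assumes that `table.scan(row_filter=...)` accepts a filter string and that `.to_arrow()` materializes the result.

```python
def scan_row_count(table, row_filter):
    """Run a scan and return (row_count, error_message).

    `table` is expected to behave like a pyiceberg Table: it must expose
    scan(row_filter=...) returning an object with to_arrow().
    Exactly one element of the returned tuple is None.
    """
    try:
        arrow_table = table.scan(row_filter=row_filter).to_arrow()
    except Exception as exc:
        # In the report above, 0.7.0 fails here with a ResolveError
        # when the filter matches no rows.
        return None, f"{type(exc).__name__}: {exc}"
    return arrow_table.num_rows, None


# Hypothetical usage against a real catalog (names are placeholders):
#
#   from pyiceberg.catalog import load_catalog
#   catalog = load_catalog("default")
#   table = catalog.load_table("my_namespace.my_table")
#   print(scan_row_count(table, "table_id == 'no-such-id'"))
```

Running the same call in both environments should make the difference visible: an empty result would come back as `(0, None)`, while a failing scan would come back as `(None, "<ErrorType>: <message>")`.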

example.zip
Contains:
requirements.txt -> the 4 required dependencies.
example_scan_error.ipynb -> Jupyter notebook that shows the issue and lets you reproduce it.
.env.example -> properties to configure.

See the video of the same queries running in two conda environments, one with pyiceberg 0.7.0 and one with pyiceberg 0.6.0 (both created from the same requirements.txt, changing only the pyiceberg version to 0.6.0).
https://github.com/user-attachments/assets/9d21d247-e754-4830-9fe0-55ec4c960319
