Skip to content

[branch-50] Fix ambiguous column names in substrait conversion #17299#18077

Merged
alamb merged 2 commits into
apache:branch-50from
hareshkh:hareshkh/ambiguous-col-ref-fix
Oct 19, 2025
Merged

[branch-50] Fix ambiguous column names in substrait conversion #17299#18077
alamb merged 2 commits into
apache:branch-50from
hareshkh:hareshkh/ambiguous-col-ref-fix

Conversation

@hareshkh

@hareshkh hareshkh commented Oct 15, 2025

Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

Fix ambiguous column names in substrait conversion as a result of literals before and after a join being assigned the same name.

More information in the issue, but say you have a NULL literal before a join called "column1" and then you create a new NULL column after the join called column2, you would get an ambiguous column name errors like.

Error: SchemaError(AmbiguousReference { field: Column { relation: Some(Bare { table: "left" }), name: "UTF8(NULL)" } }, Some(""))

What changes are included in this PR?

Simply alias all literals as they're converted to have a UUID name.

Are these changes tested?

Yes.
Tested by using substrait-java with this query

./isthmus-cli/build/graal/isthmus --create "create table A (a int); create table B (a int, c int); create table C (a int, d int)" "select t.*, C.d, CAST(NULL AS VARCHAR) as e from (select a, CAST(NULL AS VARCHAR) as c from A UNION ALL select a, c from B) t LEFT JOIN C ON t.a = C.a"

Are there any user-facing changes?

…erals having the same name during conversion. (apache#17299)

* Fix ambigious column names in substrate conversion as a result of literals having the same names

* move it to the project

* do it only for projects

* comment

* fmt

* comments

---------

Co-authored-by: Xander Bailey <xbailey@palantir.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@github-actions github-actions Bot added the substrait Changes to the substrait crate label Oct 15, 2025
@alamb alamb changed the title [Branch-50] Fix ambiguous column names in substrait conversion #17299 [branch-50] Fix ambiguous column names in substrait conversion #17299 Oct 15, 2025

@alamb alamb left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @hareshkh

@hareshkh

Copy link
Copy Markdown
Contributor Author

@alamb - Can you please rerun the CI again? The cargo examples task seems to be stuck in the latest run.

@alamb

alamb commented Oct 17, 2025

Copy link
Copy Markdown
Contributor

@alamb - Can you please rerun the CI again? The cargo examples task seems to be stuck in the latest run.

I did and sadly it seems the tests are still failing:

I can try and look at them in the next day or two if no one beats me to it

@alamb

alamb commented Oct 19, 2025

Copy link
Copy Markdown
Contributor

@comphead fixed the tests, so merging up to get those fixes

@alamb alamb merged commit e3f8e37 into apache:branch-50 Oct 19, 2025
30 checks passed
@alamb

alamb commented Oct 19, 2025

Copy link
Copy Markdown
Contributor

Thanks again @hareshkh

LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Oct 20, 2025
…#17299 (apache#18077)

## Which issue does this PR close?

- Related to apache#18072
- Related to apache#17294
- Backports apache#17299 to branch-50

## Rationale for this change

Fix ambiguous column names in substrait conversion as a result of
literals before and after a join being assigned the same name.

More information in the issue, but say you have a NULL literal before a
join called "column1" and then you create a new NULL column after the
join called column2, you would get an ambiguous column name errors like.

```
Error: SchemaError(AmbiguousReference { field: Column { relation: Some(Bare { table: "left" }), name: "UTF8(NULL)" } }, Some(""))
```

## What changes are included in this PR?

Simply alias all literals as they're converted to have a UUID name.

## Are these changes tested?

Yes.
Tested by using substrait-java with this query

```
./isthmus-cli/build/graal/isthmus --create "create table A (a int); create table B (a int, c int); create table C (a int, d int)" "select t.*, C.d, CAST(NULL AS VARCHAR) as e from (select a, CAST(NULL AS VARCHAR) as c from A UNION ALL select a, c from B) t LEFT JOIN C ON t.a = C.a"
```

## Are there any user-facing changes?

Co-authored-by: Xander <zander181@googlemail.com>
Co-authored-by: Xander Bailey <xbailey@palantir.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit e3f8e37)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

substrait Changes to the substrait crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants