Skip to content

[arrow-string]: add like::eq_ascii_ignore_case kernel#9871

Merged
alamb merged 2 commits into
apache:mainfrom
albertlockett:albert/arrow-string-eq-ignore-ascii-case
May 6, 2026
Merged

[arrow-string]: add like::eq_ascii_ignore_case kernel#9871
alamb merged 2 commits into
apache:mainfrom
albertlockett:albert/arrow-string-eq-ignore-ascii-case

Conversation

@albertlockett
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

Users might wish to have a mechanism to do check if two strings are equal in a case-insensitive way. We already have some machinery to do this in the arrow-string crate (in the predicate module). This PR exposes a function so it can be invoked easily.

What changes are included in this PR?

Adds a function like::eq_ascii_ignore_case to the arrow-string crate that performs a case-insensitive equals check on two string arrays.

Are these changes tested?

yes unit tests

Are there any user-facing changes?

Yes this function is public

@github-actions github-actions Bot added the arrow Changes to the arrow crate label May 1, 2026
Comment thread arrow-string/src/like.rs
/// let result = eq_ignore_ascii_case(&strings, &patterns).unwrap();
/// assert_eq!(result, BooleanArray::from(vec![true, true, true, false]));
/// ```
pub fn eq_ignore_ascii_case(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping like is the correct module to add this function?

I added it here because I wanted to reuse a lot of the machinery that was already in place to invoke the Predicate, but I was hesitating because this does not have an equivalent SQL function, unlike the other functions/Operator variants in this module.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine

The other potential option would be in the eq module of arrow-ord -- but that doesn't have string stuff, so I think this is actually better

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can change the module doc to be more general:

from

  //! Provide SQL's LIKE operators for Arrow's string arrays

to something like:

  //! String predicate kernels for Arrow arrays.
  //!
  //! Provides SQL `LIKE`/`ILIKE` kernels as well as related
  //! string predicates such as `contains`, `starts_with`, `ends_with`, and
  //! ASCII case-insensitive equality.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, updated in 1f3b03c

Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @albertlockett -- I have some comment / test suggestions but I also think we could do them as follow ons too

Comment thread arrow-string/src/like.rs
/// let result = eq_ignore_ascii_case(&strings, &patterns).unwrap();
/// assert_eq!(result, BooleanArray::from(vec![true, true, true, false]));
/// ```
pub fn eq_ignore_ascii_case(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine

The other potential option would be in the eq module of arrow-ord -- but that doesn't have string stuff, so I think this is actually better

Comment thread arrow-string/src/like.rs Outdated
like_op(Op::Contains, left, right)
}

/// Perform equality check on two byte arrays using an ASCII case-insensitive match.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can also be string / stringview / largestring arrays too I think

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. fixed in 1f3b03c

"byte array" probably isn't the right wording here. I removed the word "byte" to avoid confusion, and the rust doc also lists the types.

Unrelated question - wondering if e should support large string view in these functions as well?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated question - wondering if e should support large string view in these functions as well?

Eventually yes. That would be a good follow on PR / issue perhaps

Comment thread arrow-string/src/like.rs Outdated

test_utf8!(
test_utf8_array_eq_ignore_ascii_case,
vec!["arrow", "arrow", "arrow", "parquet", "parquet"],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add a substring test too -- like comparing arrow to arro and verify they don't match

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, added in 1f3b03c

Comment thread arrow-string/src/like.rs
/// let result = eq_ignore_ascii_case(&strings, &patterns).unwrap();
/// assert_eq!(result, BooleanArray::from(vec![true, true, true, false]));
/// ```
pub fn eq_ignore_ascii_case(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can change the module doc to be more general:

from

  //! Provide SQL's LIKE operators for Arrow's string arrays

to something like:

  //! String predicate kernels for Arrow arrays.
  //!
  //! Provides SQL `LIKE`/`ILIKE` kernels as well as related
  //! string predicates such as `contains`, `starts_with`, `ends_with`, and
  //! ASCII case-insensitive equality.

Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @albertlockett

@alamb alamb merged commit 5bf8912 into apache:main May 6, 2026
18 checks passed
Rich-T-kid pushed a commit to Rich-T-kid/arrow-rs that referenced this pull request Jun 2, 2026
# Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax.
-->

- Closes apache#9870

# Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Users might wish to have a mechanism to do check if two strings are
equal in a case-insensitive way. We already have some machinery to do
this in the arrow-string crate (in the `predicate` module). This PR
exposes a function so it can be invoked easily.

# What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Adds a function `like::eq_ascii_ignore_case` to the arrow-string crate
that performs a case-insensitive equals check on two string arrays.

# Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?

If this PR claims a performance improvement, please include evidence
such as benchmark results.
-->

yes unit tests


# Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.

If there are any breaking changes to public APIs, please call them out.
-->

Yes this function is public
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Expose eq ignore ascii case from arrow-string

2 participants