branch-3.1:[enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding (#57208) #57614

hubgeter · 2025-11-03T03:15:37Z

Backport of #57208.
Problem Summary:
When decoding RLE_DICTIONARY-encoded data, the parquet reader uses memcpy for every value, regardless of type. However, for fixed-width types such as INT32 and INT64, direct assignment is faster than memcpy.

In Parquet dictionary encoding, the values to copy are not stored contiguously: each decoded index refers to a separate dictionary entry, so every memcpy call copies only a few bytes. Looking at the implementation of memcpy, copies this small are handled by __builtin_memcpy, which essentially behaves like a series of simple assignments. The corresponding assembly can be seen here: https://godbolt.org/z/r9Ma1ozvd.
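The difference can be sketched roughly as follows. This is only an illustration, not the code changed in this PR: the function names `decode_dict_memcpy` and `decode_dict_assign`, the `type_length` parameter, and the use of `std::vector<uint32_t>` for the decoded indices are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical generic path: the element width is only known at run time,
// so every dictionary entry is copied with a tiny variable-length memcpy.
inline void decode_dict_memcpy(const uint8_t* dict, std::size_t type_length,
                               const std::vector<uint32_t>& indices, uint8_t* out) {
    for (std::size_t i = 0; i < indices.size(); ++i) {
        std::memcpy(out + i * type_length, dict + indices[i] * type_length, type_length);
    }
}

// Hypothetical typed path: the element width is fixed by T at compile time,
// so every dictionary entry is copied with a plain assignment.
template <typename T>
inline void decode_dict_assign(const T* dict, const std::vector<uint32_t>& indices, T* out) {
    for (std::size_t i = 0; i < indices.size(); ++i) {
        out[i] = dict[indices[i]];
    }
}
```

Both functions produce the same output; the typed version simply gives the compiler a fixed element size to work with, which is the property the description above relies on for INT32 and INT64.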

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

hello-stephen · 2025-11-03T03:15:44Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

hubgeter · 2025-11-03T03:15:57Z

run buildall

hubgeter requested a review from morrySnow as a code owner November 3, 2025 03:15

hubgeter changed the title ~~[enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding (#57208)~~ branch-3.1:[enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding (#57208) Nov 3, 2025

morrySnow approved these changes Nov 4, 2025

morningman merged commit 873d39e into apache:branch-3.1 Nov 4, 2025
22 of 23 checks passed

morrySnow mentioned this pull request Nov 13, 2025

3.1.3 Release Notes #57980

Closed
