branch-4.0: [feature](catalog) support varbinary type mapping in hive/iceberg/paimon table (#57821) #58482
Merged
yiguolei merged 3 commits into apache:branch-4.0 on Nov 28, 2025
…mon table (apache#57821)

Problem Summary: Support the varbinary type in Hive/Iceberg/Paimon tables. Binary columns can now be mapped to the Doris varbinary type directly instead of to string. The catalog property `enable.mapping.varbinary` controls this and defaults to false. TVFs such as HDFS have a parameter that controls the same behavior, and it also defaults to false.

1. When a Parquet column has physical type `tparquet::Type::BYTE_ARRAY` and neither a logicalType nor a converted_type, it is read into column_varbinary directly, so the physical and logical conversions are consistent. If a `tparquet::Type::BYTE_ARRAY` column has a logicalType set (e.g. String), it is read as column_string; if the table column was created as a binary column, VarBinaryConverter then converts column_string to column_varbinary.
2. When an ORC column is of binary type, it is also mapped to varbinary directly and can reuse StringVectorBatch.
3. Add casts between the string and varbinary types.
4. Map UUID to the binary type instead of string in Iceberg.
5. Change the signature `bool safe_cast_string(const char* startptr, size_t buffer_size, xxx)` to `safe_cast_string(const StringRef& str_ref, xxx)`.
6. Add `const` to the read_date_text_impl function.
7. Add tests that exercise varbinary through a Paimon catalog; more cases for Hive/Iceberg and a doc update will follow.
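The Parquet rule in item 1 can be sketched as a small decision function. This is only an illustration of the rule as described in the commit message, not the code in this PR: `PhysicalType`, `DorisType`, `ParquetColumnMeta`, `reader_type`, and `needs_varbinary_conversion` are hypothetical names, and the fallback to string when the mapping switch is off is an assumption based on the stated default.

```cpp
// Hypothetical, simplified stand-ins; these are not the real Doris or Parquet types.
enum class PhysicalType { BYTE_ARRAY, OTHER };
enum class DorisType { STRING, VARBINARY, OTHER };

struct ParquetColumnMeta {
    PhysicalType physical_type;
    bool has_logical_type;    // e.g. a String annotation
    bool has_converted_type;  // legacy annotation
};

// Type the file reader produces for a column.
// Assumption: with the mapping switch off, un-annotated BYTE_ARRAY is read as string.
DorisType reader_type(const ParquetColumnMeta& meta, bool enable_mapping_varbinary) {
    if (meta.physical_type != PhysicalType::BYTE_ARRAY) {
        return DorisType::OTHER;
    }
    const bool annotated = meta.has_logical_type || meta.has_converted_type;
    if (!annotated && enable_mapping_varbinary) {
        return DorisType::VARBINARY;  // read straight into a varbinary column
    }
    return DorisType::STRING;  // annotated (or mapping disabled): read as string
}

// True when the reader produced a string column but the table column is binary,
// which is the case the commit message assigns to VarBinaryConverter.
bool needs_varbinary_conversion(DorisType read_as, DorisType table_type) {
    return read_as == DorisType::STRING && table_type == DorisType::VARBINARY;
}
```

With the switch on, an un-annotated `BYTE_ARRAY` column yields `VARBINARY` and needs no further conversion, while a String-annotated column yields `STRING` and is converted only if the table declares the column as binary.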
```
mysql> show create table binary_demo3;
+--------------+--------------+
| Table        | Create Table |
+--------------+--------------+
| binary_demo3 | CREATE TABLE `binary_demo3` (
  `id` int NULL,
  `record_name` char(10) NULL,
  `vrecord_name` text NULL,
  `bin` varbinary(10) NULL,
  `varbin` varbinary(2147483647) NULL
) ENGINE=PAIMON_EXTERNAL_TABLE
LOCATION 'file:/mnt/disk2/zhangsida/test_paimon/demo.db/binary_demo3'
PROPERTIES (
"path" = "file:/mnt/disk2/zhangsida/test_paimon/demo.db/binary_demo3",
"primary-key" = "id"
); |
+--------------+--------------+
1 row in set (0.00 sec)

mysql> select *, length(record_name),length(vrecord_name),length(bin),length(varbin) from binary_demo3;
+------+-------------+--------------+------------------------+----------------+---------------------+----------------------+-------------+----------------+
| id   | record_name | vrecord_name | bin                    | varbin         | length(record_name) | length(vrecord_name) | length(bin) | length(varbin) |
+------+-------------+--------------+------------------------+----------------+---------------------+----------------------+-------------+----------------+
|    1 | AAAA        | AAAA         | 0xAAAA0000000000000000 | 0xAAAA         |                  10 |                    4 |          10 |              2 |
|    2 | 6161        | 6161         | 0x61610000000000000000 | 0x6161         |                  10 |                    4 |          10 |              2 |
|    3 | NULL        | NULL         | NULL                   | NULL           |                NULL |                 NULL |        NULL |           NULL |
+------+-------------+--------------+------------------------+----------------+---------------------+----------------------+-------------+----------------+
```

support varbinary type mapping in hive/iceberg/paimon table
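In the query output above, the `bin` column (`varbinary(10)`) shows 10 bytes with trailing zeros, while `varbin` shows only the 2 bytes that were stored. A minimal sketch of that difference follows, assuming the fixed-width value is right-padded with `0x00` bytes, which is what the output suggests but the PR does not state. `pad_fixed` and `to_hex_literal` are hypothetical helpers, not Doris functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Right-pad a value to a fixed width with 0x00 bytes.
// Assumption: this mirrors how the fixed-length source column stores short values.
std::vector<std::uint8_t> pad_fixed(std::vector<std::uint8_t> bytes, std::size_t width) {
    if (bytes.size() < width) {
        bytes.resize(width, 0x00);
    }
    return bytes;
}

// Render bytes as an uppercase hex literal with a 0x prefix, as in the output above.
std::string to_hex_literal(const std::vector<std::uint8_t>& bytes) {
    static const char kDigits[] = "0123456789ABCDEF";
    std::string out = "0x";
    for (std::uint8_t b : bytes) {
        out.push_back(kDigits[b >> 4]);
        out.push_back(kDigits[b & 0x0F]);
    }
    return out;
}
```

For the bytes `0xAA 0xAA`, `to_hex_literal(pad_fixed(value, 10))` gives `0xAAAA0000000000000000` (10 bytes), and `to_hex_literal(value)` gives `0xAAAA` (2 bytes), matching row 1.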
Contributor
Author
|
run buildall |
Contributor
Author
|
run buildall |
yiguolei
approved these changes
Nov 28, 2025
Contributor
|
PR approved by at least one committer and no changes requested. |
Contributor
|
PR approved by anyone and no changes requested. |
What problem does this PR solve?
Problem Summary:
cherry-pick from (#57821)
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)