[feature](hive) support hive catalog read json table. (#43469) #44848
Merged
morningman merged 1 commit into apache:branch-3.0 on Dec 4, 2024
Problem Summary:
Support reading Hive tables stored in JSON format, for example:
```mysql
mysql> show create table basic_json_table;
CREATE TABLE `basic_json_table`(
`id` int,
`name` string,
`age` tinyint,
`salary` float,
`is_active` boolean,
`join_date` date,
`last_login` timestamp,
`height` double,
`profile` binary,
`rating` decimal(10,2))
ROW FORMAT SERDE
'org.apache.hive.hcatalog.data.JsonSerDe'
```
Behavior changed:
To implement this feature, this PR modifies `new_json_reader`.
Previously, `new_json_reader` could only insert data into string columns
(`ColumnString`). To support inserting data into columns of other types,
`DataTypeSerDe` is introduced to insert data into columns. To maintain
compatibility with previous versions, the changes in this PR take effect
only when reading Hive JSON tables.
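The difference between the two paths can be sketched roughly as follows. This is a minimal Python illustration, not the Doris C++ code: `CONVERTERS` and `insert_value` are hypothetical names, and the per-type converters only stand in for what `DataTypeSerDe` does for each column type.

```python
import json
from datetime import date
from decimal import Decimal

# Hypothetical per-type converters standing in for type-specific SerDe logic.
CONVERTERS = {
    "int": int,
    "tinyint": int,
    "string": str,
    "float": float,
    "double": float,
    "boolean": bool,
    "date": date.fromisoformat,
    "decimal": lambda v: Decimal(str(v)),
}


def insert_value(column, type_name, raw, typed):
    """Append one parsed JSON value to a column (modelled as a list).

    typed=False mimics the old behavior: every value is stored as a string.
    typed=True mimics the Hive JSON path: the value is converted according
    to the declared column type.
    """
    if typed:
        column.append(CONVERTERS[type_name](raw))
    else:
        column.append(raw if isinstance(raw, str) else json.dumps(raw))


legacy, hive = [], []
for type_name, raw in [("int", 1), ("boolean", True), ("date", "2024-12-04")]:
    insert_value(legacy, type_name, raw, typed=False)
    insert_value(hive, type_name, raw, typed=True)

print(legacy)  # every value is a string
print(hive)    # values carry their declared types
```

Keeping the string-only branch behind a flag mirrors the compatibility note above: existing readers keep their behavior, and only the Hive JSON path uses typed insertion.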
Limitations of use:
1. Currently, only queries are supported; writing is not supported.
2. Currently, only the `ROW FORMAT SERDE
'org.apache.hive.hcatalog.data.JsonSerDe';` scenario is supported. Some
properties specified in `with serdeproperties` do not take effect in
Doris.
3. Hive does not allow column names that differ only by case when
creating a table in JSON format (including inside a Struct). Doris
therefore converts the field names in the JSON data to lowercase when
reading the JSON data file and matches them against the lowercase column
names. If several field names in the data become identical after
lowercasing, the value of the last such field is used (consistent with
Hive behavior).
Example:
```
create table json_table(
column int
)ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
a.json:
{"column":1,"COLumn":2,"COLUMN":3}
{"column":10,"COLumn":20}
{"column":100}
In Hive: load a.json into table json_table
Query result in Doris:
---
3
20
100
---
```
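The lowercase matching rule in limitation 3 can be reproduced with a short Python sketch. It is only an illustration of the rule, not the Doris reader: `lowercase_last_wins` and `read_column` are hypothetical helpers, and returning `None` for a missing field is an assumption of this sketch.

```python
import json


def lowercase_last_wins(pairs):
    """Build a dict with lowercased keys; later duplicates overwrite earlier ones."""
    row = {}
    for key, value in pairs:
        row[key.lower()] = value
    return row


def read_column(lines, column_name):
    """Return the value of one column for every JSON line."""
    wanted = column_name.lower()
    values = []
    for line in lines:
        # object_pairs_hook is called for every JSON object, including nested
        # ones, so field names inside structs are lowercased as well.
        row = json.loads(line, object_pairs_hook=lowercase_last_wins)
        values.append(row.get(wanted))  # None if the field is absent (assumption)
    return values


a_json = [
    '{"column":1,"COLumn":2,"COLUMN":3}',
    '{"column":10,"COLumn":20}',
    '{"column":100}',
]
print(read_column(a_json, "column"))  # [3, 20, 100]
```

The result matches the query output listed in the example: for each line, the last field whose lowercased name equals `column` supplies the value.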
Todo(in next pr):
Merge `serde` and `json_reader`, because they have logical conflicts.
Hive catalog support read json format table.
cd62ee5 to a21926b
run buildall
bp #43469