[fix](csv) escape quote with double quote for csv format table #50101
Merged
morningman merged 17 commits into apache:master on Jun 4, 2025
run buildall
TPC-H: Total hot run time: 34145 ms
TPC-DS: Total hot run time: 192379 ms
ClickBench: Total hot run time: 29.5 s
BE UT Coverage Report: Increment line coverage, Increment coverage report
run buildall
TPC-H: Total hot run time: 34185 ms
TPC-DS: Total hot run time: 191935 ms
ClickBench: Total hot run time: 29.78 s
BE UT Coverage Report: Increment line coverage, Increment coverage report
morningman pushed a commit that referenced this pull request on Apr 18, 2025
a3537ff to 5daf795
run buildall
TPC-H: Total hot run time: 34096 ms
TPC-DS: Total hot run time: 193103 ms
ClickBench: Total hot run time: 29.76 s
run buildall
TPC-H: Total hot run time: 34089 ms
TPC-DS: Total hot run time: 185219 ms
ClickBench: Total hot run time: 29.21 s
BE UT Coverage Report: Increment line coverage, Increment coverage report
BE Regression P0 && UT Coverage Report: Increment line coverage, Increment coverage report
66de029 to dacd880
run buildall
TPC-H: Total hot run time: 33879 ms
PR approved by anyone and no changes requested.
BePPPower approved these changes on Jun 4, 2025
wuwenchi approved these changes on Jun 4, 2025
hubgeter approved these changes on Jun 4, 2025
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request on Jun 27, 2025
…e#50101) Problem Summary: According to the CSV standard format definition, quote characters inside a string should be escaped using a pair of quote characters. However, the current implementation does not handle this case correctly, which may lead to incorrect parsing results when the input string contains quote characters.
dataroaring pushed a commit that referenced this pull request on Jun 28, 2025
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request on Jun 30, 2025
koarz pushed a commit to koarz/doris that referenced this pull request on Jul 3, 2025
morningman pushed a commit that referenced this pull request on Nov 6, 2025

### What problem does this PR solve?

Fix wrong result when escape is the same as enclose, introduced by #50101

data:
```
50,"{""a"": 1}"
60,"{""a"": 2}"
```
query:
```
select * from local(
    "backend_id" = "1760087225568",
    "file_path" = "test.csv",
    "format" = "csv",
    "column_separator" = ",",
    "enclose" = "\"",
    "escape" = "\""
);
```
expectation:
```
+------+------------+
| k1   | k2         |
+------+------------+
| 50   | {"a": 1}   |
| 60   | {"a": 2}   |
+------+------------+
```
real:
```
+------+------------------------+
| k1   | k2                     |
+------+------------------------+
| 50   | {"a": 1} 60,{"a": 2}   |
+------+------------------------+
```
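As an independent cross-check of the expected result above, the same two records can be run through Python's standard `csv` module, which treats a doubled quote character inside a quoted field as one literal quote when `doublequote=True`. This is only a reference sketch: it does not exercise Doris's CSV reader, and the variable names are illustrative.

```python
import csv
import io

# The two records from the commit message above, one per line.
data = '50,"{""a"": 1}"\n60,"{""a"": 2}"\n'

# quotechar plays the role of "enclose"; doublequote=True means a doubled
# quote inside a quoted field is read as a single literal quote.
reader = csv.reader(io.StringIO(data), delimiter=",", quotechar='"', doublequote=True)
rows = list(reader)

for k1, k2 in rows:
    print(k1, k2)
```

Each record stays on its own row and the JSON text keeps its inner quotes, which matches the "expectation" table rather than the merged row shown under "real".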
github-actions bot pushed a commit that referenced this pull request on Nov 6, 2025
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request on Nov 18, 2025
…e#57632) Fix wrong result when escape same as enclose, introduced by apache#50101
What problem does this PR solve?
Problem Summary:
According to the standard CSV format definition, a quote character inside a quoted field is escaped by doubling it, that is, by writing it as a pair of quote characters. The current implementation does not handle this case, so input containing quote characters inside a field may be parsed incorrectly.
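The rule described above (RFC 4180 states it as escaping a double quote by preceding it with another double quote) can be sketched as a small single-record splitter. This is a minimal illustration under stated assumptions, not the code changed in this PR: `split_csv_line` and its parameters are hypothetical names, and it handles one record with a single-character separator and quote.

```python
def split_csv_line(line: str, sep: str = ",", quote: str = '"') -> list[str]:
    """Split one CSV record; a doubled quote inside a quoted field is one literal quote.

    Hypothetical helper for illustration only, not Doris's implementation.
    """
    fields: list[str] = []
    buf: list[str] = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < len(line) and line[i + 1] == quote:
                    buf.append(quote)  # "" inside a quoted field -> literal "
                    i += 2
                    continue
                in_quotes = False  # a lone quote closes the field
            else:
                buf.append(ch)  # separators are ordinary text while quoted
        elif ch == quote:
            in_quotes = True  # opening quote
        elif ch == sep:
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    fields.append("".join(buf))
    return fields


print(split_csv_line('50,"{""a"": 1}"'))
```

The key design point is the one-character lookahead: a quote seen inside a quoted field only closes the field when the next character is not another quote.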
Release note
None