
[fix](outfile) fix analysis error when specifying parquet schema #57232

Merged
morningman merged 1 commit into apache:master from morningman:outfile_parquet2
Oct 22, 2025

Conversation

@morningman
Contributor

@morningman morningman commented Oct 22, 2025

What problem does this PR solve?

Related PR: #33016

Introduced by #33016: when the "schema" property is specified in an OUTFILE clause with the Parquet format,
the statement returns the error:

Parquet schema number does not equal to select item number

This happens because OutfileClause is wrongly analyzed twice.
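
To illustrate how analyzing a clause twice can lead to a count mismatch like the one above, here is a minimal Python sketch. It is not the Doris code (the FE is written in Java), and the PR text does not describe the internal mechanism or the exact fix. The class `OutfileClauseSketch`, its methods, and the schema string are hypothetical, chosen only for illustration. The sketch assumes that analysis appends parsed schema columns to a list, so a second pass doubles the list and the count check fails.

```python
class OutfileClauseSketch:
    """Hypothetical stand-in for an outfile clause; NOT the real Doris class."""

    ERROR = "Parquet schema number does not equal to select item number"

    def __init__(self, schema_property: str):
        self.schema_property = schema_property
        self.parquet_schema = []  # filled in by analysis (assumed mutable state)
        self.analyzed = False

    def _parse_schema(self) -> None:
        # Assumed format for illustration: "repetition,type,name;..."
        for column in self.schema_property.split(";"):
            column = column.strip()
            if column:
                self.parquet_schema.append(tuple(column.split(",")))

    def _check_count(self, select_item_count: int) -> None:
        if len(self.parquet_schema) != select_item_count:
            raise ValueError(self.ERROR)

    def analyze_unguarded(self, select_item_count: int) -> None:
        # Not idempotent: every call appends the schema columns again.
        self._parse_schema()
        self._check_count(select_item_count)

    def analyze_guarded(self, select_item_count: int) -> None:
        # One possible remedy (an assumption, not the PR's confirmed fix):
        # skip the work if the clause has already been analyzed.
        if self.analyzed:
            return
        self._parse_schema()
        self._check_count(select_item_count)
        self.analyzed = True


if __name__ == "__main__":
    schema = "required,int32,c1;required,byte_array,c2"

    clause = OutfileClauseSketch(schema)
    clause.analyze_unguarded(2)          # first pass succeeds
    try:
        clause.analyze_unguarded(2)      # second pass sees 4 columns vs 2 items
    except ValueError as err:
        print("second analysis failed:", err)

    guarded = OutfileClauseSketch(schema)
    guarded.analyze_guarded(2)
    guarded.analyze_guarded(2)           # no-op on the second call
    print("guarded schema length:", len(guarded.parquet_schema))
```

The guard flag is only one way to make analysis safe to repeat; removing the redundant call at the call site would work equally well in this sketch.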

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Oct 22, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman
Contributor Author

run buildall

@doris-robot

TPC-DS: Total hot run time: 189827 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0dbf100383b0c280b2fa5ef28806cc3067f87851, data reload: false

query1	1056	446	413	413
query2	6624	1721	1651	1651
query3	6758	235	222	222
query4	26413	23906	23362	23362
query5	4849	627	483	483
query6	363	245	240	240
query7	4647	491	308	308
query8	307	279	284	279
query9	8739	2571	2582	2571
query10	506	338	275	275
query11	15291	15017	14749	14749
query12	192	113	116	113
query13	1679	539	437	437
query14	11270	9212	9449	9212
query15	217	184	183	183
query16	7677	662	513	513
query17	1270	768	609	609
query18	2028	426	355	355
query19	223	211	191	191
query20	147	143	130	130
query21	228	145	126	126
query22	4691	4642	4588	4588
query23	35145	33897	33868	33868
query24	8610	2441	2486	2441
query25	583	551	554	551
query26	1390	294	163	163
query27	2914	526	360	360
query28	4812	2256	2196	2196
query29	891	626	576	576
query30	318	240	207	207
query31	960	853	800	800
query32	84	73	74	73
query33	622	372	357	357
query34	849	856	590	590
query35	976	885	784	784
query36	985	1008	920	920
query37	129	124	93	93
query38	3557	3502	3456	3456
query39	1458	1407	1403	1403
query40	220	125	117	117
query41	60	60	60	60
query42	119	111	111	111
query43	488	489	459	459
query44	1228	744	743	743
query45	188	177	174	174
query46	886	985	634	634
query47	1762	1796	1692	1692
query48	400	427	326	326
query49	773	496	416	416
query50	643	688	408	408
query51	3881	3965	3914	3914
query52	105	113	108	108
query53	238	261	202	202
query54	598	581	538	538
query55	94	87	82	82
query56	329	302	314	302
query57	1219	1205	1122	1122
query58	294	286	268	268
query59	2528	2613	2529	2529
query60	340	335	322	322
query61	155	147	147	147
query62	795	732	679	679
query63	227	190	192	190
query64	4439	1181	914	914
query65	4021	3972	3974	3972
query66	1088	435	343	343
query67	15204	14881	15009	14881
query68	8231	869	596	596
query69	505	345	299	299
query70	1351	1234	1246	1234
query71	432	338	305	305
query72	5746	4927	4873	4873
query73	633	585	353	353
query74	8885	8988	8926	8926
query75	3318	3388	2802	2802
query76	3296	1157	730	730
query77	517	387	331	331
query78	9383	9923	8794	8794
query79	2748	791	585	585
query80	687	563	493	493
query81	515	267	293	267
query82	218	161	129	129
query83	271	277	240	240
query84	260	109	99	99
query85	893	461	421	421
query86	379	311	293	293
query87	3758	3763	3679	3679
query88	4147	2232	2225	2225
query89	396	318	297	297
query90	2017	218	214	214
query91	168	165	131	131
query92	82	66	68	66
query93	2397	966	632	632
query94	694	455	338	338
query95	405	317	315	315
query96	492	589	283	283
query97	2886	3050	2872	2872
query98	253	218	213	213
query99	1343	1417	1291	1291
Total cold run time: 279086 ms
Total hot run time: 189827 ms

@doris-robot

ClickBench: Total hot run time: 27.87 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0dbf100383b0c280b2fa5ef28806cc3067f87851, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.05
query3	0.26	0.08	0.08
query4	1.60	0.11	0.12
query5	0.28	0.26	0.25
query6	1.20	0.66	0.65
query7	0.03	0.03	0.03
query8	0.06	0.04	0.05
query9	0.64	0.53	0.52
query10	0.59	0.59	0.58
query11	0.16	0.14	0.11
query12	0.15	0.13	0.13
query13	0.62	0.60	0.60
query14	1.01	1.02	1.03
query15	0.85	0.83	0.87
query16	0.38	0.40	0.40
query17	1.06	1.01	1.01
query18	0.21	0.20	0.20
query19	1.86	1.86	1.78
query20	0.02	0.01	0.01
query21	15.46	0.18	0.13
query22	5.10	0.07	0.05
query23	15.65	0.26	0.11
query24	2.78	1.11	0.69
query25	0.09	0.08	0.06
query26	0.14	0.14	0.14
query27	0.07	0.05	0.05
query28	5.76	1.15	0.93
query29	12.59	4.08	3.27
query30	0.28	0.13	0.12
query31	2.82	0.60	0.39
query32	3.23	0.54	0.48
query33	3.13	3.13	3.09
query34	15.84	5.13	4.51
query35	4.59	4.54	4.61
query36	0.67	0.51	0.50
query37	0.10	0.06	0.07
query38	0.06	0.05	0.04
query39	0.03	0.03	0.03
query40	0.18	0.14	0.14
query41	0.08	0.04	0.03
query42	0.04	0.03	0.03
query43	0.05	0.03	0.03
Total cold run time: 99.86 s
Total hot run time: 27.87 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (7/7) 🎉
Increment coverage report
Complete coverage report

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Oct 22, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit 1553bc6 into apache:master Oct 22, 2025
35 of 36 checks passed
dwdwqfwe pushed a commit to dwdwqfwe/doris that referenced this pull request Oct 24, 2025
…che#57232)

### What problem does this PR solve?

Related PR: apache#33016

Introduced from apache#33016, when specify the "schema" property in outfile
clause with parquet format,
it will return error:
```
Parquet schema number does not equal to select item number
```

This is because we wrongly analyze `OutfileClause` twice.
github-actions bot pushed a commit that referenced this pull request Oct 30, 2025
)

### What problem does this PR solve?

Related PR: #33016

Introduced from #33016, when specify the "schema" property in outfile
clause with parquet format,
it will return error:
```
Parquet schema number does not equal to select item number
```

This is because we wrongly analyze `OutfileClause` twice.
morningman added a commit to morningman/doris that referenced this pull request Oct 30, 2025
…che#57232)

Related PR: apache#33016

Introduced from apache#33016, when specify the "schema" property in outfile
clause with parquet format,
it will return error:
```
Parquet schema number does not equal to select item number
```

This is because we wrongly analyze `OutfileClause` twice.
yiguolei pushed a commit that referenced this pull request Oct 31, 2025
… schema #57232 (#57493)

Cherry-picked from #57232

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>