[feat](nereids) add rewrite rule: EliminateGroupByKeyByUniform #43391
Merged
morrySnow merged 34 commits into apache:master on Dec 2, 2024
Contributor (Author): run buildall
Thank you for your contribution to Apache Doris. Since 2024-03-18, the documentation has been moved to doris-website.
morrySnow reviewed on Nov 7, 2024
    import java.util.Map;
    import java.util.Set;

    /**ProjectFilterTransform*/
Contributor: The class comment should describe what the rule does and how it does it.
Contributor (Author): run buildall
TPC-H: Total hot run time: 41389 ms
Contributor (Author): run buildall
TPC-H: Total hot run time: 41292 ms
e668096 to 452bb5f
Contributor (Author): run buildall
0c01e45 to 6333e6f
Contributor (Author): run buildall
6f13f10 to 892b7ec
Contributor (Author): run buildall
6b4a551 to 852065c
Contributor (Author): run buildall
b393ff3 to 0c470a6
Contributor (Author): run buildall
73c21af to c2bc25a
Contributor (Author): run buildall
cdb345c to c99e369
Contributor (Author): run buildall
1d2ad46 to d211163
Contributor (Author): run buildall
ClickBench: Total hot run time: 32.14 s
Contributor (Author): run p0
xzj7019 reviewed on Nov 28, 2024
    sql "insert into test1 values(1,1),(2,1),(3,1);"
    sql "create table test2(a int, b int) distributed by hash(a) properties('replication_num'='1');"
    sql "insert into test2 values(1,105),(2,105);"
    qt_full_join_uniform_should_not_eliminate_group_by_key "select t2.b,t1.b from test1 t1 full join (select * from test2 where b=105) t2 on t1.a=t2.a group by t2.b,t1.b order by 1,2;"
Contributor: Does the original code have a bug for this case?
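To see why the test name says the key should not be eliminated, here is a minimal Java sketch that simulates the query on the test data. It is not Doris code: the class and method names are hypothetical, and it assumes test1 has the same (a, b) column layout as test2, since the create statement for test1 is not shown here. The filter b=105 makes t2.b constant inside the subquery, but the full join pads unmatched test1 rows with NULL, so t2.b is no longer a single non-null value above the join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FullJoinUniformSketch {
    /**
     * Simulates: test1 t1 full join (select * from test2 where b=105) t2 on t1.a=t2.a.
     * Each table row is {a, b}. Returns (t2.b, t1.b) pairs, with null for a missing side.
     */
    static List<List<Integer>> fullJoin(int[][] test1, int[][] test2) {
        List<List<Integer>> joined = new ArrayList<>();
        Set<Integer> matchedRight = new HashSet<>();
        for (int[] left : test1) {
            boolean matched = false;
            for (int i = 0; i < test2.length; i++) {
                if (test2[i][1] == 105 && test2[i][0] == left[0]) {
                    joined.add(Arrays.<Integer>asList(test2[i][1], left[1]));
                    matchedRight.add(i);
                    matched = true;
                }
            }
            if (!matched) {
                // Unmatched left row: the right side is padded with NULL.
                joined.add(Arrays.<Integer>asList(null, left[1]));
            }
        }
        for (int i = 0; i < test2.length; i++) {
            if (test2[i][1] == 105 && !matchedRight.contains(i)) {
                // Unmatched right row: the left side is padded with NULL.
                joined.add(Arrays.<Integer>asList(test2[i][1], null));
            }
        }
        return joined;
    }

    /** Number of groups for "group by t2.b, t1.b". */
    static int groupCountByBothKeys(List<List<Integer>> rows) {
        return new HashSet<>(rows).size();
    }

    /** Number of groups if t2.b were (incorrectly) dropped from the group by keys. */
    static int groupCountByT1bOnly(List<List<Integer>> rows) {
        Set<Integer> keys = new HashSet<>();
        for (List<Integer> row : rows) {
            keys.add(row.get(1));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        int[][] test1 = {{1, 1}, {2, 1}, {3, 1}};
        int[][] test2 = {{1, 105}, {2, 105}};
        List<List<Integer>> rows = fullJoin(test1, test2);
        System.out.println("groups with both keys: " + groupCountByBothKeys(rows));
        System.out.println("groups with t1.b only: " + groupCountByT1bOnly(rows));
    }
}
```

On this data the two group counts differ, because the row with t1.a = 3 has no match in test2 and forms its own (NULL, 1) group. Dropping t2.b from the keys would merge that group with the (105, 1) group and change the result.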
xzj7019 approved these changes on Nov 29, 2024
englefly approved these changes on Nov 29, 2024
Contributor: PR approved by at least one committer and no changes requested.
feiniaofeiafei added a commit to feiniaofeiafei/doris that referenced this pull request on Dec 6, 2024
…e#43391)
This PR introduces two main changes:
1. Adds an optional constant value to the uniform attribute in DataTrait. A slot with a constant value that is not null will be considered uniform and not null.
2. Introduces a new transform rule: EliminateGroupByKeyByUniform, which utilizes the newly added part of the uniform attribute.
The following is an example transformation:
from
+--aggregate(group by a,b output a,b,max(c))
(a is uniform and not null: e.g. a is projection 2 as a in logicalProject)
to
+--aggregate(group by b output b,any_value(a) as a,max(c))
morrySnow pushed a commit that referenced this pull request on Dec 10, 2024
dataroaring pushed a commit that referenced this pull request on Nov 6, 2025
…niform when group sets exist (#56942)
Fix the "not in aggregate's output" error raised after elimination by uniform when grouping sets exist. A query such as the following would fail with `ERROR 1105 (HY000): errCode = 2, detailMessage = GROUPING_PREFIX_event_name_group not in aggregate's output`; this PR fixes it.
```sql
SELECT
    CASE WHEN GROUPING(event_date) = 1 THEN '(TOTAL)' ELSE CAST(event_date AS VARCHAR) END AS event_date,
    user_id,
    MAX(conversion_level) AS conversion_level,
    CASE WHEN GROUPING(event_name_group) = 1 THEN '(TOTAL)' ELSE event_name_group END AS event_name_group
FROM (
    SELECT
        src.event_date,
        src.user_id,
        WINDOW_FUNNEL(
            3600 * 24 * 1,
            'default',
            src.event_time,
            src.event_name = 'shop_buy',
            src.event_name = 'shop_buy'
        ) AS conversion_level,
        src.event_name_group
    FROM (
        SELECT
            CAST(etb.`@dt` AS DATE) AS event_date,
            etb.`@event_name` AS event_name,
            etb.`@event_time` AS event_time,
            etb.`@event_name` AS event_name_group,
            etb.`@user_id` AS user_id
        FROM `test_event` AS etb
        WHERE etb.`@dt` between '2025-09-03 02:00:00' AND '2025-09-10 01:59:59'
            AND etb.`@event_name` = 'shop_buy'
            AND etb.`@user_id` IS NOT NULL
            AND etb.`@user_id` > '0'
    ) AS src
    GROUP BY src.event_date, src.user_id, src.event_name_group
) AS fwt
GROUP BY GROUPING SETS (
    (user_id),
    (user_id, event_date),
    (user_id, event_name_group),
    (user_id, event_date, event_name_group)
);
```
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #43391
Problem Summary:
### Release note
None
github-actions bot pushed a commit that referenced this pull request on Nov 6, 2025: …niform when group sets exist (#56942)
yiguolei pushed a commit that referenced this pull request on Nov 8, 2025: …niform when group sets exist (#56942)
seawinde added a commit to seawinde/doris that referenced this pull request on Nov 10, 2025: …niform when group sets exist (apache#56942)
seawinde added a commit to seawinde/doris that referenced this pull request on Nov 11, 2025: …niform when group sets exist (apache#56942)
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request on Nov 18, 2025: …niform when group sets exist (apache#56942)
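What problem does this PR solve?
This PR introduces two main changes:
1. Adds an optional constant value to the uniform attribute in DataTrait. A slot with a constant, non-null value is considered uniform and not null.
2. Introduces a new transform rule, EliminateGroupByKeyByUniform, which uses the newly added part of the uniform attribute.
Example transformation:
+--aggregate(group by a,b output a,b,max(c))
(a is uniform and not null, e.g. a comes from the projection 2 as a in a logicalProject)
->
+--aggregate(group by b output b,any_value(a) as a,max(c))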
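The rewrite can be checked on a small in-memory data set. The following Java sketch is only an illustration under stated assumptions, not the rule's implementation in Nereids: the class and method names are hypothetical, rows are plain {a, b, c} arrays, and a is fixed to the constant 2 to play the role of the uniform, non-null slot. It evaluates both plan shapes and compares their outputs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class UniformGroupByKeySketch {
    /** Original shape: group by a,b output a,b,max(c). Each row is {a, b, c}. */
    static Set<List<Integer>> groupByAB(List<int[]> rows) {
        Map<List<Integer>, Integer> maxC = new HashMap<>();
        for (int[] r : rows) {
            maxC.merge(Arrays.asList(r[0], r[1]), r[2], Integer::max);
        }
        Set<List<Integer>> out = new HashSet<>();
        maxC.forEach((key, max) -> out.add(Arrays.asList(key.get(0), key.get(1), max)));
        return out;
    }

    /** Rewritten shape: group by b output b,any_value(a) as a,max(c). */
    static Set<List<Integer>> groupByBOnly(List<int[]> rows) {
        Map<Integer, Integer> maxC = new HashMap<>();
        Map<Integer, Integer> anyA = new HashMap<>();
        for (int[] r : rows) {
            maxC.merge(r[1], r[2], Integer::max);
            // any_value(a): keep the first a seen per group; any row would do
            // because a holds the same non-null value on every row.
            anyA.putIfAbsent(r[1], r[0]);
        }
        Set<List<Integer>> out = new HashSet<>();
        // Emit columns in the order (a, b, max(c)) so both shapes are comparable.
        maxC.forEach((b, max) -> out.add(Arrays.asList(anyA.get(b), b, max)));
        return out;
    }

    public static void main(String[] args) {
        // a is the constant 2 on every row, as in "2 as a".
        List<int[]> rows = Arrays.asList(
                new int[] {2, 1, 10},
                new int[] {2, 1, 30},
                new int[] {2, 2, 5});
        System.out.println("group by a,b: " + groupByAB(rows));
        System.out.println("group by b:   " + groupByBOnly(rows));
        System.out.println("same result:  " + groupByAB(rows).equals(groupByBOnly(rows)));
    }
}
```

The two shapes agree here only because a never varies and is never null. If a could take a second value or be NULL on some rows, grouping by b alone would merge groups that the original query keeps apart.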
Release note
None