[fix](repeat plan) fix repeat output slot order inconsistent with its input expression#60045
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 31367 ms |
TPC-DS: Total hot run time: 173815 ms |
ClickBench: Total hot run time: 26.69 s |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
There was a problem hiding this comment.
Pull request overview
This pull request fixes a critical bug in the Repeat node's output slot ordering in the Nereids planner. The BE requires that the repeat node's output slots follow a specific order: input expressions + GroupingID + other grouping functions. The previous physical translator implementation did not guarantee this ordering, which caused bad cast exceptions in the backend when executing queries with GROUPING SETS.
Changes:
- Refactored the PhysicalPlanTranslator to ensure correct output slot ordering for Repeat nodes
- Added comprehensive regression tests covering the fixed scenarios
- Fixed the logic to explicitly add group by expressions first, followed by aggregate function slots, then GroupingID, and finally grouping function slots
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java | Refactored visitPhysicalRepeat method to ensure correct output slot ordering: group by expressions → aggregate function slots → grouping ID → grouping function slots |
| regression-test/suites/nereids_p0/repeat/test_repeat_output_slot.groovy | Added regression test with two SQL queries using GROUPING SETS to verify the fix |
| regression-test/data/nereids_p0/repeat/test_repeat_output_slot.out | Expected output for the regression tests, including query plans and results |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
Outdated
Show resolved
Hide resolved
morrySnow
left a comment
There was a problem hiding this comment.
add a ut for translate repeat
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
Outdated
Show resolved
Hide resolved
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 31381 ms |
TPC-DS: Total hot run time: 173591 ms |
ClickBench: Total hot run time: 26.56 s |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
… input expression (#60045) maybe relate PR: #21168 BE requires that the repeat node's output slot order should be inconsistent with its input expressions. That is output slots = input expressions + GroupingID + other grouping functions. But physical translator not ensure this requirement. Then sometimes the repeat may have bad cast exception. for sql: SELECT 100000 FROM db2.table_9_50_undef_partitions2_keys3_properties4_distributed_by5 GROUP BY GROUPING SETS ( (col_datetime_6__undef_signed, col_varchar_50__undef_signed) , () , (col_varchar_50__undef_signed) , (col_datetime_6__undef_signed, col_varchar_50__undef_signed) ); the above sql will have wrong ouput slot order then BE will have exceptions: (1105, 'errCode = 2, detailMessage = (172.20.57.146) [E-7412]assert cast err:[E-7412] Bad cast from type:doris::vectorized::ColumnVector<(doris::PrimitiveType)26> to doris::vectorized::ColumnStr<unsigned int> 0#doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&, bool) at /home/zcp/repo_center/doris_master/doris/be/src/common/exception.cpp:0\n\t1# ...
… input expression (apache#60045) maybe relate PR: apache#21168 BE requires that the repeat node's output slot order should be inconsistent with its input expressions. That is output slots = input expressions + GroupingID + other grouping functions. But physical translator not ensure this requirement. Then sometimes the repeat may have bad cast exception. for sql: SELECT 100000 FROM db2.table_9_50_undef_partitions2_keys3_properties4_distributed_by5 GROUP BY GROUPING SETS ( (col_datetime_6__undef_signed, col_varchar_50__undef_signed) , () , (col_varchar_50__undef_signed) , (col_datetime_6__undef_signed, col_varchar_50__undef_signed) ); the above sql will have wrong ouput slot order then BE will have exceptions: (1105, 'errCode = 2, detailMessage = (172.20.57.146) [E-7412]assert cast err:[E-7412] Bad cast from type:doris::vectorized::ColumnVector<(doris::PrimitiveType)26> to doris::vectorized::ColumnStr<unsigned int> 0#doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&, bool) at /home/zcp/repo_center/doris_master/doris/be/src/common/exception.cpp:0\n\t1# ...
…ation in DecomposeRepeatWithPreAggregation (#60091) Related PR: #59116 #60045 **Problem Summary:** There are 2 problems: 1. When `DecomposeRepeatWithPreAggregation` optimization removes the maximum grouping set, grouping functions (e.g., `grouping()`, `grouping_id()`) that reference expressions only existing in the removed grouping set would fail because of index lookup failure: When computing grouping function values, `GroupingSetShapes.indexOf()` would return -1 for expressions not in `flattenGroupingSetExpression`, causing incorrect behavior. Example scenario: ```sql SELECT a, b, c, grouping(c) FROM t1 GROUP BY GROUPING SETS ((a, b, c), (a, b), (a), ()); ``` After optimization removes the maximum grouping set `(a, b, c)`, the expression `c` no longer exists in the remaining grouping sets. When computing `grouping(c)`, the index lookup would fail. 2. The `repeatSlotIdList` calculation was incorrect because it relied on the assumption that the order of `physicalRepeat.getOutputExpressions()` matches the order of `outputTuple`. However, this assumption can be broken during rule rewriting (e.g., in `DecomposeRepeatWithPreAggregation.constructRepeat()` at line 502-503, where `replacedRepeatOutputs` is constructed by `new ArrayList<>(child.getOutput())` and then appends grouping functions, potentially changing the order). This mismatch causes incorrect slot ID mapping and wrong query results. Example scenario: When `DecomposeRepeatWithPreAggregation` rewrites a repeat plan, the output expressions order may differ from the output tuple order, leading to incorrect slot ID mapping in `repeatSlotIdList`. **Solution:** 1. For problem 1: Modified `Repeat.toShapes()` method to ensure all expressions referenced by grouping functions are included in `flattenGroupingSetExpression`, even if they are not in any grouping set. For expressions that only exist in the removed maximum grouping set, they are correctly marked as `shouldBeErasedToNull = true` in all remaining grouping sets, which is the correct SQL semantics. Key changes: - `Repeat.toShapes()`: Enhanced to collect all expressions referenced by grouping functions using `ExpressionUtils.collectToList()` and merge them into `flattenGroupingSetExpression` - For expressions not in any grouping set, they are correctly marked as `shouldBeErasedToNull = true` in all `GroupingSetShape` instances - This ensures `GroupingSetShapes.indexOf()` always returns a valid index for grouping function arguments, preventing index lookup failures 2. For problem 2: Modified `Repeat.computeRepeatSlotIdList()` method to accept an additional `outputSlots` parameter. Instead of relying on the order of `outputExpressions` matching `outputTuple`, the method now uses `outputSlots` to correctly map grouping set expressions to their indices in the output tuple. This ensures correct slot ID calculation even when rule rewriting changes the order of output expressions. Key changes: - `computeRepeatSlotIdList(List<Integer> slotIdList, List<Slot> outputSlots)`: Added `outputSlots` parameter - `getGroupingSetsIndexesInOutput(List<Slot> outputSlots)`: Modified to use `outputSlots` instead of `getOutputExpressions()` - `indexesOfOutput(List<Slot> outputSlots)`: Modified to build index map from `outputSlots` instead of `getOutputExpressions()` - `PhysicalPlanTranslator.visitPhysicalRepeat()`: Updated to pass `outputSlots` to `computeRepeatSlotIdList()` 3. Another task for this pull request is to set the shuffle key for the rewritten bottom aggregation. Choosing a suitable shuffle key can improve the aggregation rate of the top aggregation's local aggregation.
intro by #59116
BE requires that the repeat node's output slot order should be inconsistent with its input expressions.
That is output slots = input expressions + GroupingID + other grouping functions.
But physical translator not ensure this requirement. Then sometimes the repeat may have bad cast exception.
for sql:
the above sql will have wrong ouput slot order:
then BE will have exceptions:
relate PR: #59116 may cause repeat's group by expression order not match repeat's output expression
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)