[opt](multi-catalog) Optimize file split size. #58858

Merged: morningman merged 1 commit into apache:master, Jan 5, 2026
Conversation

**Contributor**: Thank you for your contribution to Apache Doris. Please clearly describe your PR.

**Contributor (Author)**: run buildall

Force-pushed 794ef06 to 2a5c1e0.

**Contributor (Author)**: run buildall

Force-pushed 2a5c1e0 to 467baba.

**Contributor (Author)**: run buildall

- TPC-H: Total hot run time: 36237 ms
- TPC-DS: Total hot run time: 181948 ms
- ClickBench: Total hot run time: 27.27 s

Force-pushed 467baba to 97cbf6a.

**Contributor (Author)**: run buildall

- TPC-H: Total hot run time: 36184 ms
- TPC-DS: Total hot run time: 180842 ms
- ClickBench: Total hot run time: 27.16 s

**Contributor**: FE Regression Coverage Report: Increment line coverage

Force-pushed 97cbf6a to 64bfbb1.

**Contributor (Author)**: run buildall

Force-pushed 64bfbb1 to 828927e.

**Contributor (Author)**: run buildall

(1 similar comment)

**Contributor (Author)**: run buildall

- TPC-H: Total hot run time: 35915 ms
- TPC-DS: Total hot run time: 181296 ms
- ClickBench: Total hot run time: 27.14 s

Force-pushed be00184 to 01fe74b.
Force-pushed 01fe74b to 32b5147.

**Contributor (Author)**: run buildall

- TPC-H: Total hot run time: 36706 ms
- TPC-DS: Total hot run time: 181276 ms
- ClickBench: Total hot run time: 27.79 s

**Contributor**: FE UT Coverage Report: Increment line coverage

Force-pushed 20e2b8a to da983ea.

**Contributor (Author)**: run buildall

- TPC-H: Total hot run time: 36478 ms
- TPC-DS: Total hot run time: 179069 ms
- ClickBench: Total hot run time: 27.18 s

**Contributor (Author)**: run feut

**Contributor**: FE UT Coverage Report: Increment line coverage

**Contributor**: PR approved by at least one committer and no changes requested.

**Contributor**: PR approved by anyone and no changes requested.
suxiaogang223 approved these changes, Jan 5, 2026.
hubgeter approved these changes, Jan 5, 2026.
zzzxl1993 pushed a commit to zzzxl1993/doris that referenced this pull request, Jan 13, 2026:
### What problem does this PR solve?

### Release note

This PR introduces a **dynamic and progressive file split size adjustment mechanism** to improve scan parallelism and resource utilization for external table scans, while avoiding excessive small splits or inefficiently large initial splits.

#### 1. Split Size Adjustment Strategy

##### 1.1 Non-Batch Split Mode

In non-batch split mode, a **two-phase split size selection strategy** is applied based on the total size of all input files:

* The total size of all splits is calculated in advance.
* If the total size **exceeds `maxInitialSplitNum * maxInitialSplitSize`**:
  * `split_size = maxSplitSize` (default **64MB**)
* Otherwise:
  * `split_size = maxInitialSplitSize` (default **32MB**)

This strategy reduces the number of splits for small datasets while improving parallelism for large-scale scans.

##### 1.2 Batch Split Mode

In batch split mode, a **progressive split size adjustment strategy** is introduced:

* As the total file size increases,
* and the number of files gradually **exceeds `maxInitialSplitNum`**,
* the `split_size` is **smoothly increased from `maxInitialSplitSize` (32MB) toward `maxSplitSize` (64MB)**.

This approach avoids generating too many small splits at the early stage while gradually increasing scan parallelism as the workload grows, resulting in more stable scheduling and execution behavior.

##### 1.3 User-Specified Split Size (Backward Compatibility)

This PR **preserves the session variable `file_split_size`** for user-defined split size configuration:

* If `file_split_size` is explicitly set by the user:
  * The user-defined value takes precedence.
  * The dynamic split size adjustment logic is bypassed.
* This ensures full backward compatibility with existing configurations and tuning practices.

#### 2. Support Status by Data Source

| Data Source | Non-Batch Split Mode | Batch Split Mode | Notes |
| ----------- | -------------------- | ---------------- | ----- |
| Hive | ✅ Supported | ✅ Supported | Uses Doris internal HDFS FileSplitter |
| Iceberg | ✅ Supported | ❌ Not supported | File splitting is currently delegated to Iceberg APIs |
| Paimon | ✅ Supported | ❌ Not supported | Only non-batch split mode is implemented |

#### 3. New Hive HDFS FileSplitter Logic

For Hive HDFS files, this PR introduces an enhanced file splitting strategy:

1. **Splits never span multiple HDFS blocks**
   * Prevents cross-block reads and avoids unnecessary IO overhead.
2. **Tail split optimization**
   * If the remaining file size is smaller than `split_size * 2`,
   * the remaining part is **evenly divided** into splits,
   * preventing the creation of very small tail splits and improving overall scan efficiency.

#### Summary

* Introduces dynamic and progressive split size adjustment
* Supports both batch and non-batch split modes
* Preserves user-defined split size configuration for backward compatibility
* Optimizes Hive HDFS file splitting to reduce small tail splits and cross-block IO
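The non-batch size selection and the tail-split rule described above can be sketched as follows. This is an illustrative sketch, not the actual Doris FE code: the class name `SplitSizeSketch`, the helper method signatures, and the `MAX_INITIAL_SPLIT_NUM` value of 128 are assumptions; only the 32MB/64MB defaults and the `split_size * 2` tail threshold come from the PR description.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSizeSketch {
    static final long MAX_INITIAL_SPLIT_SIZE = 32L << 20; // 32MB default (from PR)
    static final long MAX_SPLIT_SIZE = 64L << 20;         // 64MB default (from PR)
    static final int MAX_INITIAL_SPLIT_NUM = 128;         // assumed value for illustration

    // Non-batch mode: pick the split size from the pre-computed total input size.
    static long chooseSplitSize(long totalFileSize) {
        if (totalFileSize > (long) MAX_INITIAL_SPLIT_NUM * MAX_INITIAL_SPLIT_SIZE) {
            return MAX_SPLIT_SIZE;      // large dataset: fewer, bigger splits
        }
        return MAX_INITIAL_SPLIT_SIZE;  // small dataset: keep initial splits small
    }

    // Tail optimization: once the remainder drops below 2 * splitSize,
    // divide it evenly instead of leaving one very small tail split.
    static List<Long> splitFile(long fileSize, long splitSize) {
        List<Long> lengths = new ArrayList<>();
        long remaining = fileSize;
        while (remaining > 2 * splitSize) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining > 0) {
            long first = (remaining + 1) / 2;   // split the tail roughly in half
            lengths.add(first);
            if (remaining - first > 0) {
                lengths.add(remaining - first);
            }
        }
        return lengths;
    }
}
```

For example, a 100MB file with a 32MB split size yields two 32MB splits plus two 18MB splits, instead of three 32MB splits and a 4MB fragment. (Block-boundary alignment is omitted here; the real splitter additionally never crosses HDFS block boundaries.)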
morningman pushed a commit that referenced this pull request, Jan 29, 2026:
…revent OOM in file scan (#58759)

### What problem does this PR solve?

- Related PR: #58858

## Problem Summary

When querying external table catalogs (Hive, Iceberg, Paimon, etc.), Doris splits files into multiple splits for parallel processing. In some cases, especially with numerous small files, this can generate an excessive number of splits, potentially causing:

1. **Memory pressure**: Too many splits consume significant memory in FE
2. **OOM issues**: Excessive split generation can lead to OutOfMemoryError
3. **Performance degradation**: Managing too many splits increases query planning overhead

Previously, there was no upper limit on the number of splits in non-batch mode, which could lead to problems when querying tables with many small files.

## Solution

This PR introduces a new session variable `max_file_split_num` to limit the maximum number of splits allowed per table scan in non-batch mode.

### Changes

1. **New Session Variable**: `max_file_split_num`
   - Type: `int`
   - Default: `100000`
   - Description: "The maximum number of splits allowed per table scan in non-batch mode, to prevent OOM caused by generating too many splits."
   - Forward to BE: `true`
2. **Implementation in FileQueryScanNode**:
   - Added method `applyMaxFileSplitNumLimit(long targetSplitSize, long totalFileSize)`
   - Dynamically raises the minimum split size so that the split count cannot exceed the limit
   - Formula: `minSplitSizeForMaxNum = (totalFileSize + maxFileSplitNum - 1) / maxFileSplitNum`
   - Returns: `Math.max(targetSplitSize, minSplitSizeForMaxNum)`
3. **Applied to multiple scan nodes**:
   - `HiveScanNode`
   - `IcebergScanNode`
   - `PaimonScanNode`
   - `TVFScanNode`
4. **Unit Tests**:
   - `FileQueryScanNodeTest`: Tests the base logic
   - `HiveScanNodeTest`, `IcebergScanNodeTest`, `PaimonScanNodeTest`, `TVFScanNodeTest`: Test the source-specific implementations

## Usage

Users can control the maximum number of splits per table scan by setting the session variable:

```sql
-- Set a maximum of 50000 splits
SET max_file_split_num = 50000;

-- Disable the limit (set to 0 or negative)
SET max_file_split_num = 0;
```
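The cap described above works by raising the split size, not by dropping splits: a ceiling division computes the smallest split size that keeps the count under the limit. A minimal sketch, using the method name and formula quoted in the PR description; the enclosing class and the explicit `maxFileSplitNum` parameter are illustrative (in Doris the value would come from the session variable):

```java
public class SplitNumLimitSketch {
    // Raise the split size so that ceil(totalFileSize / result) <= maxFileSplitNum.
    static long applyMaxFileSplitNumLimit(long targetSplitSize, long totalFileSize,
                                          int maxFileSplitNum) {
        if (maxFileSplitNum <= 0) {
            return targetSplitSize; // limit disabled (session variable set to 0 or negative)
        }
        // Ceiling division: smallest split size whose split count stays within the cap.
        long minSplitSizeForMaxNum = (totalFileSize + maxFileSplitNum - 1) / maxFileSplitNum;
        return Math.max(targetSplitSize, minSplitSizeForMaxNum);
    }
}
```

With the default cap of 100000, a scan over 10TB of data would thus never use splits smaller than roughly 105MB, regardless of the dynamically chosen target size.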
kaka11chen added a commit to kaka11chen/doris that referenced this pull request, Feb 10, 2026.
yiguolei pushed a commit that referenced this pull request, Feb 12, 2026:
### What problem does this PR solve?

Problem Summary:

### Release note

Cherry-pick #58858

### Check List (For Author)

- Test
  - [ ] Regression test
  - [ ] Unit Test
  - [ ] Manual test (add detailed scripts or steps below)
  - [ ] No need to test or manual test. Explain why:
    - [ ] This is a refactor/code format and no logic has been changed.
    - [ ] Previous test can cover this change.
    - [ ] No code files have been changed.
    - [ ] Other reason
- Behavior changed:
  - [ ] No.
  - [ ] Yes.
- Does this need documentation?
  - [ ] No.
  - [ ] Yes.

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request, Feb 13, 2026:
…revent OOM in file scan (apache#58759) (cherry picked from commit 3e5a70f)