[None][chore] Fix disagg assert #10596

fredricz-20070104 · 2026-01-12T06:09:31Z

Summary by CodeRabbit

Tests
- Updated benchmark concurrency configurations across multiple test scenarios to use adjusted concurrency levels (e.g., 1024 instead of 1075, 2048 instead of 2150, and adjusted ranges).
- Enhanced validation logic for benchmark configurations to ensure concurrency settings align with batch size and parallelism parameters.

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

fredricz-20070104 · 2026-01-12T06:11:16Z

/bot skip --comment "Not cover in CI pipelines"

coderabbitai · 2026-01-12T06:14:46Z

Walkthrough

This PR updates benchmark configuration files across multiple models to adjust concurrency list values, standardizing them to power-of-2 or rounded values. Additionally, it introduces parsing and validation functionality for a new concurrency_list configuration field in the benchmark utilities.

Changes

Cohort / File(s)	Summary
Benchmark Configuration Updates - Qwen3 Models `tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen1_dep16_bs64_eplb0_mtp3_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp1_ccb-{NIXL,UCX}.yaml`	Updated concurrency_list values: `512 1075` → `512 1024`, `1 2 4 8 16 36` → `1 2 4 8 16 32`, `2150` → `2048`
Benchmark Configuration Updates - DeepSeek-R1 Models `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen1_dep32_bs32_eplb0_mtp0_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp3_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp3_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs16_eplb0_mtp3_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs32_eplb0_mtp0_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx6_gen1_dep16_bs64_eplb0_mtp0_ccb-{NIXL,UCX}.yaml` `tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx8_gen1_dep32_bs16_eplb0_mtp3_ccb-{NIXL,UCX}.yaml`	Standardized concurrency_list values across multiple configurations: 1075→1024, 36→32, 2150→2048, 18→16, 34→32, 538→512
Benchmark Configuration Updates - WidEEP Accuracy `tests/integration/defs/perf/disagg/test_configs/wideep/accuracy/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb288_mtp3_ccb-NIXL.yaml`	Updated concurrency_list from 2150 to 2048
Utility Functions - Configuration Parsing and Validation `tests/integration/defs/perf/disagg/utils/common.py`	Added parsing of `concurrency_list` from space-separated string to integer list in `extract_config_fields()` function
Utility Functions - Configuration Parsing and Validation `tests/integration/defs/perf/disagg/utils/config_validator.py`	Added new `_validate_concurrency_list()` method to validate concurrency list bounds based on gen_batch_size and gen_tp_size; relaxed `_validate_gen_max_tokens()` from equality to greater-than-or-equal check when mtp_size > 0; integrated concurrency_list validation into `validate_test_config()`

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is missing entirely. The template requires a clear explanation of the issue and solution, test coverage details, and a checklist, but none of these are provided.	Add a detailed description explaining what assertion was fixed, why the concurrency validation was needed, and which tests cover these changes.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[None][chore] Fix disagg assert' directly relates to the main change: fixing a disaggregation assertion and adding validation logic, which is reflected in the code changes.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

coderabbitai · 2026-01-12T06:16:11Z

Walkthrough

Benchmark configuration files across multiple test suites have their concurrency values updated to powers of two (1075→1024, 2150→2048, 36→32, 34→32, 18→16, 538→512). In the same change, the utility modules add parsing and validation support for the `concurrency_list` configuration field.

Changes

Cohort / File(s)	Summary
Disaggregated Perf Benchmark Configs `tests/integration/defs/perf/disagg/test_configs/disagg/perf/*-NIXL.yaml`, `tests/integration/defs/perf/disagg/test_configs/disagg/perf/*-UCX.yaml`	Updated `concurrency_list` values across 22 benchmark configuration files: single values adjusted (1075→1024, 2150→2048, 538→512) and space-separated lists modified (36→32, 34→32, 18→16). Each file contains a single-line value change.
Wideep Accuracy Benchmark Config `tests/integration/defs/perf/disagg/test_configs/wideep/accuracy/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb288_mtp3_ccb-NIXL.yaml`	Updated `concurrency_list` from 2150 to 2048.
Configuration Parsing Logic `tests/integration/defs/perf/disagg/utils/common.py`	Added extraction and parsing of `concurrency_list` field from benchmark configuration, converting space-separated string values to integer list.
Configuration Validation Logic `tests/integration/defs/perf/disagg/utils/config_validator.py`	Introduced new `_validate_concurrency_list()` method to validate concurrency values against batch size and tensor parallelism parameters. Modified `_validate_gen_max_tokens()` assertion from equality to greater-than-or-equal when mtp_size > 0.
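
The adjusted values follow from the bound that the review comments attribute to `_validate_concurrency_list()`: concurrency may not exceed `gen_batch_size * gen_tp_size` when attention DP is enabled, or `gen_batch_size` alone when it is disabled. The following is a minimal sketch of that arithmetic, using batch-size and TP values quoted in the per-file review notes. The helper name `max_concurrency` and the tuple layout are illustrative and not part of the PR; the TP size of 8 for the `tep8` configs is inferred from the file names and does not affect the bound when DP is off.

```python
def max_concurrency(gen_batch_size: int, gen_tp_size: int, gen_enable_dp: bool) -> int:
    """Upper bound on concurrency for one generation-worker config (illustrative helper)."""
    return gen_batch_size * gen_tp_size if gen_enable_dp else gen_batch_size


# (gen_batch_size, gen_tp_size, gen_enable_dp, old_value, new_value)
# Values are taken from the per-file review notes in this thread.
CASES = [
    (64, 16, True, 1075, 1024),
    (128, 16, True, 2150, 2048),
    (16, 32, True, 538, 512),
    (32, 8, False, 34, 32),
    (16, 8, False, 18, 16),
]

for batch, tp, dp, old, new in CASES:
    bound = max_concurrency(batch, tp, dp)
    print(f"bound={bound}: old {old} fits={old <= bound}, new {new} fits={new <= bound}")
```

In every case the old value exceeds the bound and the new value equals it, which suggests the power-of-two pattern is a consequence of the batch-size and TP values rather than a separate constraint.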

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description only contains '@coderabbitai summary' with no actual content explaining the changes, issues, test coverage, or following the required template sections.	Provide a comprehensive description including what was fixed, why the change was necessary, relevant test coverage, and a checklist confirmation of compliance with coding guidelines.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[None][chore] Fix disagg assert' is specific and directly related to the main changes, which involve fixing a disaggregation assertion and validating concurrency configurations.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/integration/defs/perf/disagg/utils/common.py (1)
1-4: Missing NVIDIA copyright header.

Per coding guidelines, all TensorRT-LLM source files (.py) should contain an NVIDIA copyright header with the year of latest meaningful modification. This file only has a docstring but no copyright notice.
Suggested fix
+# SPDX-FileCopyrightText: Copyright (c) 2022-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 """Disaggregated Benchmark Configuration."""
Based on coding guidelines requiring NVIDIA copyright header for all source files.

🧹 Nitpick comments (3)

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp1_ccb-NIXL.yaml (1)

24-24: LGTM with a minor note.

The value is correctly updated from 2150 to 2048, aligning with power-of-two conventions.

Minor: The value is quoted as a string ('2048') while other config files use unquoted space-separated values. Consider using the unquoted format for consistency, though both should parse correctly.
tests/integration/defs/perf/disagg/utils/config_validator.py (1)
92-116: Move loop-invariant dictionary lookups outside the loop.

gen_batch_size and gen_tp_size don't change per iteration. Also, the docstring states Raises: ValueError but the method raises AssertionError.
♻️ Suggested refactor
     @staticmethod
     def _validate_concurrency_list(extracted_config: dict) -> None:
         """Validate concurrency list.
 
         Args:
             extracted_config: Extracted configuration fields
 
         Raises:
-            ValueError: If concurrency list is invalid
+            AssertionError: If concurrency list is invalid
         """
         concurrency_list = extracted_config["concurrency_list"]
         assert concurrency_list, "config error: concurrency_list is empty"
         gen_enable_dp = extracted_config["gen_enable_dp"]
+        gen_batch_size = extracted_config["gen_batch_size"]
         logger.info(f"gen_enable_dp: {gen_enable_dp}")
+        if gen_enable_dp:
+            gen_tp_size = extracted_config["gen_tp_size"]
+            max_concurrency = gen_batch_size * gen_tp_size
+        else:
+            max_concurrency = gen_batch_size
         for concurrency in concurrency_list:
-            gen_batch_size = extracted_config["gen_batch_size"]
-            if gen_enable_dp:
-                gen_tp_size = extracted_config["gen_tp_size"]
-                assert concurrency <= gen_batch_size * gen_tp_size, (
-                    "config error: concurrency exceeds gen_batch_size * gen_tp_size"
-                )
-            else:
-                assert concurrency <= gen_batch_size, (
-                    "config error: concurrency exceeds gen_batch_size"
-                )
+            assert concurrency <= max_concurrency, (
+                f"config error: concurrency {concurrency} exceeds max allowed {max_concurrency}"
+            )
tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp1_ccb-UCX.yaml (1)
24-24: Consider format consistency across config files.

The value update to 2048 is correct. However, this file uses a quoted string format ('2048') while other configs in this PR use unquoted space-separated format (e.g., 1 2 4 8 16 32). While the parsing utility should handle both, consistent formatting across config files improves maintainability.
🔧 Suggested format for consistency
-  concurrency_list: '2048'
+  concurrency_list: 2048
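
One caveat on the unquoted form: YAML resolves a plain scalar such as `2048` to an integer, while `'2048'` and a plain `1 2 4 8 16 32` both load as strings. Whether the parser handles both therefore depends on whether it calls `.split()` directly on the loaded value, which an integer does not support. Below is a minimal sketch of a normalization step that tolerates either form. `normalize_concurrency_field` is a hypothetical helper, not code from this PR.

```python
from typing import List, Union


def normalize_concurrency_field(raw: Union[int, str]) -> List[int]:
    """Return concurrency values as ints, whether the YAML scalar loaded as int or str."""
    # bool is a subclass of int; reject it so a YAML `true` is not read as 1.
    if isinstance(raw, bool):
        raise ValueError("concurrency_list must be an integer or a space-separated string")
    # str() makes an integer scalar (unquoted 2048) splittable like the quoted form.
    return [int(token) for token in str(raw).split()]


print(normalize_concurrency_field(2048))             # unquoted single value
print(normalize_concurrency_field("2048"))           # quoted single value
print(normalize_concurrency_field("1 2 4 8 16 32"))  # space-separated list
```

If the parser is left as a direct `.split()` call, keeping single values quoted is the safer convention.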

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a9a00b and aa9bfbd.

📒 Files selected for processing (25)

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen1_dep16_bs64_eplb0_mtp3_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen1_dep16_bs64_eplb0_mtp3_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp1_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp1_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen1_dep32_bs32_eplb0_mtp0_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen1_dep32_bs32_eplb0_mtp0_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp3_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp3_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp3_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp3_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs16_eplb0_mtp3_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs16_eplb0_mtp3_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx6_gen1_dep16_bs64_eplb0_mtp0_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx6_gen1_dep16_bs64_eplb0_mtp0_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx8_gen1_dep32_bs16_eplb0_mtp3_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx8_gen1_dep32_bs16_eplb0_mtp3_ccb-UCX.yaml
tests/integration/defs/perf/disagg/test_configs/wideep/accuracy/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb288_mtp3_ccb-NIXL.yaml
tests/integration/defs/perf/disagg/utils/common.py
tests/integration/defs/perf/disagg/utils/config_validator.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+
Indent Python code with 4 spaces. Do not use tabs
Always maintain the namespace when importing Python modules, even if only one class or function from a module is used
Python filenames should use snake_case (e.g., some_file.py)
Python classes should use PascalCase (e.g., class SomeClass)
Python functions and methods should use snake_case (e.g., def my_awesome_function():)
Python local variables should use snake_case, with prefix k for variable names that start with a number (e.g., k_99th_percentile)
Python global variables should use upper snake_case with prefix G (e.g., G_MY_GLOBAL)
Python constants should use upper snake_case (e.g., MY_CONSTANT)
Avoid shadowing variables declared in an outer scope in Python
Initialize all externally visible members of a Python class in the constructor
For Python interfaces that may be used outside a file, prefer docstrings over comments
Use comments in Python for code within a function, or interfaces that are local to a file
Use Google-style docstrings for Python classes and functions, which can be parsed by Sphinx
Python attributes and variables can be documented inline with the format """<type>: Description"""
Avoid using reflection in Python when functionality can be easily achieved without reflection
When using try-except blocks in Python, limit the except clause to the smallest set of errors possible
When using try-except blocks in Python to handle multiple possible variable types (duck-typing), keep the body of the try as small as possible and use the else block for the main logic

Files:

tests/integration/defs/perf/disagg/utils/common.py
tests/integration/defs/perf/disagg/utils/config_validator.py

**/*.{cpp,cc,cxx,h,hpp,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

All TensorRT-LLM source files (.cpp, .h, .cu, .py, and other source files) should contain an NVIDIA copyright header with the year of latest meaningful modification

Files:

tests/integration/defs/perf/disagg/utils/common.py
tests/integration/defs/perf/disagg/utils/config_validator.py

🧬 Code graph analysis (1)

tests/integration/defs/perf/disagg/utils/config_validator.py (1)

tests/integration/defs/perf/disagg/utils/logger.py (1)

info (81-83)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Pre-commit Check

🔇 Additional comments (25)

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency_list adjustment from 36 to 32 aligns the maximum concurrency with max_batch_size (line 54) and uses a power-of-2 value, which is consistent with the validation fix across this PR.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen1_dep32_bs32_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency value 1024 (power of 2) correctly aligns with the product of max_batch_size (32) × tensor_parallel_size (32) when enable_attention_dp is true.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency_list adjustment from 34 to 32 aligns the maximum concurrency with max_batch_size (line 54) and ensures power-of-2 values throughout.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen1_dep32_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

Consistent with the UCX variant—concurrency value 1024 properly aligns with the data-parallel configuration (max_batch_size × tensor_parallel_size).

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency value 2048 correctly aligns with max_batch_size (128) × tensor_parallel_size (16) for the data-parallel configuration.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs16_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency_list is correctly updated from 18 to 16, aligning with powers of two. The maximum concurrency value (16) now matches max_batch_size in the gen worker config (line 54), which is consistent.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs16_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

Consistent with the NIXL variant—concurrency_list is correctly updated to use powers of two, with the maximum matching max_batch_size.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency_list is correctly updated from 36 to 32, aligning with powers of two. The maximum concurrency value (32) now matches max_batch_size in the gen worker config (line 54).

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

Consistent with the NIXL variant—concurrency_list correctly updated to use powers of two, with the maximum matching max_batch_size.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency_list adjustment from 36 to 32 aligns with the power-of-2 pattern and is consistent with max_batch_size: 32 at line 54.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx6_gen1_dep16_bs64_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency_list adjustment from 1075 to 1024 aligns with the power-of-2 constraint addressed by this PR.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx8_gen1_dep32_bs16_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency_list adjustment from 538 to 512 aligns with the power-of-2 constraint.

tests/integration/defs/perf/disagg/utils/common.py (2)

195-197: LGTM!

The parsing logic correctly handles space-separated values with proper stripping and filtering of empty strings. The list comprehension is idiomatic Python.

232-232: LGTM!

The concurrency_list is properly included in the returned dictionary.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen1_dep16_bs64_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency_list adjustment from "512 1075" to "512 1024" ensures all values are powers of 2, consistent with the PR objective.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency value of 2048 correctly aligns with the maximum allowed when attention DP is enabled: gen_batch_size (128) * gen_tp_size (16) = 2048.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx8_gen1_dep32_bs16_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency value of 512 correctly aligns with the maximum allowed when attention DP is enabled: gen_batch_size (16) * gen_tp_size (32) = 512.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency list values (1, 2, 4, 8, 16, 32) all correctly stay within the maximum allowed when attention DP is disabled: gen_batch_size = 32.

tests/integration/defs/perf/disagg/utils/config_validator.py (2)

40-41: LGTM!

The new validation call integrates cleanly with the existing validation flow.

56-60: LGTM!

The relaxed assertion (>= instead of ==) and updated error message correctly reflect the intended validation semantics.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx6_gen1_dep16_bs64_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency value of 1024 correctly aligns with the maximum allowed when attention DP is enabled: gen_batch_size (64) * gen_tp_size (16) = 1024.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency list is correctly updated to use power-of-two values, and the maximum concurrency (32) aligns with gen.max_batch_size (32) at line 54. This standardization is appropriate for GPU workloads.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen1_dep16_bs64_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency list update from 1075 to 1024 standardizes to a power of two. The high concurrency values (512, 1024) relative to gen.max_batch_size (64) are valid given enable_attention_dp: true configuration at line 52.

tests/integration/defs/perf/disagg/test_configs/wideep/accuracy/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb288_mtp3_ccb-NIXL.yaml (1)

31-31: LGTM with minor format note.

The value update from 2150 to 2048 correctly standardizes to a power of two. Same optional consistency note as the previous file regarding the quoted string format.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency list correctly standardizes to power-of-two values (1, 2, 4, 8, 16, 32), with maximum concurrency matching gen.max_batch_size (32) at line 54. This mirrors the UCX variant config appropriately.

tensorrt-cicd · 2026-01-12T06:17:07Z

PR_Github #31491 [ skip ] triggered by Bot. Commit: aa9bfbd

fredricz-20070104 · 2026-01-12T06:29:23Z

/bot skip --comment "Not cover in CI pipelines"

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @tests/integration/defs/perf/disagg/utils/config_validator.py:
- Around line 99-100: The docstring in
tests/integration/defs/perf/disagg/utils/config_validator.py claims "Raises:
ValueError" but the implementation uses assert statements (which raise
AssertionError); update the function to be consistent by replacing those assert
checks with explicit raises of ValueError (preserving the same error messages),
or alternatively update the docstring to state "Raises: AssertionError" — locate
the function containing the "Raises: ValueError" docstring and modify the
assert-based checks to raise ValueError (e.g., raise ValueError("...")) so the
raised exception type matches the docstring.
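
Of the two options, raising `ValueError` has a practical advantage: `assert` statements are removed when Python runs with `-O`, so assert-based validation disappears silently in optimized mode. The sketch below shows the `ValueError` variant, assuming the same `extracted_config` keys that appear in the suggested refactor. It is written as a standalone function for illustration rather than as the PR's static method.

```python
def validate_concurrency_list(extracted_config: dict) -> None:
    """Validate concurrency values against the generation-side capacity.

    Raises:
        ValueError: If the list is empty or a value exceeds the allowed maximum.
    """
    concurrency_list = extracted_config["concurrency_list"]
    if not concurrency_list:
        raise ValueError("config error: concurrency_list is empty")

    # Compute the bound once; it does not change per concurrency value.
    gen_batch_size = extracted_config["gen_batch_size"]
    if extracted_config["gen_enable_dp"]:
        max_concurrency = gen_batch_size * extracted_config["gen_tp_size"]
    else:
        max_concurrency = gen_batch_size

    for concurrency in concurrency_list:
        if concurrency > max_concurrency:
            raise ValueError(
                f"config error: concurrency {concurrency} exceeds max {max_concurrency}"
            )
```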

🧹 Nitpick comments (5)

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM with minor formatting note.

The max concurrency value of 32 correctly matches max_batch_size.

Note: This file uses unquoted space-separated format while other files in this PR use quoted strings (e.g., '1024'). Consider using consistent formatting across configuration files for maintainability, though the parser handles both.
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml (1)
24-24: Consider quoting the concurrency_list value for consistency.

The concurrency values themselves are correct (all power-of-2, max value 32 matches gen.max_batch_size). However, other config files in this PR use quoted strings (e.g., '2048', '512'). For consistency across the configuration files, consider quoting this value:
concurrency_list: '1 2 4 8 16 32'
tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs16_eplb0_mtp3_ccb-NIXL.yaml (1)
24-24: LGTM on the values; consider quoting for consistency.

The fix from 18 to 16 is correct—the max concurrency now matches gen.max_batch_size (16), and all values are power-of-2. Since enable_attention_dp is false, concurrency is bounded by batch size alone.

For consistency with other config files using quoted strings (e.g., '512', '2048'), consider:
concurrency_list: '1 2 4 8 16'
tests/integration/defs/perf/disagg/utils/common.py (1)
195-197: Consider adding error handling for malformed concurrency values.

If concurrency_list contains non-integer values, this will raise an unhandled ValueError. Consider wrapping with a try-except to provide a more descriptive error message for invalid configuration.
♻️ Proposed improvement
-    concurrency_list = [
-        int(x.strip()) for x in config_data["benchmark"]["concurrency_list"].split() if x.strip()
-    ]
+    try:
+        concurrency_list = [
+            int(x.strip()) for x in config_data["benchmark"]["concurrency_list"].split() if x.strip()
+        ]
+    except ValueError as e:
+        raise ValueError("Invalid concurrency_list: must contain integers only") from e
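
A self-contained version of this idea can be sketched as follows, assuming the raw value is a space-separated string. `parse_concurrency_list` is an illustrative name, and the message wording is not taken from the PR. Naming the offending token in the error makes a malformed config easier to locate.

```python
from typing import List


def parse_concurrency_list(raw: str) -> List[int]:
    """Parse a space-separated string such as '1 2 4 8 16 32' into a list of ints."""
    values = []
    # str.split() with no argument drops leading, trailing, and repeated whitespace.
    for token in raw.split():
        try:
            value = int(token)
        except ValueError as err:
            raise ValueError(
                f"Invalid concurrency_list entry {token!r}: expected an integer"
            ) from err
        else:
            values.append(value)
    return values


print(parse_concurrency_list("512 1024"))
```

Keeping only the `int()` call inside the `try` body follows the guideline quoted in this review about limiting the scope of try-except blocks.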
tests/integration/defs/perf/disagg/utils/config_validator.py (1)
106-112: Move loop-invariant extractions outside the loop.

gen_batch_size and gen_tp_size are extracted inside the loop but their values are constant across iterations.
♻️ Proposed improvement
     concurrency_list = extracted_config["concurrency_list"]
     assert concurrency_list, "config error: concurrency_list is empty"
     gen_enable_dp = extracted_config["gen_enable_dp"]
+    gen_batch_size = extracted_config["gen_batch_size"]
+    if gen_enable_dp:
+        gen_tp_size = extracted_config["gen_tp_size"]
+        max_concurrency = gen_batch_size * gen_tp_size
+    else:
+        max_concurrency = gen_batch_size
     logger.info(f"gen_enable_dp: {gen_enable_dp}")
     for concurrency in concurrency_list:
-        gen_batch_size = extracted_config["gen_batch_size"]
-        if gen_enable_dp:
-            gen_tp_size = extracted_config["gen_tp_size"]
-            assert concurrency <= gen_batch_size * gen_tp_size, (
-                "config error: concurrency exceeds gen_batch_size * gen_tp_size"
-            )
-        else:
-            assert concurrency <= gen_batch_size, (
-                "config error: concurrency exceeds gen_batch_size"
-            )
+        assert concurrency <= max_concurrency, (
+            f"config error: concurrency {concurrency} exceeds max {max_concurrency}"
+        )
This also improves the error message by including the actual values for easier debugging.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5445937 and 7d6695c.

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+
Indent Python code with 4 spaces. Do not use tabs
Always maintain the namespace when importing Python modules, even if only one class or function from a module is used
Python filenames should use snake_case (e.g., some_file.py)
Python classes should use PascalCase (e.g., class SomeClass)
Python functions and methods should use snake_case (e.g., def my_awesome_function():)
Python local variables should use snake_case, with prefix k for variable names that start with a number (e.g., k_99th_percentile)
Python global variables should use upper snake_case with prefix G (e.g., G_MY_GLOBAL)
Python constants should use upper snake_case (e.g., MY_CONSTANT)
Avoid shadowing variables declared in an outer scope in Python
Initialize all externally visible members of a Python class in the constructor
For Python interfaces that may be used outside a file, prefer docstrings over comments
Use comments in Python for code within a function, or interfaces that are local to a file
Use Google-style docstrings for Python classes and functions, which can be parsed by Sphinx
Python attributes and variables can be documented inline with the format """<type>: Description"""
Avoid using reflection in Python when functionality can be easily achieved without reflection
When using try-except blocks in Python, limit the except clause to the smallest set of errors possible
When using try-except blocks in Python to handle multiple possible variable types (duck-typing), keep the body of the try as small as possible and use the else block for the main logic

Files:

tests/integration/defs/perf/disagg/utils/config_validator.py
tests/integration/defs/perf/disagg/utils/common.py

**/*.{cpp,cc,cxx,h,hpp,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

All TensorRT-LLM source files (.cpp, .h, .cu, .py, and other source files) should contain an NVIDIA copyright header with the year of latest meaningful modification

Files:

tests/integration/defs/perf/disagg/utils/config_validator.py
tests/integration/defs/perf/disagg/utils/common.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Pre-commit Check

🔇 Additional comments (23)

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen1_dep32_bs32_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

The updated concurrency_list value of 1024 correctly aligns with gen_batch_size × gen_tp_size (32 × 32 = 1024), which should pass the new concurrency validation.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx6_gen1_dep16_bs64_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency_list value of 1024 aligns with gen_batch_size × gen_tp_size (64 × 16 = 1024).

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx8_gen1_dep32_bs16_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency_list value of 512 aligns with gen_batch_size × gen_tp_size (16 × 32 = 512).

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency values are now all powers of 2, with the max value of 32 matching max_batch_size. This aligns with the broader normalization effort in this PR.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp1_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency value of 2048 is a power-of-2 that aligns with the CUDA graph batch sizes configuration (line 72) and is a sensible correction from the previous non-power-of-2 value.

tests/integration/defs/perf/disagg/test_configs/wideep/accuracy/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb288_mtp3_ccb-NIXL.yaml (1)

31-31: LGTM!

The concurrency value update to 2048 aligns with the CUDA graph batch sizes and maintains consistent formatting with other single-value configurations.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx8_gen1_dep32_bs16_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency value of 512 correctly aligns with the effective capacity (max_batch_size × tensor_parallel_size = 16 × 32 = 512) when enable_attention_dp is true. This should pass the new concurrency validation.
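The capacity rule this comment relies on can be sketched as a small helper. The function and parameter names are hypothetical and are not taken from config_validator.py. Only the rule itself comes from the review comments: batch size times tensor-parallel size when attention DP is enabled, batch size alone otherwise.

```python
def max_allowed_concurrency(gen_batch_size: int, gen_tp_size: int,
                            enable_attention_dp: bool) -> int:
    """Upper bound on benchmark concurrency for one gen configuration.

    Hypothetical helper; names are illustrative, not the PR's code.
    """
    if enable_attention_dp:
        # Rule quoted in the review: max_batch_size x tensor_parallel_size.
        return gen_batch_size * gen_tp_size
    # With attention DP disabled, the review bounds concurrency by batch size.
    return gen_batch_size


# The dep32/bs16 configuration discussed above: 16 x 32 = 512.
print(max_allowed_concurrency(16, 32, True))
```

Under this reading, the dep16/bs128 configurations get a bound of 2048 and the tep8/bs32 configurations get a bound of 32, which matches the values the PR switches to.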

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen1_dep16_bs64_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM - Concurrency value now aligns with gen config constraints.

The max concurrency is now 1024, which matches gen.max_batch_size (64) × gen.tensor_parallel_size (16) = 1024. This aligns with the new validation logic referenced in the PR.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM - Concurrency value correctly bounded.

Max concurrency 2048 aligns with gen.max_batch_size (128) × gen.tensor_parallel_size (16) = 2048.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx6_gen1_dep16_bs64_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM - Concurrency aligned with gen config.

Max concurrency 1024 matches gen.max_batch_size (64) × gen.tensor_parallel_size (16) = 1024.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp1_ccb-UCX.yaml (1)

24-24: LGTM - Concurrency value correctly bounded.

Max concurrency 2048 aligns with gen.max_batch_size (128) × gen.tensor_parallel_size (16) = 2048.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM - Max concurrency corrected to power-of-2.

The max concurrency is now 32, matching gen.max_batch_size. The previous value of 36 was not a power-of-2 and exceeded the batch size constraint.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen1_dep32_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

Changing concurrency from 1075 to 1024 aligns with the power-of-2 standardization pattern, which is appropriate for GPU workload scheduling and memory alignment.
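The power-of-2 observation is easy to check by hand. The sketch below is illustrative only: `is_power_of_two` is a hypothetical helper, and nothing in the PR summary says the validator tests for powers of 2 (it is described as checking concurrency against batch size and parallelism).

```python
def is_power_of_two(value: int) -> bool:
    """Return True when value is a positive power of two."""
    # A power of two has exactly one bit set, so value & (value - 1) clears it.
    return value > 0 and (value & (value - 1)) == 0


# Old and new concurrency values mentioned in this PR.
for old, new in [(1075, 1024), (2150, 2048), (36, 32), (18, 16), (538, 512)]:
    print(old, is_power_of_two(old), new, is_power_of_two(new))
```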

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_8k1k_ctx1_gen3_tep8_bs16_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

Replacing 18 with 16 ensures all concurrency values are powers of 2, matching the max_batch_size: 16 configuration for the gen worker.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen1_dep16_bs64_eplb0_mtp3_ccb-NIXL.yaml (1)

24-24: LGTM!

Adjusting the upper concurrency bound from 1075 to 1024 maintains power-of-2 consistency across the list.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

Replacing 36 with 32 aligns the maximum concurrency value with max_batch_size: 32 and maintains power-of-2 consistency.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-UCX.yaml (1)

24-24: LGTM!

Replacing 36 with 32 aligns the maximum concurrency value with max_batch_size: 32 and ensures all concurrency levels are powers of 2.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

The concurrency_list upper bound is correctly reduced from 36 to 32 to align with the gen_batch_size of 32 (line 54). With enable_attention_dp: false, the new validation logic enforces concurrency <= gen_batch_size, which this change now satisfies.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/Qwen3-235B-A22B-FP4_1k1k_ctx1_gen4_tep8_bs32_eplb0_mtp0_ccb-NIXL.yaml (1)

24-24: LGTM!

Concurrency list properly adjusted to stay within the gen_batch_size limit of 32 when enable_attention_dp is false.

tests/integration/defs/perf/disagg/test_configs/disagg/perf/deepseek-r1-fp4_1k1k_ctx2_gen1_dep16_bs128_eplb0_mtp3_ccb-UCX.yaml (1)

24-24: LGTM!

The concurrency value is correctly reduced from 2150 to 2048 to align with the maximum allowed when enable_attention_dp: true (i.e., gen_batch_size * gen_tp_size = 128 * 16 = 2048).

tests/integration/defs/perf/disagg/utils/common.py (1)

232-232: LGTM!

The parsed concurrency_list is correctly exposed in the returned configuration dictionary for downstream validation.
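For readers without the diff at hand, a minimal sketch of what such parsing might look like follows. It assumes concurrency_list is stored in the YAML as a space-separated string (for example "1 2 4 8 16 32") or as a single integer, which is how the values appear in the change summary. The function name is hypothetical and is not the code in common.py.

```python
import typing


def parse_concurrency_list(raw: typing.Union[str, int]) -> typing.List[int]:
    """Turn a value such as "1 2 4 8 16 32" or 1024 into a list of ints.

    Hypothetical helper; the real parsing in common.py may differ.
    """
    # str() covers the case where YAML loads a single value as an int.
    return [int(token) for token in str(raw).split()]


print(parse_concurrency_list("1 2 4 8 16 32"))
print(parse_concurrency_list(2048))
```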

tests/integration/defs/perf/disagg/utils/config_validator.py (2)

40-41: LGTM!

The new concurrency list validation is correctly integrated into the validation flow.
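A validation step of this kind could be sketched as below. This is an assumption-laden illustration rather than the PR's implementation: `find_concurrency_violations` and its parameters are hypothetical names, and the limit follows the rule quoted in the other review comments.

```python
import typing


def find_concurrency_violations(concurrency_list: typing.List[int],
                                gen_batch_size: int, gen_tp_size: int,
                                enable_attention_dp: bool) -> typing.List[int]:
    """Return the concurrency values that exceed the gen capacity limit."""
    # Limit as described in the review: bs x tp with attention DP, else bs.
    if enable_attention_dp:
        limit = gen_batch_size * gen_tp_size
    else:
        limit = gen_batch_size
    return [value for value in concurrency_list if value > limit]


# A tep8/bs32 config with attention DP off, before and after this PR.
print(find_concurrency_violations([1, 2, 4, 8, 16, 36], 32, 8, False))
print(find_concurrency_violations([1, 2, 4, 8, 16, 32], 32, 8, False))
```

Returning the offending values instead of asserting immediately lets the caller report every bad entry in one message. Whether the real validator asserts or collects errors is not visible from this conversation.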

56-60: LGTM!

The relaxed assertion from == to >= is appropriate and well-documented with the developer confirmation comment.

tensorrt-cicd · 2026-01-12T06:34:43Z

PR_Github #31498 [ skip ] triggered by Bot. Commit: 7d6695c

tensorrt-cicd · 2026-01-12T06:34:57Z

PR_Github #31491 [ skip ] completed with state ABORTED. Commit: aa9bfbd

fredricz-20070104 · 2026-01-12T06:50:37Z

/bot skip --comment "Not cover in CI pipelines"

tensorrt-cicd · 2026-01-12T06:55:54Z

PR_Github #31505 [ skip ] triggered by Bot. Commit: 1c42664

tensorrt-cicd · 2026-01-12T06:56:08Z

PR_Github #31498 [ skip ] completed with state ABORTED. Commit: 7d6695c

fredricz-20070104 · 2026-01-12T07:06:34Z

/bot run --skip-test

tensorrt-cicd · 2026-01-12T07:12:19Z

PR_Github #31509 [ run ] triggered by Bot. Commit: 1c42664

tensorrt-cicd · 2026-01-12T07:12:26Z

PR_Github #31505 [ skip ] completed with state ABORTED. Commit: 1c42664

tensorrt-cicd · 2026-01-12T09:53:09Z

PR_Github #31509 [ run ] completed with state SUCCESS. Commit: 1c42664
/LLM/main/L0_MergeRequest_PR pipeline #24360 (Partly Tested) completed with status: 'SUCCESS'

fredricz-20070104 · 2026-01-13T01:27:31Z

/bot reuse-pipeline

tensorrt-cicd · 2026-01-13T01:37:15Z

PR_Github #31646 [ reuse-pipeline ] triggered by Bot. Commit: 1c42664

fredricz-20070104 · 2026-01-13T01:51:24Z

/bot reuse-pipeline

tensorrt-cicd · 2026-01-13T01:57:09Z

PR_Github #31653 [ reuse-pipeline ] triggered by Bot. Commit: d4e83aa

tensorrt-cicd · 2026-01-13T01:57:18Z

PR_Github #31646 [ reuse-pipeline ] completed with state ABORTED. Commit: 1c42664
Can't reuse PR_Github #31509 (Partly Tested) with status: SUCCESS

tensorrt-cicd · 2026-01-13T02:39:55Z

PR_Github #31653 [ reuse-pipeline ] completed with state SUCCESS. Commit: d4e83aa
Reusing PR_Github #31509 (Partly Tested) for commit d4e83aa

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: Daniil Kulko <kulkodaniil@gmail.com>

fredricz-20070104 added 3 commits January 12, 2026 05:49

Fix disagg assertion, add new assertion for concurrency

07e5e0a

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

fix attention dp and batch size failed configuration files

2f7dfd5

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

fx pre-commit error

164fd0a

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

fredricz-20070104 enabled auto-merge (squash) January 12, 2026 06:09

Merge branch 'main' into feature/fix_disagg_asset

aa9bfbd

fredricz-20070104 requested review from ruodil and yingguo-trt January 12, 2026 06:14

ruodil approved these changes Jan 12, 2026

yingguo-trt approved these changes Jan 12, 2026

coderabbitai bot reviewed Jan 12, 2026

Merge branch 'main' into feature/fix_disagg_asset

7d6695c

coderabbitai bot reviewed Jan 12, 2026

Merge branch 'main' into feature/fix_disagg_asset

1c42664

Merge branch 'main' into feature/fix_disagg_asset

d4e83aa

fredricz-20070104 merged commit bbe535f into NVIDIA:main Jan 13, 2026
5 checks passed
