[TRTLLM-10857][chore] Move SaveHiddenStates spec dec mode to 1 model #11241

Merged
mikeiovine merged 4 commits into NVIDIA:main from mikeiovine:saver-1-model
Feb 20, 2026

Conversation

@mikeiovine
Collaborator

@mikeiovine mikeiovine commented Feb 3, 2026

Description

Remove another dependency on the deprecated 2-model Drafter machinery. Also attempt to document the existing behavior in the code.

Test Coverage

Existing tests pass.

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline, or from the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will always be ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md
and the scripts/test_to_stage_mapping.py helper.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

Summary by CodeRabbit

Release Notes

  • API Changes

    • Updated speculative decoding hidden-state handling: new SaveHiddenStatesResourceManager and SaveHiddenStatesSpecMetadata classes replace previous SaveHiddenStatesDrafter API.
  • Improvements

    • Enhanced hidden state capture and persistence mechanism for speculative decoding inference workflows.

@coderabbitai
Contributor

coderabbitai bot commented Feb 3, 2026

Walkthrough

The pull request refactors the speculative decoding hidden states saving mechanism from a Drafter-based to a ResourceManager-based architecture. The SaveHiddenStatesDrafter is replaced with SaveHiddenStatesResourceManager and SaveHiddenStatesSpecMetadata. The post-forward flow now calls the resource manager's process_and_save method instead of the drafter's post-hook, and the drafter's run_drafter_post method is removed.
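The flow described above can be sketched in a few lines. This is a toy illustration only: `ToyRequest`, `ToySaveHiddenStatesManager`, and `post_forward` are hypothetical names, the lifecycle hooks are no-ops, and nothing here reproduces the real `SaveHiddenStatesResourceManager` signatures. It only shows the shape of the change, where the executor calls `process_and_save` on a resource manager after the forward pass and skips it during warmup.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyRequest:
    """Hypothetical stand-in for a scheduled request."""
    request_id: int
    hidden_states: List[float]


@dataclass
class ToySaveHiddenStatesManager:
    """Hypothetical stand-in; not the real TensorRT-LLM class."""
    saved: Dict[int, List[float]] = field(default_factory=dict)

    # Lifecycle hooks named in the walkthrough; no-ops in this sketch.
    def prepare(self, batch: List[ToyRequest]) -> None:
        pass

    def update(self, batch: List[ToyRequest]) -> None:
        pass

    def free(self, request: ToyRequest) -> None:
        pass

    def shutdown(self) -> None:
        pass

    def process_and_save(self, batch: List[ToyRequest]) -> None:
        # Copy each request's hidden states into an in-memory store.
        # The real code writes to disk; that part is omitted here.
        for req in batch:
            self.saved[req.request_id] = list(req.hidden_states)


def post_forward(manager: Optional[ToySaveHiddenStatesManager],
                 batch: List[ToyRequest], is_warmup: bool) -> None:
    # Mirrors the described gating: only save outside of warmup.
    if manager is not None and not is_warmup:
        manager.process_and_save(batch)
```

With this shape, the executor no longer needs a drafter object for save-hidden-states mode; the manager is looked up and called directly.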

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Hidden States Implementation: tensorrt_llm/_torch/speculative/save_hidden_state.py | Redesigned from Drafter-based to ResourceManager-based pattern. Introduced SaveHiddenStatesResourceManager (inheriting from BaseResourceManager) with resource lifecycle methods (prepare, update, free, shutdown) and new process_and_save method for post-forward hidden state capture. Added SaveHiddenStatesSpecMetadata dataclass to configure layer capture. Includes internal buffer management, disk I/O via _write_to_file, and per-request processing logic. 171 lines added. |
| Public API Exports: tensorrt_llm/_torch/speculative/__init__.py | Replaced SaveHiddenStatesDrafter export with SaveHiddenStatesResourceManager and SaveHiddenStatesSpecMetadata in __all__. Updated imports to reflect new public entities. |
| Executor Integration: tensorrt_llm/_torch/pyexecutor/py_executor.py | Replaced unconditional drafter.run_drafter_post() call with conditional SPEC_RESOURCE_MANAGER.process_and_save() invocation (gated by not-warmup condition). Passes scheduled_batch and optional spec_metadata from model_engine to new resource manager flow. |
| Drafter Cleanup: tensorrt_llm/_torch/speculative/drafter.py | Removed run_drafter_post method and its docstring from Drafter class, eliminating the post-drafter hook API. |
| Utility & Integration Updates: tensorrt_llm/_torch/speculative/utils.py, tensorrt_llm/_torch/speculative/eagle3.py, tensorrt_llm/_torch/speculative/interface.py | Updated get_spec_metadata and get_spec_resource_manager to return new SaveHiddenStatesSpecMetadata and SaveHiddenStatesResourceManager for save-hidden-states mode; renamed fields (num_layers → num_model_layers, eagle3_resource_manager → resource_manager). Removed SaveHiddenStatesDrafter return in get_spec_drafter. Extracted _get_eagle3_default_capture_layers helper. Dropped SAVE_HIDDEN_STATES from has_spec_drafter union condition. |
| Configuration Logic: tensorrt_llm/llmapi/llm_args.py | Updated SaveHiddenStatesDecodingConfig.model_post_init to handle -1 layer capturing without automatic insertion. Modified num_capture_layers to account for aux_hidden_states presence via indicator int(-1 not in eagle3_layers_to_capture); expanded docstring to clarify tensor saving behavior. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 15.38% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: moving SaveHiddenStates spec decoding mode to a 1-model (single-model) implementation, which is the core objective reflected in the raw summary changes. |
| Description check | ✅ Passed | The description provides a concise explanation of what is being done (removing dependency on deprecated 2-model Drafter machinery and documenting behavior) and mentions test coverage, though it lacks some detail about the specific technical changes and their rationale. |


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
tensorrt_llm/llmapi/llm_args.py (2)

1-5: ⚠️ Potential issue | 🟡 Minor

Add required NVIDIA copyright header.

This TensorRT‑LLM source file starts directly with imports; please add the standard NVIDIA header with the latest modification year at the top of the file.
As per coding guidelines: All TensorRT-LLM source files (.cpp, .h, .cu, .py, and other source files) should contain an NVIDIA copyright header with the year of latest meaningful modification.


1051-1067: ⚠️ Potential issue | 🟠 Major

Align SaveHiddenStates validation with the new default-capture behavior.

model_post_init now treats eagle3_layers_to_capture=None as valid, but validate() still rejects falsy values. If validate() is invoked, the default-capture path will fail. Either allow None explicitly or remove the default path/documentation.

🔧 Proposed fix
 def validate(self) -> None:
-    if self.output_directory is None or not self.eagle3_layers_to_capture:
-        raise ValueError(
-            "Save directory and layers to capture must be provided")
+    if self.output_directory is None:
+        raise ValueError("Save directory must be provided")
+    if self.eagle3_layers_to_capture is not None and len(self.eagle3_layers_to_capture) == 0:
+        raise ValueError("layers_to_capture must be non-empty when provided")
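To make the effect of the proposed validation concrete, here is a self-contained sketch. `ToySaveHiddenStatesConfig` is a hypothetical stand-in with only the two fields the check touches; it is not the real `SaveHiddenStatesDecodingConfig`, and the field types are assumptions. The `validate` body follows the suggested diff above.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ToySaveHiddenStatesConfig:
    """Hypothetical stand-in for SaveHiddenStatesDecodingConfig."""
    output_directory: Optional[str] = None
    eagle3_layers_to_capture: Optional[Set[int]] = None

    def validate(self) -> None:
        # A save directory is always required.
        if self.output_directory is None:
            raise ValueError("Save directory must be provided")
        # None means "use the default capture layers" and is allowed;
        # an explicitly empty collection is rejected.
        if (self.eagle3_layers_to_capture is not None
                and len(self.eagle3_layers_to_capture) == 0):
            raise ValueError(
                "layers_to_capture must be non-empty when provided")
```

Under this version, a config with a directory and `eagle3_layers_to_capture=None` passes, which is the default-capture path the original check would have rejected.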
tensorrt_llm/_torch/speculative/__init__.py (1)

1-3: ⚠️ Potential issue | 🟡 Minor

Add required NVIDIA copyright header.

Please add the standard NVIDIA header with the latest modification year at the top of this source file.
As per coding guidelines: All TensorRT-LLM source files (.cpp, .h, .cu, .py, and other source files) should contain an NVIDIA copyright header with the year of latest meaningful modification.

tensorrt_llm/_torch/speculative/utils.py (1)

1-3: ⚠️ Potential issue | 🟡 Minor

Add required NVIDIA copyright header.

Please add the standard NVIDIA header with the latest modification year at the top of this source file.
As per coding guidelines: All TensorRT-LLM source files (.cpp, .h, .cu, .py, and other source files) should contain an NVIDIA copyright header with the year of latest meaningful modification.

🤖 Fix all issues with AI agents
In `@tensorrt_llm/_torch/pyexecutor/py_executor.py`:
- Around line 1598-1607: The SaveHiddenStates feature is not validated against
pipeline parallelism causing silent failure when pp_size > 1 because only
_executor_loop calls spec_resource_mgr.process_and_save; either enforce
pp_size==1 in the SaveHiddenStatesDecodingConfig initializer (add an
assertion/validation referencing SaveHiddenStatesDecodingConfig and
llm_args.pp_size) or add the same hook into _executor_loop_pp (check for
ResourceManagerType.SPEC_RESOURCE_MANAGER, getattr(self.model_engine,
'spec_metadata', None), and call spec_resource_mgr.process_and_save for the
final pipeline rank) so SaveHiddenStates always runs or fails loudly when PP is
enabled.

In `@tensorrt_llm/_torch/speculative/save_hidden_state.py`:
- Around line 67-68: The function get_needed_resource_to_completion currently
declares an unused parameter request which triggers lint ARG002; update the
function signature in get_needed_resource_to_completion to rename request to
_request (e.g., def get_needed_resource_to_completion(self, _request:
LlmRequest):) or explicitly remove the variable by adding del request at the
top, so the linter stops flagging the unused parameter while preserving the
existing return behavior.
- Around line 135-171: In SaveHiddenStatesSpecMetadata.__post_init__, the
default capture computation calls
_get_eagle3_default_capture_layers(self.num_layers) which uses the wrong field;
change it to call _get_eagle3_default_capture_layers(self.num_model_layers) so
defaults are based on the actual model layer count, keep the rest of the logic
(sorting layers_to_capture, handling -1 last-layer marker, and setting
num_capture_layers) unchanged and ensure the reference to
SaveHiddenStatesSpecMetadata and method __post_init__ are updated accordingly.
- Around line 1-4: Add the standard NVIDIA copyright header (with the latest
modification year) at the very top of
tensorrt_llm/_torch/speculative/save_hidden_state.py before any imports; ensure
the header matches the project's canonical NVIDIA header text and formatting
used across other TensorRT-LLM source files and includes the appropriate year of
last meaningful modification.
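Of the two options listed for the py_executor.py item above, failing loudly when pipeline parallelism is enabled is the simpler one. A rough sketch of such a guard follows; the function name and the bare `pp_size` argument are hypothetical, and where the check would actually live (config initializer or executor setup) is left open.

```python
def check_save_hidden_states_pp(pp_size: int) -> None:
    """Hypothetical guard: reject pipeline parallelism in save-hidden-states mode."""
    if pp_size != 1:
        raise ValueError(
            f"SaveHiddenStates requires pp_size == 1, got pp_size={pp_size}")
```

Raising at configuration time turns the silent no-op described in the review comment into an immediate, visible error.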
🧹 Nitpick comments (5)
tensorrt_llm/_torch/speculative/eagle3.py (1)

111-112: Add a Google-style docstring for the new helper.

Keeps the new utility compliant and clarifies the tuple semantics.

✍️ Suggested update
 def _get_eagle3_default_capture_layers(num_layers: int):
+    """Return default Eagle3 layer indices to capture.
+
+    Args:
+        num_layers: Total number of layers in the model.
+
+    Returns:
+        Tuple of layer indices to capture.
+    """
     return (1, num_layers // 2 - 1, num_layers - 4)

As per coding guidelines: Use Google-style docstrings for Python classes and functions, which can be parsed by Sphinx.

tensorrt_llm/_torch/speculative/interface.py (1)

135-136: Consider documenting the updated has_spec_drafter behavior.

✍️ Suggested update
     def has_spec_drafter(self):
+        """Return True if this mode uses a spec drafter."""
         return self.is_eagle3() or self.is_draft_target() or self.is_ngram(
         ) or self.is_user_provided() or self.is_mtp_eagle()

As per coding guidelines: Use Google-style docstrings for Python classes and functions, which can be parsed by Sphinx.
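The behavior change behind this nitpick (save-hidden-states mode no longer counts as having a drafter) can be illustrated with a toy enum. `ToySpecMode` and `_DRAFTER_MODES` are hypothetical; the member names only loosely mirror the predicates in the snippet above and are not the real enum in interface.py.

```python
from enum import Enum, auto


class ToySpecMode(Enum):
    """Hypothetical stand-in for the speculative decoding mode enum."""
    EAGLE3 = auto()
    DRAFT_TARGET = auto()
    NGRAM = auto()
    USER_PROVIDED = auto()
    MTP_EAGLE = auto()
    SAVE_HIDDEN_STATES = auto()

    def has_spec_drafter(self) -> bool:
        """Return True if this mode uses a spec drafter."""
        return self in _DRAFTER_MODES


# SAVE_HIDDEN_STATES is deliberately absent: after the PR it is served
# by a resource manager rather than a drafter.
_DRAFTER_MODES = frozenset({
    ToySpecMode.EAGLE3,
    ToySpecMode.DRAFT_TARGET,
    ToySpecMode.NGRAM,
    ToySpecMode.USER_PROVIDED,
    ToySpecMode.MTP_EAGLE,
})
```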

tensorrt_llm/_torch/speculative/__init__.py (1)

6-22: Keep save_hidden_state imports namespaced.

The guidelines require module‑namespace imports. Consider importing the module and re‑exporting the symbols.

♻️ Suggested refactor
-from .save_hidden_state import (SaveHiddenStatesResourceManager,
-                                SaveHiddenStatesSpecMetadata)
+from . import save_hidden_state
+
+SaveHiddenStatesResourceManager = save_hidden_state.SaveHiddenStatesResourceManager
+SaveHiddenStatesSpecMetadata = save_hidden_state.SaveHiddenStatesSpecMetadata

As per coding guidelines: Always maintain the namespace when importing Python modules, even if only one class or function from a module is used.

tensorrt_llm/_torch/speculative/utils.py (1)

17-18: Keep save_hidden_state imports namespaced.

To comply with the import‑namespace guideline, import the module and qualify usages.

♻️ Suggested refactor
-from .save_hidden_state import (SaveHiddenStatesResourceManager,
-                                SaveHiddenStatesSpecMetadata)
+from . import save_hidden_state
@@
-        return SaveHiddenStatesSpecMetadata(
+        return save_hidden_state.SaveHiddenStatesSpecMetadata(
@@
-        return SaveHiddenStatesResourceManager(
+        return save_hidden_state.SaveHiddenStatesResourceManager(

As per coding guidelines: Always maintain the namespace when importing Python modules, even if only one class or function from a module is used.

Also applies to: 84-95, 145-151

tensorrt_llm/_torch/speculative/save_hidden_state.py (1)

2-12: Keep internal imports namespaced.

To comply with the namespace‑import guideline, import modules and qualify the base classes.

♻️ Suggested refactor
-from ..pyexecutor.resource_manager import BaseResourceManager
-from .interface import SpecMetadata
+from ..pyexecutor import resource_manager
+from . import interface
@@
-class SaveHiddenStatesResourceManager(BaseResourceManager):
+class SaveHiddenStatesResourceManager(resource_manager.BaseResourceManager):
@@
-class SaveHiddenStatesSpecMetadata(SpecMetadata):
+class SaveHiddenStatesSpecMetadata(interface.SpecMetadata):

As per coding guidelines: Always maintain the namespace when importing Python modules, even if only one class or function from a module is used.

Also applies to: 18-18, 136-136

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #34689 [ run ] triggered by Bot. Commit: b70c4ec

@tensorrt-cicd
Collaborator

PR_Github #34689 [ run ] completed with state SUCCESS. Commit: b70c4ec
/LLM/main/L0_MergeRequest_PR pipeline #26766 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #34813 [ run ] triggered by Bot. Commit: daffa6e

@tensorrt-cicd
Collaborator

PR_Github #34813 [ run ] completed with state SUCCESS. Commit: daffa6e
/LLM/main/L0_MergeRequest_PR pipeline #26853 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #34978 [ run ] triggered by Bot. Commit: c52e55b

@tensorrt-cicd
Collaborator

PR_Github #34978 [ run ] completed with state SUCCESS. Commit: c52e55b
/LLM/main/L0_MergeRequest_PR pipeline #26986 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #35120 [ run ] triggered by Bot. Commit: a159e64

@tensorrt-cicd
Collaborator

PR_Github #35120 [ run ] completed with state SUCCESS. Commit: a159e64
/LLM/main/L0_MergeRequest_PR pipeline #27113 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #35373 [ run ] triggered by Bot. Commit: 33e80d4

@tensorrt-cicd
Collaborator

PR_Github #35373 [ run ] completed with state SUCCESS. Commit: 33e80d4
/LLM/main/L0_MergeRequest_PR pipeline #27320 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #35518 [ run ] triggered by Bot. Commit: 876761b

@tensorrt-cicd
Collaborator

PR_Github #35518 [ run ] completed with state SUCCESS. Commit: 876761b
/LLM/main/L0_MergeRequest_PR pipeline #27432 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
NPn31ui8k5gQFSGtlsTAAp3dUhPQgyFXkQXYFojkI3RzGPmZW1wficy3ypFDBEmQEDEpjRWghxxSwvSvVWLWjvQt0fRvTiw0P4C0ORyI1WygFwR6QI2aCnRRzR1QL92rzoVCOK39zK0mHEV1GiPD2ukUUaxUUSI0RSO0TSKTwyJT0dwzws1yOB3dwU1lWUxoBKJKjKOkgqJciqMJKNNqJ9HsROh3AuhhCuhtzAHzlBFY0fjaRri9G6Soy8xLBlz8xFy1MIQtw/EvBh0emPCOwACF3wSBCz6MC1ClOgZRmzrZZV5VxlqRsA1FClUYHpEAlpLgtgtYxxLgQTnJml4AVp4ACsgFFZj4qwiByJ1V6hcIAyOwCsPlvUPoMAtgWcv5YJmAnsdzZwiYawf4vRNzuyBBeyBUKBP9IBWouTWVPocyEU8zRMNlgCdlGgK4yzkAM5DhAK4sgDK1qhN91oyBJ8HAbN2gyCYEqCDSFAMBlyURVCY4sQAIHpjxXCKCHCMgNzuMw45ySBcIZQ9xIpcCCAoQSJVzZB1zGUWw+glBqhstcZWp/xAJagwCtt7MWlnNiZP1OgThvhFVECSDlpqCELV9X8AUfigjyT9o95vZfY6AzzOg2ZpRZyCtdBuh0VaBMZIBwZ7E3UyCoKv1RJxJEYXMgTmVjoIRkBcCWKnDGUvIhV6BcDCKZ85L58nz1NCNdCRVyhiTNggtIA5QzL2cQtINvMpkizmyKBwY6zQRGz/N5l+N3NrgALorhlYqS0SxwKQDwZiqdlVMolYzSVRIPB6BI57sFya5ultCDyfM6MwB0g+QZZ6Ap03NBMww140CBc6MF9M5gxs4uc84C5YBDc9BvYlEmD61HM5V+DQoqQLxCiKA116rNjF8nMczCAuieA4zTw41KAsrBNwUhdhrplRrOdc585cBUhZrjrSU0harAt3M7k9pmc2rbrwZDzHwXqYlTqPqKAIrEADhcB1x+QMBwYXr+R7Qf4yYGQe08RCVcl/kZzHVEb1AhcCBIoqwvAIqB9KLfqEqhcmBbpcB4bdA5qdi99kAybcDAaVwSbLDyA6Bc4bq0QCKWAT8RwAaSBedcAXqGatrkBIgIr/5k5Q5w4M57qmdWacIG4GBS5fyK5gafYk5QRkBla4YkwWKkYV4i5hDjxeNkAFLQZtag5fqRbGUu8YQHpqY01cbwFOgiaSAIrrbA5k5WahaRai4Wim5Na6a7JxLag4RxKXFAE+U9wHKxAB5uNv4L4iJeUezstAF+A8BxLGhqxcANy6JTUcAqKbyR40y30sB8KCafLGUV9HxFMKhsM04pBaJuBfRb0W7c5eNva3b8b8BCbssSBaajcu8uFXwu6mdeMZTBylpUZ/wYt59gS/1BgEVNq10eTnIByq1DLMY+qoBnoNwahmKGR2KsAadsCy6f47y+y+BIaqDPofBmbl9aLnhiKlMtz2gAAqJ/K/HypwxICKvuplSgCQeuTkn2X4T6DMJc1HDQPu7DKUMBjwCK8dXOz2oaEBSE/YsQb7e5SKNBvAAGeADIMQdIWQXCQerwa21ReAPcM8vG+Uq8ZEGWcGuvBvQoogJhngfAFdUQ2xW7CCr0ToOEB+O+LEdoZmasEBWVFOz0THZC6fegUyvy4OGuxRoix1MyqKAMuELuvlJ6jQD/P9F8llbiT6bpNckgGBYMBgCEZCZjCKUQr8qDUQRsbMOUZ0Zc+Ysk6AAATRVkijlGKFVggUuGuJlKKF4YQWptkXaEcapH9AXrNXPTZNUNvSUPvWcHScVKDQJWVJLKCvED0MiSistxMLeLMJcfisFwysoBSvrPSpFw8I5k9N8O9MG19PyP9JIt4w8T/RyuqFAoZkqcgCJ2qaKp/KEdKqmZAJaeMzaf+w6ezwwxGx6Y/tyX6agDx02NIXIQNKAoJi+AA2hn3rGYmekCaaSrs0gMinBlgaIC4EKgubpTcdwqIE8bQtR0KgYvqDOwGw0FoCsffqbs/stiHAorQCo
pZuFtZ1cyZUhehZfr3LcNFu93c3GYKuowLL+sSoC0fIGucGutxekEwNp3p3GsZymqetgGGUVp5zhcQA8IisxYg0KsuZJbqcfKuvJsF1GqwIpboCpcerpA50pa5wZfwiZfRcE1ZeEzipxYpq5Y0G+uJYpv5fJcURFq4ECAsUlaBplagDlc81pSufxdqShphrhovuZYxZebNfmQ0FJqhd5bownxYZpptbDpXUNfOaxfzI5aVZFydY5oAW5pJb5oxi2TrDJbpy1bhZ1b1dhfwltdlftc5eDZluDjlvTgvsFYmvDZFvZ3peTb51VvVqEaeZebKorkKheuVxIF9eNYS0VdqeDZ9p1rtrhdjf51LY0V1eYH1YLvLZDuqCrf9e/PLQ1uqDrbDobabfTaDeubVyRv7qoeHq9aN3nZZYAAEdlQLQLq3ZmdkZmp2hGWngF4I+Rarwr+gMXq6B6fKuBxc64pd8gtkfAV1EgFdjdyAZWMXNyyLwWqRFdIh/3ZXn6r9X6SJgOddIAwOBMMWgXGLvA1pLNkP6g4OUMNABAeTqBwgNRwOjWeahc2z8Xn2JdnQVofBF222kqf3t273ZWXLegnDYPQOF2J2wtj2K47NwCBLnFY3mWUt810s3Usscs7g8sCtwaDBfdB1aEytwh4hwhxF4gNR0iDEjEsinc/CfS3d8i8AOLgzptfFZswyaoK9qjozIlokTr6BWIZwGoXmbdGVfiACHXKAth3PACeOW4152pGqQFcyuPS0/OvRFlQvW26MuX+G8QZVuMuyR5N6+IoAB9bEK2QCuBCCMumhS42gNAkAu7G7XkLb4aL8nNxb98j2z2QDLxsY6DxIfOa3qgzyMgkVagqwMJAEuAXDVGw6WPO5XCgT4CbCSPhckqw7xuyPHW/10uWjpuRdsuLxcvG4Cuiv1nQXxlyutid9Gb4sFXA26P8X6v6ZrLFLjjPOKAgSoRHQxCYc3wqBI5X87yqRZyyEgvEUEVAKnz5uS4KdHJ1tvUJwaQsQQE+S9utrPoXOTs4upAEvskkuf4WwgS2ZF1KB9mhUJDaQQERnjSoBLgZMIowUf8dV1pfUvAxAvvjlVTdDiN8q2XsWjuYvizIvGeA3EAWuvRDoTqDv2W6VYfUYfOrvvOgiufUFcR2IsAMhW9MAqRQepDxTUmpSFD1pMm5Scn1C8nbIkdtSv1iniNccNjCVUd9np8PUvVsujDAw/u1bR20BIYYZduqvkAavxNp2vRCDEZpVOyLbMGTLEzFr5BmvwvYYFmpUumseGx3nUL0KmGnystuuvhsnnu+AOuVtwWdwffEu/eAuN4EyU+vs0/n9VH5dXKHmcDl8AzXDwiZTBuMInCy/WOK/cDq/VGE/r3aBk+nui/EVj0M/dwgjXXpkJuTuKQtwhxFEOvdzFEKbR/C1w/2gqvoeou1lwunzCeDk6Bjm9egNwjLYieqQjsOMhcr78YnMA5XjahSrV+JkM2krcI3eItQ7aDzub/2faUbdcJAAyAhD9q52VwjC97++LYxlAG5C7EOgKHFbBzFt7D9eaM3MEDlyeRn8VaRySHtV1v5XdTuMWc7kAKXbmsZCqWa1AKnE5D1csHCGTiaUNKCIREkRCILEGtKxE2gdpaPM1iSLx4XSmnTIqUmyJbAYoVBeAHwIsQhwLin4PKJNhDJmdS8lnSMksRs5/oXObzJsJ82XJcBWoHcBvvx2aQjQJwRJPvGgOZQxZXwo3XUE4QIoaN0Q0IY8A2wKaqVOSqNY8BEFHhUlOw7QGsOcgh6DkVAxdN2shCXhwh64sjPptYEYKsg00Ipf3nuiIieAQqbxfSspVZhhtToeMC3hGCJgkwCc0aPEhJBYC1hmgIzYknFicGowrq2Q46GJUwCyBkIJwKiGOAUG38bQSEDxl41RxqDWYy+f+q4R0FL8tke+T6MgwOLpDP0qAK6jPwp7+pd4CKcOCeHhzpAxwAAdRZiD9jiZg9RjJXdQNsY4VIF3pAGmDODRgW
CBYUsJMGrDH2Fg91KCGHaxw0BOwrwHDROEPtKCMlWGP2Cuiyh9wDlS8M7WoAXgmMQ4cGE4NUrGYHhqjcwc8M3xTg3uDAKmkOFuQXhmYd2FsptkmjaV24QQzGn3HRr2AfQ2TGgC9D2QvZmUDjTVGvDAhk9vQmce0KMHJLSMTM6QeRhfCxHgwUBjKcyKXV6bjJYYEaLpMxBjTb42RJXHjN2RlDyBVKq1FADZQnBkFsRVIVAAAG8IgAAX1PTyd9+kAURPEHETRBGB8iZgVHgSLqJWsqODrG6QEgGBDIVlOLOZEkjlEmB9kIcI5DQDOQgkS2DyFpG8hqBfI+kAKOaKEh2QTyEBQBL9XtLd1qgTjM0RaIACc8QeMJEEjHTB4wGoNALEFiAah4wAgSYGgHjGRBIgtAaIGgAmBRBIxtADUBqGmCxBpgtAaYPECzHxB/IgUC0TULcRlBg4+owUEzhEj1jfREAeaurQuKU1bGTlZTOGJ9EGA5RAmQqEgFsA1kLyUIWgJ43tG4ArAW2OgIVGy765yKE4yGsiFqoziNwEIWwGuPggbjsIE4pAFrCkCmJuMR4klHOE3H9BCoD0bnLdGuAbgqg7JRAHKEHFHj9i94yAI+LKA2Bbo7gTEiQC/HQQfxXEP8QBOfEYAbQlIh0COHAl2NIJd8U8Q+MwhzjLgH4VnO+KPHFR0J/47RshIhCTISCiAI8T4EQ6QBxxTHB8VBDsaDBj0+EkCTlgYkQgfm1Eh8QqlZCoS/xTHQqPEnsSY9nQLEicMUzYleBnA9aL2LYkrDVhno8gICHaHgCjBshMoYzP7i4DuhkILzVWt6CaHNhmBbYLoN2AwgmVSI1hMyR4DD5RsqImFGkZSVQCIwmA1CQQCIB4h5BxhuDegIvwOhOiIywSFbJ+E4l0T/xLYfCbNGtTZhQpdEwqKQxaGoQMgN4k8VxP/GdU9gqqEiUxLYD4SJJJAbxEx0VGET+gtEuKexJykFSnm8ElSZRyvDfiSpd7QqDxIolnYoJjUh8UJPYQjgxJIYW0PaDqlqUr+wEY+BWHORdSRJWABqMzG3yoBRgyEdcPCCFo6TjwApMkViADD1gfQpDCMFGCNqucZQKYayCKi9gUQJ+4COHMIXSBcivYVEBQl/CYCXiAsOg5EO1mBLOZHKpYFgH1PYp9AVAC6b3uWHknNAaAqQa1NTGCmMpRoJDfya+2wSxSBJEUp5lFKbAIympCUr5klKqnHi7xHU9KfaEykrRspzEp5spIGk9TqJxU6iWVIEkVSSZ/418QwFezHhPGT00gGjO4nDheJ643GWlMElaphJqhfCYzOZkclo0VAdiMgHCCxANAuoaIAAFJHJICEyEmjUTVguqmgFISzhIYbZEE71XycgGiCRANAOY+WRoA5n/iBcAEHqU83mGrtGgG4UWXCIRgTge8WIaUieQ/B+Dgw7cToLiHECIB4w8gFyeLICxBg9Zcac2XjMKhIz/xKMmKdHIykBCPAxM3KaTMdlEjswFEymQJkSCETCo2jWwDVPJl1h8JaAOWepyzGRiBAqYyIKIHCDTASAkYtECQHiCTBIxuoaYNMBrkkp4g0QI2ZGI1DhBIxkY+MLqEND5wBUuodMePNiC6hwgAgeINMF1C0B4wsQcILFILnvglxAQTwNjMKixA+y0QTMX3IYCRB1OkQeMJGKiBoBkxtAeeTWBIAahdQzc8IAwAEDNyBAtAQsQWNoBzzqxVY+IKID/nqcNQowcCLEEjGbzCR7JVmcoFIAvkMeK0VZEeJpn/jFmORTpgZxGybViieUVBXzIIDVAPAhhd2bxCPEnE+ZbslVIgHmH7ARZmc9tEePCCUzGphUDBXpywU55umyAjkTUA8QEKwp7C5iCtFIU0LmFlCoRdQtWJ0L2oDCj8UeKqxFS2FHC5ZnkRGwmCBFXANBQ+KIWiLreJEChdHOkW8RZFsAeRU2FanwdWF1E9hTpy9KA51FjiIznOC0U0TCFIikhQYqsWSK
4pJim9GYosVZyjxUtZRbYo4X8DuAggmKCIPbRuKdF6CzxWItWJGKqF3iwJRnIUXjAbFAk1RYDjyhZQ0oGUQpTlHwXaKPFxC5JeQvGDGL0l9CzJZYpCU5z+giogwK0u7FQAwwfY0gBPkHHBxOx+gIAA== -->\n\n<!-- internal state end -->"},"request":{"retryCount":1}},"response":{"url":"https://api.github.com/repos/NVIDIA/TensorRT-LLM/issues/comments/3843719469","status":401,"headers":{"access-control-allow-origin":"*","access-control-expose-headers":"ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset","connection":"close","content-security-policy":"default-src 'none'","content-type":"application/json; charset=utf-8","date":"Wed, 11 Feb 2026 16:06:37 GMT","referrer-policy":"origin-when-cross-origin, strict-origin-when-cross-origin","server":"github.com","strict-transport-security":"max-age=31536000; includeSubdomains; preload","vary":"Accept-Encoding, Accept, X-Requested-With","x-content-type-options":"nosniff","x-frame-options":"deny","x-github-media-type":"github.v3; format=json","x-github-request-id":"5013:B611A:C6C961:35AE01B:698CA90D","x-xss-protection":"0"},"data":{"message":"Bad credentials","documentation_url":"https://docs.github.com/rest","status":"401"}}}

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #35645 [ run ] triggered by Bot. Commit: c92f8a6

@tensorrt-cicd
Collaborator

PR_Github #35645 [ run ] completed with state DISABLED
CI server is currently disabled for unplanned maintenance. Estimated completion time: 8 AM PST on 2/11.

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #35808 [ run ] triggered by Bot. Commit: c92f8a6

@tensorrt-cicd
Collaborator

PR_Github #35808 [ run ] completed with state SUCCESS. Commit: c92f8a6
/LLM/main/L0_MergeRequest_PR pipeline #27658 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@mikeiovine
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #35921 [ run ] triggered by Bot. Commit: d16ac26

@mikeiovine
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #35937 [ run ] triggered by Bot. Commit: 0c712f9

@tensorrt-cicd
Collaborator

PR_Github #35937 [ run ] completed with state SUCCESS. Commit: 0c712f9
/LLM/main/L0_MergeRequest_PR pipeline #27753 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@mikeiovine
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #36087 [ run ] triggered by Bot. Commit: 5664232

@tensorrt-cicd
Collaborator

PR_Github #36087 [ run ] completed with state SUCCESS. Commit: 5664232
/LLM/main/L0_MergeRequest_PR pipeline #27884 completed with status: 'SUCCESS'

Collaborator

@zheyuf left a comment

Looks good to me

@mikeiovine merged commit fa2bfa5 into NVIDIA:main on Feb 20, 2026
5 checks passed
@mikeiovine deleted the saver-1-model branch on February 20, 2026 15:32

5 participants