[TRTLLM-10666][chore] Refactor request fetching logic for better separation of concerns #10988
Conversation
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
/bot run --disable-fail-fast

PR_Github #33557 [ run ] triggered by Bot. Commit:

PR_Github #33557 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #33586 [ run ] triggered by Bot. Commit:

PR_Github #33586 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #33596 [ run ] triggered by Bot. Commit:

PR_Github #33596 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #33644 [ run ] triggered by Bot. Commit:

PR_Github #33644 [ run ] completed with state
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
/bot run --disable-fail-fast

PR_Github #33678 [ run ] triggered by Bot. Commit:
📝 Walkthrough

A major refactoring of the executor request queue system that decouples complex request-handling logic from `ExecutorRequestQueue` into a new standalone utility module. The `ExecutorRequestQueue` class is significantly simplified to focus on basic queue operations, while `PyExecutor` now orchestrates request fetching, validation, and broadcasting through a waiting-queue workflow. A new `request_utils` module provides distributed scheduling, context partitioning, and request merging functionality for various CP/PP/TP configurations.

Changes
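As a rough illustration of the waiting-queue workflow described in the walkthrough, the fetch step can be sketched as a pure helper. Names and signatures here are assumptions based on the summary, not the actual TensorRT-LLM API:

```python
from collections import deque

# Hypothetical sketch of a pure "fetch from waiting queue" helper of the kind
# the walkthrough describes; the real request_utils module may differ.
def get_from_waiting_queue(waiting_queue, max_req_count):
    """Pop up to max_req_count items from the left of the waiting queue."""
    fetched = []
    while waiting_queue and len(fetched) < max_req_count:
        fetched.append(waiting_queue.popleft())
    return fetched

waiting_queue = deque(["req0", "req1", "req2"])
batch = get_from_waiting_queue(waiting_queue, 2)  # ["req0", "req1"]
```

Keeping the helper free of executor state is what makes it easy to move into the new scheduler later, as the PR description proposes.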
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tensorrt_llm/_torch/pyexecutor/py_executor.py (1)
1-11: Add the standard NVIDIA copyright header.

This source file should include the NVIDIA header (2026 as the latest modification year).
As per coding guidelines, please add the standard NVIDIA header above the imports.
tensorrt_llm/_torch/pyexecutor/executor_request_queue.py (1)
78-80: Use `Optional[List[int]]` for Python 3.8+ compatibility.

The pipe union syntax (`List[int] | None`) requires Python 3.10+, but the project targets Python 3.8+. Replace it with `Optional[List[int]]`, which is already imported from `typing`.

Fix:

```diff
-    def _generate_child_request_ids(
-            self, request: ExecutorRequest) -> List[int] | None:
+    def _generate_child_request_ids(
+            self, request: ExecutorRequest) -> Optional[List[int]]:
```
🤖 Fix all issues with AI agents
In `@tensorrt_llm/_torch/pyexecutor/executor_request_queue.py`:
- Around line 1-7: Add the standard NVIDIA copyright/header (with 2026 as the
latest modification year) to the top of the file, placing it above the existing
import block (i.e., above the line that starts with "import dataclasses");
ensure the header matches the project's standard format and includes the
copyright notice, years, and any required license wording.
In `@tensorrt_llm/_torch/pyexecutor/py_executor.py`:
- Around line 2201-2210: The new_active_requests_queue_latency_ms metric is
being accumulated forever because _update_new_active_requests_queue_latency uses
+=; change the behavior so the per-iteration latency isn't cumulative: either
assign the computed latency directly to new_active_requests_queue_latency_ms in
_update_new_active_requests_queue_latency (replace += with =) or have
_get_new_active_requests_queue_latency return the current value and reset
new_active_requests_queue_latency_ms to 0 after reading. Locate the methods
_update_new_active_requests_queue_latency and
_get_new_active_requests_queue_latency and adjust the handling of
new_active_requests_queue_latency_ms (and keep the call to
executor_request_queue.calculate_queue_latency(new_requests, now) intact).
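The fix described above can be sketched as follows. The method names mirror the review comment, but the container class is a hypothetical stand-in for the real executor:

```python
# Illustrative sketch of the "read-and-reset" option from the review comment.
class QueueLatencyStats:
    def __init__(self):
        self.new_active_requests_queue_latency_ms = 0.0

    def _update_new_active_requests_queue_latency(self, latency_ms: float):
        # Assign instead of '+=' so the metric reflects only this iteration.
        self.new_active_requests_queue_latency_ms = latency_ms

    def _get_new_active_requests_queue_latency(self) -> float:
        # Return the current value and reset it, so repeated reads do not
        # report stale or accumulated latency.
        value = self.new_active_requests_queue_latency_ms
        self.new_active_requests_queue_latency_ms = 0.0
        return value
```

Either the assignment or the read-and-reset alone fixes the unbounded growth; combining both makes the metric robust to callers that skip a read.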
- Around line 9-12: Add Deque to the typing import list (alongside Callable,
Dict, Iterable, List, Optional, Tuple, Union) and replace all uses of
subscripted deque[...] with Deque[...] to keep Python 3.8+ compatibility;
specifically change the class attribute annotated with deque[RequestQueueItem]
and all method parameters typed as deque[RequestQueueItem] (references to
RequestQueueItem and any variables like request_queue) to use
Deque[RequestQueueItem] instead.
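A minimal sketch of the suggested annotation change, with `RequestQueueItem` as a simplified stand-in for the real class:

```python
from collections import deque
from typing import Deque

# Stand-in for the real RequestQueueItem; illustrative only.
class RequestQueueItem:
    pass

# Python 3.8-compatible annotation: typing.Deque instead of the
# subscripted builtin deque[...], which needs Python 3.9+.
request_queue: Deque[RequestQueueItem] = deque()

def drain(queue: Deque[RequestQueueItem]) -> int:
    """Pop every item and return how many were removed."""
    count = 0
    while queue:
        queue.popleft()
        count += 1
    return count
```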
In `@tensorrt_llm/_torch/pyexecutor/request_utils.py`:
- Around line 100-108: The sorting currently prioritizes relaxed requests
because sorted(new_requests, key=get_relax_value, reverse=True) reverses the key
order; change the sort so non-relaxed requests come first by either removing
reverse=True or inverting the key logic in get_relax_value (e.g., return not
scheduling_params.attention_dp_relax when py_scheduling_params exists); update
the sorted call that references get_relax_value/new_requests accordingly so
behavior matches the comment.
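The suggested ordering fix can be sketched like this. The attribute names mirror the review comment; the request objects are simplified stand-ins:

```python
from types import SimpleNamespace

def get_relax_value(request) -> bool:
    params = getattr(request, "py_scheduling_params", None)
    if params is None:
        return False          # no scheduling params: treat as non-relaxed
    return bool(params.attention_dp_relax)

def order_requests(new_requests):
    # False sorts before True, so dropping reverse=True puts
    # non-relaxed requests first, matching the comment's intent.
    return sorted(new_requests, key=get_relax_value)

relaxed = SimpleNamespace(
    py_scheduling_params=SimpleNamespace(attention_dp_relax=True))
strict = SimpleNamespace(py_scheduling_params=None)
ordered = order_requests([relaxed, strict])  # strict first, relaxed last
```

Because `sorted` is stable, requests with the same relax value keep their arrival order.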
- Around line 542-547: The call to executor_request_to_llm_request is passing
positional args that misalign parameters (exclude_last_generation_logits is
being used as child_req_ids and ctx_blocks_list becomes the exclude flag);
update the call to use keyword args so arguments are explicit—call
executor_request_to_llm_request(req_id=req_id, exe_req=exe_req,
child_req_ids=<appropriate_child_req_ids>,
exclude_last_generation_logits=exclude_last_generation_logits,
ctx_blocks_list=ctx_blocks_list) (replace <appropriate_child_req_ids> with the
actual child request ids variable) so child_req_ids is passed explicitly and
ctx_blocks_list is used for the context blocks.
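A sketch of why the keyword-argument form is safer. The signature below is a simplified stand-in for the real `executor_request_to_llm_request`, used only to show the parameter mapping:

```python
# Hypothetical simplified signature; the real helper differs.
def executor_request_to_llm_request(req_id, exe_req, child_req_ids=None,
                                    exclude_last_generation_logits=False,
                                    ctx_blocks_list=None):
    return {
        "req_id": req_id,
        "exe_req": exe_req,
        "child_req_ids": child_req_ids,
        "exclude_last_generation_logits": exclude_last_generation_logits,
        "ctx_blocks_list": ctx_blocks_list,
    }

# Keyword arguments make each value land on the intended parameter,
# avoiding the positional misalignment flagged in the review.
llm_req = executor_request_to_llm_request(
    req_id=1,
    exe_req="exe_req",
    child_req_ids=[2, 3],
    exclude_last_generation_logits=True,
    ctx_blocks_list=[[0, 1]],
)
```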
- Around line 536-541: The code currently uses a print followed by "assert
False, 'invalid context partition'" which can be disabled under -O; replace this
runtime check with raising a proper exception (e.g., ValueError or RuntimeError)
in the same spot to enforce validation regardless of optimization flags: remove
the assert and raise an error including cp_rank, len(ctx_blocks), and
num_blocks_per_rank to preserve the diagnostic info printed earlier. Locate the
failing check in request_utils.py (the print + assert sequence) and change it to
raise a descriptive exception (e.g., raise ValueError(f"invalid context
partition: rank={cp_rank}, len(ctx_blocks)={len(ctx_blocks)},
num_blocks_per_rank={num_blocks_per_rank}")) so invalid partitions cannot
proceed.
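A sketch of replacing the disable-able `assert` with a real exception, as the comment suggests. The validity condition here is a placeholder; the actual check in `request_utils.py` may differ:

```python
def check_context_partition(cp_rank, ctx_blocks, num_blocks_per_rank):
    # Placeholder validity condition for illustration.
    if len(ctx_blocks) > num_blocks_per_rank:
        # Raise instead of `assert False` so the check survives `python -O`,
        # and keep the diagnostic info that was previously printed.
        raise ValueError(
            f"invalid context partition: rank={cp_rank}, "
            f"len(ctx_blocks)={len(ctx_blocks)}, "
            f"num_blocks_per_rank={num_blocks_per_rank}")

try:
    check_context_partition(0, [1, 2, 3], 2)
    raised = False
except ValueError:
    raised = True
```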
- Around line 1-2: Add the standard NVIDIA copyright header (with 2026 as the
latest modification year) immediately above the module docstring in
tensorrt_llm/_torch/pyexecutor/request_utils.py; place the full multiline header
block before the existing first line (the triple-quoted docstring) so the module
starts with the NVIDIA header followed by the current docstring.
In `@tests/unittest/_torch/executor/test_executor_request_queue.py`:
- Around line 1-8: Add the standard NVIDIA copyright header (updated to 2026)
above the module docstring at the top of the file; insert the multi-line header
comment block before the existing triple-quoted docstring so the file begins
with the NVIDIA header followed by the module docstring that documents
ExecutorRequestQueue tests.
In `@tests/unittest/_torch/executor/test_request_utils.py`:
- Around line 739-741: The test loops use unused loop variables and an ambiguous
Unicode multiplication sign; update the for-loops to use a single underscore for
unused iterators (replace "for rank in range(4):" / "for i in range(5):" with
"for _ in range(...):" or rename the unused variable to "_") and replace the
Unicode "×" character with a plain ASCII identifier or operator (e.g., "x" or
"*" as appropriate) wherever it appears (notably in the blocks referenced around
the loops and the similar occurrence near the other block at lines ~1060-1062)
so Ruff B007 and RUF003 are resolved.
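A minimal sketch of both lint fixes: `_` for unused loop variables (Ruff B007) and ASCII `*` instead of the Unicode multiplication sign (RUF003):

```python
# '_' signals the loop variable is intentionally unused (fixes B007);
# the comment uses ASCII '*' rather than Unicode '×' (fixes RUF003).
total_blocks = 0
for _ in range(4):       # was: for rank in range(4)
    for _ in range(5):   # was: for i in range(5)
        total_blocks += 1  # 4 * 5 = 20 blocks
```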
PR_Github #33678 [ run ] completed with state
Signed-off-by: Lance Liao <108499334+lancelly@users.noreply.github.com>
Signed-off-by: Liao Lanyu <108499334+lancelly@users.noreply.github.com>
…TensorRT-LLM into refactor_request_queue
/bot run --disable-fail-fast

PR_Github #34033 [ run ] triggered by Bot. Commit:

PR_Github #34033 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #34079 [ run ] triggered by Bot. Commit:

PR_Github #34079 [ run ] completed with state
Signed-off-by: Lance Liao <108499334+lancelly@users.noreply.github.com>
/bot run --disable-fail-fast

PR_Github #34213 [ run ] triggered by Bot. Commit:

PR_Github #34213 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #34232 [ run ] triggered by Bot. Commit:

PR_Github #34232 [ run ] completed with state
Signed-off-by: Lance Liao <108499334+lancelly@users.noreply.github.com>
/bot run --disable-fail-fast

PR_Github #34362 [ run ] triggered by Bot. Commit:

PR_Github #34362 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #34375 [ run ] triggered by Bot. Commit:

PR_Github #34375 [ run ] completed with state
Description
Summary
This PR refactors the request fetching and queue management logic, separating responsibilities between `ExecutorRequestQueue` and `PyExecutor`. This is the first step of a larger refactoring effort — in follow-up PRs, `waiting_queue` management will be moved to the new scheduler.

Changes

1. `ExecutorRequestQueue` — Simplified to pure queue operations
   - `request_queue` management only
2. `PyExecutor` — Takes over request processing orchestration
   - `waiting_queue` (previously in `ExecutorRequestQueue`)
   - `_fetch_new_requests` as the unified entry point:
     - `_fetch_and_enqueue_requests`
     - `_select_requests_for_tp` / `_select_requests_for_dp`
     - `_update_new_active_requests_queue_latency`
     - `merge_requests`
3. `request_utils.py` — Extracted pure functions
   - `get_from_waiting_queue`
   - `merge_requests`
   - `attach_py_objects_to_requests`

Next Steps

- Move `waiting_queue` management to the new scheduler
- `schedule_local_active` — Schedule existing active requests
- `fetch_new_requests_to_waiting_queue` — Fetch and enqueue new requests
- `assign_waiting_requests` — Assign requests from waiting queue
- `schedule_new_active_requests` — Schedule newly activated requests

Summary by CodeRabbit
Release Notes
Refactor
New Features
Tests