[None][infra] Add source code pulse scan to PLC nightly pipeline#10961
Conversation
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
📝 WalkthroughWalkthroughThis PR introduces vulnerability scanning and reporting capabilities to a Jenkins pipeline. It adds three new functions for OAuth token retrieval, Poetry lock file generation, and Pulse OSS scanning, alongside a Python script for formatting and submitting vulnerability reports to Slack. The pipeline stages are restructured from a single lock-generation step into discrete Prepare Environment, Generate Lock Files, and Run Pulse Scanning phases with parameterized git operations. Changes
Sequence DiagramsequenceDiagram
participant Jenkins as Jenkins Pipeline
participant Env as Environment
participant Git as Git Repo
participant Pulse as Pulse Scanner
participant Slack as Slack Webhook
Jenkins->>Env: Prepare Environment<br/>(Setup variables)
activate Env
Env-->>Jenkins: Environment ready
deactivate Env
Jenkins->>Git: Generate Lock Files<br/>(Poetry, dependency install)
activate Git
Git-->>Jenkins: Lock files generated
deactivate Git
Jenkins->>Pulse: getPulseToken()
activate Pulse
Pulse-->>Jenkins: OAuth token
deactivate Pulse
Jenkins->>Pulse: pulseScan(token, credentials)
activate Pulse
Pulse->>Pulse: Run OSS vulnerability scan
Pulse-->>Jenkins: nspect_scan_report.json
deactivate Pulse
Jenkins->>Jenkins: submit_vulnerability_report.py
activate Jenkins
Jenkins->>Jenkins: Read & filter by severity
Jenkins->>Jenkins: Format vulnerability entries
deactivate Jenkins
Jenkins->>Slack: POST formatted report
activate Slack
Slack-->>Jenkins: Webhook success
deactivate Slack
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@jenkins/scripts/submit_vulnerability_report.py`:
- Around line 55-58: The requests.post call that sends the Slack webhook
(requests.post(SLACK_WEBHOOK_URL, json=payload)) has no timeout which can hang
the pipeline; update that call to include a reasonable timeout (e.g.,
timeout=10) so the request fails fast if Slack is unresponsive, keeping the
existing resp.raise_for_status() behavior intact and ensuring the
SLACK_WEBHOOK_URL and payload variables are still used.
In `@jenkins/TensorRT_LLM_PLC.groovy`:
- Around line 84-96: The getPulseToken function uses an undeclared variable
AuthHeader and interpolates credentials directly into the shell command, risking
credential exposure; declare AuthHeader (e.g., def AuthHeader) before use, build
the auth header without echoing credentials to the shell (use a heredoc or pass
the client id/secret via environment variables provided by withCredentials and
construct the base64 in a sh command that does not print commands, e.g., disable
shell tracing with set +x or use Jenkins masking), and keep the withCredentials
block (usernamePassword) intact; also validate connectivity to the token
endpoint from the build environment before relying on the call.
- Around line 34-52: Remove the duplicated qosClass entry and ensure the docker
container's resource requests equal its limits so the pod can be Guaranteed;
specifically, delete the second qosClass declaration and update the docker
container's resources.requests.memory to match resources.limits.memory (or set
both memory request and memory limit to the same value), and verify cpu and
ephemeral-storage request/limit parity in the same docker container block (look
for the docker container definition, resources.requests, resources.limits, and
qosClass).
🧹 Nitpick comments (4)
jenkins/scripts/submit_vulnerability_report.py (2)
11-22: Wrap module-level execution in amain()function.Executing argument parsing and file I/O at module level prevents reusability and testability. Consider wrapping in a
main()function with anif __name__ == "__main__":guard. Additionally, file reading lacks error handling for missing or malformed JSON files.Suggested structure
def main(): parser = argparse.ArgumentParser() parser.add_argument("--build-url", required=True, help="Jenkins build URL") args = parser.parse_args() slack_webhook_url = os.environ.get("TRTLLM_PLC_WEBHOOK") if not slack_webhook_url: raise EnvironmentError("Environment variable 'TRTLLM_PLC_WEBHOOK' is not set!") input_file = Path("./nspect_scan_report.json") if not input_file.exists(): raise FileNotFoundError(f"Report file not found: {input_file}") vulnerabilities = json.loads(input_file.read_text()) # ... rest of the logic if __name__ == "__main__": main()
38-39: Use snake_case for variable names.Per coding guidelines, Python local variables should use snake_case.
shortTermVersionandlongTermVersionshould be renamed.Proposed fix
- shortTermVersion = safe(v.get("Upgrade-Guidance", {}).get("Short-Term")) - longTermVersion = safe(v.get("Upgrade-Guidance", {}).get("Long-Term")) + short_term_version = safe(v.get("Upgrade-Guidance", {}).get("Short-Term")) + long_term_version = safe(v.get("Upgrade-Guidance", {}).get("Long-Term")) lines = [ f"🔴 *{safe(v.get('Severity'))}* — *{safe(v.get('Package Name'))}* `{safe(v.get('Package Version'))}`", f"• *CVE:* {safe(v.get('Related Vuln'))} | *BDSA:* {safe(v.get('CVE ID'))}", f"• *Score:* {safe(v.get('Score'))}", f"• *Status:* {safe(v.get('Status'))}", f"• *Published:* {safe(v.get('Vulnerability Published Date'))}", - f"• *Upgrade:* `{shortTermVersion}` → `{longTermVersion}`", + f"• *Upgrade:* `{short_term_version}` → `{long_term_version}`", "─" * 40, # separator line ]jenkins/TensorRT_LLM_PLC.groovy (2)
155-198: Token passed via environment variable may appear in logs.The
PULSE_BEARER_TOKENis set inwithEnv(), which may log its value. Consider usingwithCredentialswith amaskPasswordsapproach or Jenkins' built-in secret masking to prevent token exposure in build logs.Also, the pipeline URL construction at line 188 may break if
JOB_NAMEcontains special characters beyond/. Consider usingjava.net.URLEncoderfor robust encoding.
238-238:cdcommand has no effect in this context.
sh "cd ${env.WORKSPACE}"in a Jenkins pipeline step does not persist—eachshstep runs in its own shell. If the intent is to ensure subsequent steps run in the workspace, this line is a no-op and can be removed. The checkout already places files inWORKSPACE.Proposed fix
stage("Prepare Environment"){ steps { checkoutSource() - sh "cd ${env.WORKSPACE}" } }
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
b8d3f0d to
76524e1
Compare
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
e64fcef to
1a6aa46
Compare
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
|
/bot skip --comment "No need to run CI" |
|
PR_Github #33636 [ skip ] triggered by Bot. Commit: |
|
PR_Github #33636 [ skip ] completed with state |
4524662 to
76980ad
Compare
1e5501d to
1933583
Compare
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
1933583 to
75f74de
Compare
|
/bot skip --comment "no ci needed" |
|
PR_Github #34108 [ skip ] triggered by Bot. Commit: |
|
PR_Github #34108 [ skip ] completed with state |
c76a55a to
75f74de
Compare
…DIA#10961) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
…DIA#10961) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.
Description
Add source code vulnerability scanning in TRTLLM PLC nightly pipeline
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...Provide a user friendly way for developers to interact with a Jenkins server.
Run
/bot [-h|--help]to print this help message.See details below for each supported subcommand.
Details
run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]Launch build/test pipelines. All previously running jobs will be killed.
--reuse-test (optional)pipeline-id(OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.--disable-reuse-test(OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.--disable-fail-fast(OPTIONAL) : Disable fail fast on build/tests/infra failures.--skip-test(OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.--stage-list "A10-PyTorch-1, xxx"(OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.--gpu-type "A30, H100_PCIe"(OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.--test-backend "pytorch, cpp"(OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.--only-multi-gpu-test(OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.--disable-multi-gpu-test(OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.--add-multi-gpu-test(OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.--post-merge(OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx"(OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".--detailed-log(OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.--debug(OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in thestage-listparameter to access the appropriate container environment. Note: Does NOT update GitHub check status.For guidance on mapping tests to stage names, see
docs/source/reference/ci-overview.mdand the
scripts/test_to_stage_mapping.pyhelper.kill
killKill all running builds associated with pull request.
skip
skip --comment COMMENTSkip testing for latest commit on pull request.
--comment "Reason for skipping build/test"is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.reuse-pipeline
reuse-pipelineReuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.