[None][doc] Add Glm4MoeForCausalLM to model support matrix #11156

Merged
juney-nvidia merged 1 commit into NVIDIA:main from venkywonka:venky/add-glm4moe-docs on Jan 31, 2026

Conversation

@venkywonka (Collaborator) commented Jan 30, 2026

Summary

  • Add Glm4MoeForCausalLM (GLM-4.5, GLM-4.6, GLM-4.7) to the supported models table
  • Add feature support matrix entry for Glm4MoeForCausalLM

Test plan

  • Documentation-only change, no functional changes
  • Model authors to verify feature support matrix accuracy during review

Summary by CodeRabbit

Release Notes

  • Documentation
    • Updated supported models documentation to include GLM-4-MoE model with architecture details and feature support information.


@venkywonka venkywonka requested a review from a team as a code owner January 30, 2026 23:34
@venkywonka venkywonka requested review from QiJune and kaiyux January 30, 2026 23:34
@coderabbitai bot (Contributor) commented Jan 30, 2026

📝 Walkthrough

Documentation update adding the new model entry Glm4MoeForCausalLM to the supported models reference. The change introduces the model into two existing tables: the architecture table and the model-feature support matrix.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation Update: `docs/source/models/supported-models.md` | Adds Glm4MoeForCausalLM entry to the Architecture table and Model-Feature Support Matrix, detailing model name, HuggingFace example, and feature support status (Overlaps, Untested, N/A). |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title clearly and specifically describes the main change: adding Glm4MoeForCausalLM to the model support matrix documentation. |
| Description check | ✅ Passed | The PR description follows the template structure with a clear Summary and Test Coverage section explaining the changes and testing approach. |
| Docstring coverage | ✅ Passed | No functions found in the changed files to evaluate; docstring coverage check skipped. |



Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
@venkywonka force-pushed the venky/add-glm4moe-docs branch from 12682fe to ea810ed on January 30, 2026 23:59
@venkywonka (Collaborator, Author) commented Jan 31, 2026

Feature Support Summary for Glm4MoeForCausalLM

Requesting reviewers to validate the following feature support matrix entry:

| Feature | Value | Evidence/Notes |
|---|---|---|
| Overlap Scheduler | Yes | Standard PyTorch backend feature |
| CUDA Graph | Yes | Standard PyTorch backend feature |
| Attention Data Parallelism | Yes | Verified: `enable_attention_dp` handling in `modeling_glm.py` |
| Disaggregated Serving | Untested | Similar architecture to DeepseekV3 but not explicitly verified |
| Chunked Prefill | Yes | Uses standard MHA (not MLA), no restrictions |
| MTP | Yes | Verified: `Glm4MTP` class in `modeling_speculative.py` |
| EAGLE-3 (One Model Engine) | Yes | Supports `MTP_EAGLE_ONE_MODEL` mode (`num_nextn_predict_layers=1`) |
| EAGLE-3 (Two Model Engine) | No | Separate draft model support not implemented |
| Torch Sampler | Yes | Standard feature |
| TLLM C++ Sampler | Yes | Standard feature |
| KV Cache Reuse | Untested | Should work (standard MHA) but not explicitly verified |
| Sliding Window Attention | N/A | Uses YARN RoPE, not sliding window |
| Logits Post Processor | Yes | Standard feature |
| Guided Decoding | Yes | Standard feature |

Please confirm or correct any of the above values. The two "Untested" entries are conservative: I could not find explicit tests or documentation confirming these features for this model.
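For context, the merged change adds rows along these lines to `docs/source/models/supported-models.md`. This is an illustrative sketch reconstructed from the matrix above, not a verbatim copy of the file; the exact column headers and the example HuggingFace checkpoint name (`zai-org/GLM-4.5`) are assumptions:

```markdown
<!-- Architecture table (sketch; headers and checkpoint name assumed) -->
| Architecture          | Models                       | HuggingFace Example |
|-----------------------|------------------------------|---------------------|
| `Glm4MoeForCausalLM`  | GLM-4.5, GLM-4.6, GLM-4.7    | `zai-org/GLM-4.5`   |
```

The feature support matrix row would carry the Yes/No/Untested/N/A values from the table above, one column per feature.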

cc: @jhaotingc @symphonylyh @yuxianq @laikhtewari
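Since the matrix marks MTP as supported with `num_nextn_predict_layers=1`, a minimal sketch of how one might enable it follows, as an extra LLM API options file of the kind passed to `trtllm-serve`. The key names mirror TensorRT-LLM's `speculative_config` convention but should be checked against the current documentation before use:

```yaml
# Hypothetical extra-options file (e.g. passed via --extra_llm_api_options).
# Key names follow TensorRT-LLM's speculative_config convention; verify
# against the docs for your release before relying on them.
speculative_config:
  decoding_type: MTP
  num_nextn_predict_layers: 1  # single draft layer, per the matrix above
```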

@venkywonka

/bot skip --comment "docs only change"

@tensorrt-cicd

PR_Github #34262 [ skip ] triggered by Bot. Commit: ea810ed

@tensorrt-cicd

PR_Github #34262 [ skip ] completed with state SUCCESS. Commit: ea810ed
Skipping testing for commit ea810ed

@juney-nvidia juney-nvidia merged commit 492ed27 into NVIDIA:main Jan 31, 2026
5 checks passed

Labels: None yet
Projects: None yet
3 participants