[None][doc] Add Glm4MoeForCausalLM to model support matrix #11156
juney-nvidia merged 1 commit into NVIDIA:main from
Walkthrough: Documentation update adding a new model entry
Estimated code review effort: 1 (Trivial), ~3 minutes. Pre-merge checks: 3 passed.
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
12682fe to ea810ed
Feature Support Summary for
| Feature | Value | Evidence/Notes |
|---|---|---|
| Overlap Scheduler | Yes | Standard PyTorch backend feature |
| CUDA Graph | Yes | Standard PyTorch backend feature |
| Attention Data Parallelism | Yes | Verified: enable_attention_dp handling in modeling_glm.py |
| Disaggregated Serving | Untested | Similar architecture to DeepseekV3 but not explicitly verified |
| Chunked Prefill | Yes | Uses standard MHA (not MLA), no restrictions |
| MTP | Yes | Verified: Glm4MTP class in modeling_speculative.py |
| EAGLE-3 (One Model Engine) | Yes | Supports MTP_EAGLE_ONE_MODEL mode (num_nextn_predict_layers=1) |
| EAGLE-3 (Two Model Engine) | No | Separate draft model support not implemented |
| Torch Sampler | Yes | Standard feature |
| TLLM C++ Sampler | Yes | Standard feature |
| KV Cache Reuse | Untested | Should work (standard MHA) but not explicitly verified |
| Sliding Window Attention | N/A | Uses YARN RoPE, not sliding window |
| Logits Post Processor | Yes | Standard feature |
| Guided Decoding | Yes | Standard feature |
Please confirm or correct any of the above values. The two "Untested" entries are deliberately conservative: I could not find explicit tests or documentation confirming these features for this model.
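To make the rows needing confirmation easy to pick out, here is a minimal sketch that transcribes the table above as plain data and filters it by status. The names `FEATURE_SUPPORT` and `features_with_status` are hypothetical and exist only for this illustration; they are not part of the library or of this PR.

```python
# Hypothetical transcription of the feature table above as plain data.
# Nothing here is a library API; the names are illustrative only.
FEATURE_SUPPORT = {
    "Overlap Scheduler": "Yes",
    "CUDA Graph": "Yes",
    "Attention Data Parallelism": "Yes",
    "Disaggregated Serving": "Untested",
    "Chunked Prefill": "Yes",
    "MTP": "Yes",
    "EAGLE-3 (One Model Engine)": "Yes",
    "EAGLE-3 (Two Model Engine)": "No",
    "Torch Sampler": "Yes",
    "TLLM C++ Sampler": "Yes",
    "KV Cache Reuse": "Untested",
    "Sliding Window Attention": "N/A",
    "Logits Post Processor": "Yes",
    "Guided Decoding": "Yes",
}


def features_with_status(table, status):
    """Return the feature names whose value equals `status`, in table order."""
    return [name for name, value in table.items() if value == status]


if __name__ == "__main__":
    # List the rows that still need explicit verification.
    for name in features_with_status(FEATURE_SUPPORT, "Untested"):
        print(name)
```

Calling `features_with_status(FEATURE_SUPPORT, "No")` in the same way would pick out the one feature marked as unsupported.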
/bot skip --comment "docs only change"

PR_Github #34262 [ skip ] triggered by Bot. Commit:
PR_Github #34262 [ skip ] completed with state
Summary
- Add Glm4MoeForCausalLM (GLM-4.5, GLM-4.6, GLM-4.7) to the supported models table
- Glm4MoeForCausalLM

Test plan