DistCA

Core Attention Disaggregation for Efficient Long-context Language Model Training

JacobiForcing

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

d3LLM

Ultra-Fast Diffusion LLM 🚀

Dynasor

Making Reasoning Models More Token-Efficient

LMGame Bench

Evaluating LLM Reasoning through Live Computer Games

vLLM-LTR

Efficient LLM Scheduling by Learning to Rank

MuxServe

Serving Multiple LLMs with Flexible Spatial-Temporal Multiplexing

CLLM

Consistency Large Language Models: A Family of Efficient Parallel Decoders

DistServe

Maximizing Goodput in LLM Serving using Prefill-Decode Disaggregation