Projects | Hao AI Lab @ UCSD

DistCA: Core Attention Disaggregation for Efficient Long-context Language Model Training

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

Ultra-Fast Diffusion LLM 🚀

Make Video Generation Faster

Making Reasoning Models More Token-Efficient

Evaluating LLM Reasoning through Live Computer Games

Efficient LLM Scheduling by Learning to Rank

Serving Multiple LLMs with Flexible Spatial-Temporal Multiplexing

Consistency Large Language Models: A Family of Efficient Parallel Decoders

Maximizing Goodput in LLM Serving using Prefill-Decode Disaggregation