Scaling DeepSeek-style MoEs with vLLM and llm-d using Wide EP

Red Hat · Sept. 8, 2025
Summary
Note: The contents of this article were discussed in detail during our biweekly vLLM office hours virtual event: vLLM Office Hours #29: Scaling MoE with llm-d. View the slides here and register for our future vLLM office hours, happening every other Thursday.

Support for the DeepSeek family of models and similar architectures in vLLM marks a significant advancement in the ability to efficiently serve large-scale Mixture of Experts (MoE) language models. These models bring a number of innovations to...