Meta has released Llama 4, the latest generation of its Llama model family, with Day 0 inference support in vLLM. The new models, Llama 4 Scout and Llama 4 Maverick, bring improved multimodal capabilities and use a mixture-of-experts architecture for better compute efficiency and faster inference. They are designed for strong performance in text and image understanding, allowing developers to build sophisticated AI applications. The article covers the technical specifications of these models and provides a guide to getting started with inference using vLLM.