This blog post presents deployment-ready quantized reasoning models built on the DeepSeek-R1-Distill suite, highlighting their state-of-the-art performance, strong accuracy recovery relative to the full-precision large language models (LLMs), and significant gains in inference speed. It details the quantization techniques used, reports benchmarks on key reasoning tasks, and discusses the models' applicability to real-world workloads, emphasizing their open-source availability and ease of use. Ultimately, it positions these models as efficient options for both low-latency and high-throughput deployments without compromising accuracy.