Lights, Camera, Inference! Video Generation as a Service With VLLM-O... Ricardo Noriega & Doug Smith

Name: Lights, Camera, Inference! Video Generation as a Service With VLLM-O... Ricardo Noriega & Doug Smith
Uploaded: 2026-04-20T20:22:24Z
Channel: PyTorch
Description: Lights, Camera, Inference! Video Generation as a Service With VLLM-Omni - Ricardo Noriega, Red Hat & Doug Smith, Red Hat, Inc LLMs made for text generat...

PyTorch · Intermediate · 🎨 Image & Video AI · 2w ago

Skills: AI Video Generation (90%)

Lights, Camera, Inference! Video Generation as a Service With VLLM-Omni - Ricardo Noriega, Red Hat & Doug Smith, Red Hat, Inc

LLMs made for text generation as a service. What does it take to do the same for video? We built an experimental Video Generation as a Service stack using vLLM-Omni and the LTX-2 open-weights video model to explore how far an open, multimodal stack can go toward production use. We'll share what worked, what busted, and what it takes to treat generative video as a first-class workload.

vLLM is known for high-performance autoregressive inference, and vLLM-Omni extends that foundation to multimodal inputs and outputs. We pushed those capabilities further by adding support for LTX-2, extending the OpenAI-compatible API surface, integrating with front ends, and packaging for scalable deployment. We'll walk you through the touch points so you can see how we put all the Legos together with vLLM-Omni.

Finally, we'll examine the gap between novelty demos and real applications, going from quirky spaghetti-eating videos to consistent characters, personalized media, customized video game cutscenes, and interactive storytelling, and we'll highlight what's still missing to make generative video truly production-ready.
