Learn vLLM: Troubleshooting Deepseek R1 8B GPU OOM on single L4 GPU

Samos123 · Beginner · 🧠 Large Language Models · 1y ago
Learn vLLM by troubleshooting a GPU out-of-memory (OOM) error with DeepSeek R1 8B on a single NVIDIA L4 and working out which vLLM engine arguments to tweak to resolve it. vLLM was deployed on Kubernetes using KubeAI: https://github.com/substratusai/kubeai
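As a rough sketch of the kind of tuning the video walks through: an L4 has 24 GB of VRAM, so an 8B model in 16-bit weights (~16 GB) leaves little room for the KV cache, and vLLM can fail at startup with an OOM error. The flags below are real vLLM engine arguments; the specific values are illustrative assumptions, not the exact settings used in the video.

```shell
# Serve the model with a reduced context length so the KV cache
# fits in the VRAM left over after loading the weights.
vllm serve deepseek-ai/DeepSeek-R1-Distill-Llama-8B \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.95 \
  --enforce-eager
```

Lowering `--max-model-len` shrinks the KV-cache reservation, raising `--gpu-memory-utilization` lets vLLM claim more of the card, and `--enforce-eager` skips CUDA graph capture to save additional memory at some throughput cost.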