Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs

📰 Hacker News · philipkiely

175 comments, 247 points on Hacker News.

Published 7 Aug 2025