Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs
📰 Hacker News · philipkiely
Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs. 175 comments, 247 points on Hacker News.