OpenAI has developed a pair of new open-weight language models designed to run locally on a single GPU or on consumer hardware. In a blog post, OpenAI announced "gpt-oss-120b" and "gpt-oss-20b": the former is designed to run on a single 80GB GPU, while the latter is optimized for consumer GPUs and edge devices with just 16GB of memory.
Both models are Transformers built on a mixture-of-experts (MoE) architecture, an approach popularized by models such as DeepSeek R1. Despite being designed for single-GPU and consumer hardware, both support context lengths of up to 131,072 tokens, among the longest available for local inference. gpt-oss-120b activates 5.1 billion parameters per token, while gpt-oss-20b activates 3.6 billion. Both models use alternating dense and locally banded sparse attention patterns, along with grouped multi-query attention with a group size of 8.
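To make "grouped multi-query attention with a group size of 8" concrete, here is a minimal NumPy sketch of the idea: every group of eight query heads shares a single key/value head, which shrinks the key/value cache that must be kept around for long contexts. The head counts, dimensions, and weights below are illustrative placeholders, not the actual gpt-oss configuration.

```python
# Minimal sketch of grouped-query attention with a group size of 8.
# Every 8 query heads share one key/value head (illustrative sizes only).
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads=16, group_size=8):
    """x: (seq_len, d_model); wq/wk/wv: projection matrices."""
    seq_len, d_model = x.shape
    n_kv_heads = n_q_heads // group_size          # 16 query heads -> 2 KV heads
    head_dim = d_model // n_q_heads

    # Project the input and split it into heads.
    q = (x @ wq).reshape(seq_len, n_q_heads, head_dim)
    k = (x @ wk).reshape(seq_len, n_kv_heads, head_dim)
    v = (x @ wv).reshape(seq_len, n_kv_heads, head_dim)

    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group_size                      # query head h reads shared KV head kv
        scores = q[:, h] @ k[:, kv].T / np.sqrt(head_dim)
        # Causal mask: each token attends only to itself and earlier tokens.
        mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(mask, -1e9, scores)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h] = weights @ v[:, kv]
    return out.reshape(seq_len, d_model)

# Toy usage with random weights: 16 query heads share just 2 KV heads.
rng = np.random.default_rng(0)
d_model, seq_len = 128, 10
head_dim = d_model // 16
x  = rng.standard_normal((seq_len, d_model))
wq = 0.02 * rng.standard_normal((d_model, d_model))
wk = 0.02 * rng.standard_normal((d_model, 2 * head_dim))
wv = 0.02 * rng.standard_normal((d_model, 2 * head_dim))
print(grouped_query_attention(x, wq, wk, wv).shape)  # (10, 128)
```

The point of the grouping is memory: with only two key/value heads instead of sixteen, the cache that grows with context length is roughly one-eighth the size, which is what makes 131,072-token contexts practical on a single GPU.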
Both models use Chain-of-Thought reasoning and are tuned for a mix of reasoning ability, efficiency, and real-world usability. They are also the first open-weight language models OpenAI has released since GPT-2. Open-weight models are similar to open-source software in that they are easier for developers to access and adapt. OpenAI opted to make its two latest models open-weight to boost adoption in emerging markets and other sectors that might lack the resources to adopt its proprietary models.
According to OpenAI, the gpt-oss-120b model achieves near-parity with its o4-mini language model on core reasoning benchmarks while running on a single 80GB GPU. gpt-oss-20b delivers performance similar to OpenAI's o3-mini while being capable of running on devices with just 16GB of memory.
In evaluations OpenAI conducted, gpt-oss-120b outperformed o3-mini and matched or exceeded o4-mini in competition coding, general problem solving, and tool calling. It also outperformed o4-mini on health-related queries and competition mathematics. gpt-oss-20b matched or exceeded o3-mini on the same benchmarks, despite its much smaller size.
The two new OpenAI models are available to use now under the Apache 2.0 open-source license. OpenAI has partnered with a plethora of companies to support its latest models on a variety of platforms, including ONNX Runtime, Azure, AWS, and Ollama.
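For readers who want to try the smaller model locally, below is a minimal Python sketch of querying it through Ollama's local REST API (which listens on http://localhost:11434 by default). The "gpt-oss:20b" model tag and the prompt are assumptions for illustration; check Ollama's model library for the exact name and pull the model before running this.

```python
# Minimal sketch: send a prompt to a locally served gpt-oss model via Ollama's
# REST API. Assumes Ollama is running and the model has already been pulled;
# the "gpt-oss:20b" tag is an assumption, not a confirmed identifier.
import json
import urllib.request

payload = {
    "model": "gpt-oss:20b",        # assumed tag for the 16GB-class model
    "prompt": "Explain mixture-of-experts routing in two sentences.",
    "stream": False,               # return one JSON object instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

Because the weights ship under Apache 2.0, the same local endpoint can be wired into existing tools without usage restrictions beyond the license terms.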