
Shelly Palmer: OpenAI releases two open-weight models


The wait is over for OpenAI’s open‑weight models. For the first time since GPT‑2, OpenAI is giving developers access to the raw model weights. No API, no cloud dependency, no rate limits, and no vendor lock-in. You download them. You run them. You own the infrastructure. After five years of locking things behind an endpoint, OpenAI has released gpt‑oss‑120b and gpt‑oss‑20b under an Apache 2.0 license.

You rack these up yourself. OpenAI says they’re designed for on‑prem or edge environments where regulatory, privacy, or latency constraints make API calls impractical. If you’re building agentic workflows that can’t afford to be throttled, watched, or audited, these models let you avoid any "imperial entanglements."

They’re not multimodal (they don’t handle images or audio), but OpenAI says they can perform long‑form reasoning, generate chain‑of‑thought (CoT) outputs, and produce structured responses suited to agentic workflows such as tool calling, web browsing, and Python execution.
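To make "structured responses" concrete, here’s a generic Python illustration of the kind of tool call an agent harness might parse out of a model’s output. The field names are hypothetical and purely illustrative; the models’ actual response format is documented in the model card.

    # A generic, hypothetical tool-call structure an agent loop might parse
    # from a model response. Field names are illustrative, not gpt-oss's
    # actual output format.
    import json

    raw_response = '{"tool": "python", "arguments": {"code": "print(2 + 2)"}}'
    call = json.loads(raw_response)

    if call.get("tool") == "python":
        # A real harness would dispatch this to a sandboxed executor, then
        # feed the result back to the model as the next turn.
        print("would execute:", call["arguments"]["code"])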

OpenAI says gpt‑oss‑120b has about 117 billion parameters (5.1 billion active) and runs on a single 80 GB GPU; it reportedly performs similarly to o4‑mini on reasoning tasks. The smaller model, gpt‑oss‑20b, has 21 billion parameters (3.6 billion active) and is optimized to run on systems with as little as 16 GB of memory, which makes it usable on laptops. OpenAI says it performs close to o3‑mini. Both models were quantized using MXFP4, a format OpenAI claims balances size and performance.

Neither model is available via ChatGPT or OpenAI’s API. To use them, you’ll need to host them yourself or work through a supported third party like AWS, which is now offering both models via Amazon Bedrock and SageMaker JumpStart. According to OpenAI, the company does not collect user data from locally run models.
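If you go the self‑hosted route, getting a first response out of the smaller model can be a few lines of Python. Here’s a minimal sketch using Hugging Face’s transformers text‑generation pipeline; it assumes a recent transformers release that supports these checkpoints, plus torch and accelerate, and hardware in line with OpenAI’s memory guidance.

    # A minimal sketch of running gpt-oss-20b locally via the Hugging Face
    # transformers chat-style text-generation pipeline. Assumes transformers,
    # torch, and accelerate are installed and that your transformers release
    # supports these checkpoints.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="openai/gpt-oss-20b",  # repo ID on Hugging Face
        torch_dtype="auto",          # use the dtype the checkpoint ships with
        device_map="auto",           # spread layers across available GPU/CPU
    )

    messages = [
        {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."},
    ]
    result = generator(messages, max_new_tokens=256)

    # With chat-style input, generated_text is the conversation including the
    # new assistant turn; the last entry holds the model's reply.
    print(result[0]["generated_text"][-1]["content"])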

As for safety, OpenAI says its internal Safety Advisory Group tested fine‑tuned versions of the 120b model for biological and cyber misuse and found they did not meet the company’s “High” capability thresholds. The models also fell short of the Preparedness Framework thresholds for self‑improvement and for risk‑sensitive domains like chemistry or cybersecurity. In other words, OpenAI tried its best to make the models do terrible things and failed: no cyber apocalypse, no DIY pandemic kits, no self-aware Skynet clones. So the company shipped them.

Importantly, the models output raw CoTs, which OpenAI warns may contain hallucinated or unintended text. Developers are advised to filter that reasoning out before presenting responses to end users.
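What that filter looks like depends on the output format your serving stack produces, but the shape of it is simple. Here’s a hedged sketch; the "<|final|>" delimiter is a hypothetical placeholder, not the models’ actual markup, so check the model card for the real structure.

    # A hedged sketch of stripping chain-of-thought before display. The
    # "<|final|>" delimiter is hypothetical; consult the gpt-oss model card
    # for the actual raw-output structure.
    def strip_cot(raw_output: str, delimiter: str = "<|final|>") -> str:
        """Return only the text after the assumed final-answer delimiter,
        falling back to the full output if the delimiter is absent."""
        _, sep, final = raw_output.partition(delimiter)
        return final.strip() if sep else raw_output.strip()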

I started kicking the tires late last night and will report back after we put them to work in a business environment. If you want to try them, the model card is available here, and the models themselves are on Hugging Face.

As always, your thoughts and comments are both welcome and encouraged. -s

 

Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications and CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. Named LinkedIn’s “Top Voice in Technology,” he covers tech and business for Good Day New York, is a regular commentator on CNN and writes a popular daily business blog. He's a bestselling author, and the creator of the popular, free online course, Generative AI for Execs. Follow @shellypalmer or visit shellypalmer.com
