An OpenAI open model vocabulary guide for developers and researchers
Open-Weight Models
Refers to models that publicly release their internal parameters (weights). Developers can freely download, modify, share, and use these models, fostering technological transparency and innovation.
Application in this Document:
gpt-oss-120b and gpt-oss-20b are two open-weight models; anyone can download and inspect their parameters.
Inference
The process of using a pre-trained model to process new, unseen data and make predictions or generate content. It's like a student who has finished learning and is now using that knowledge to answer exam questions.
Application in this Document:
The GPT-OSS models are designed to provide powerful inference capabilities, meaning they can efficiently understand and respond to new user requests.
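To make this concrete, here is a minimal inference sketch using the Hugging Face transformers library (the repo id "openai/gpt-oss-20b" and the device settings are assumptions, not taken from the document):

# Inference: a single forward pass through pre-trained weights; no training happens here.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")  # repo id is an assumption
model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", device_map="auto")

prompt = "Explain mixture-of-experts in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)  # the "exam answer"
print(tokenizer.decode(outputs[0], skip_special_tokens=True))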
Model Card
A document that provides detailed information about a machine learning model, including its architecture, training data, performance evaluation, intended uses, and potential risks and limitations. It's like a product's "instruction manual."
Application in this Document:
The document you are reading is itself the model card for the GPT-OSS models.
Mixture of Experts
A neural network architecture. Instead of a single large model handling all tasks, it consists of multiple smaller "expert" networks and a "router." The router selects the most appropriate experts to handle an input, which significantly improves efficiency by activating only a fraction of the parameters.
Application in this Document:
The GPT-OSS models use an MoE architecture. For example, the 120b model has 128 experts, and only 4 are activated for each token.
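A toy implementation of the router idea, in the spirit of the 128-experts / 4-active configuration (the sizes and routing rule below are simplified assumptions, not the GPT-OSS code):

import torch
from torch import nn

d_model, num_experts, k = 16, 8, 2        # toy sizes; gpt-oss-120b uses 128 experts, 4 active
router = nn.Linear(d_model, num_experts)  # scores each expert for each token
experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_experts))

def moe_forward(x):
    weights, idx = router(x).topk(k, dim=-1)   # keep only the k best experts per token
    weights = weights.softmax(dim=-1)
    out = torch.zeros_like(x)
    for t in range(x.size(0)):                 # only k of the experts run per token
        for w, e in zip(weights[t], idx[t]):
            out[t] += w * experts[int(e)](x[t])
    return out

print(moe_forward(torch.randn(5, d_model)).shape)  # (5, 16)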
Quantization
A technique for model compression. It reduces the model's size and memory footprint by lowering the numerical precision of its parameters (weights). This is like representing a number with fewer digits, such as simplifying 3.1415926 to 3.14, which saves storage space.
Application in this Document:
The MoE weights in the GPT-OSS models are quantized, allowing them to run on consumer-grade GPUs.
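The digit analogy maps directly to code. A minimal sketch of symmetric 8-bit quantization (the document does not specify the format, and GPT-OSS's actual scheme may differ):

import numpy as np

def quantize_int8(w):
    # Store each weight as 1 byte instead of 4, plus one shared scale factor.
    scale = np.abs(w).max() / 127.0
    return np.round(w / scale).astype(np.int8), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale       # approximate reconstruction

w = np.random.randn(256).astype(np.float32)
q, s = quantize_int8(w)
print(w.nbytes, q.nbytes)                     # 1024 bytes vs 256 bytes
print(np.abs(w - dequantize(q, s)).max())     # small rounding error, like 3.1415926 -> 3.14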
Grouped-Query Attention
An optimized version of the attention mechanism. In standard multi-head attention (MHA), each "query" head has its own "key" and "value" heads, which is computationally expensive. GQA allows multiple query heads to share a single key/value head, significantly reducing computation and memory requirements while retaining most of the performance.
Application in this Document:
The GPT-OSS models use GQA to improve the efficiency of attention calculations.
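The shape arithmetic below shows the sharing trick (the head counts are made-up assumptions, not the GPT-OSS configuration):

import torch

tokens, d_head = 10, 64
n_q_heads, n_kv_heads = 8, 2                 # 8 query heads share 2 key/value heads
group = n_q_heads // n_kv_heads              # 4 query heads per KV head

q = torch.randn(n_q_heads, tokens, d_head)
k = torch.randn(n_kv_heads, tokens, d_head)  # 4x smaller KV cache than MHA
v = torch.randn(n_kv_heads, tokens, d_head)

# Expand the KV heads at compute time so every query head has a partner.
k = k.repeat_interleave(group, dim=0)
v = v.repeat_interleave(group, dim=0)

attn = (q @ k.transpose(-2, -1) / d_head**0.5).softmax(dim=-1)
print((attn @ v).shape)                      # (8, 10, 64): same output shape as full MHA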
Fine-tuning
The process of taking a model that has been pre-trained on a large dataset and training it further on a smaller, task-specific dataset. It's like a generalist college graduate receiving specialized job training for a specific position.
Application in this Document:
The document mentions that attackers might fine-tune the model to bypass safety restrictions.
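Mechanically, fine-tuning is the ordinary training loop run on pre-trained weights with a small dataset. A toy sketch (the linear layer is a stand-in for a real pre-trained model):

import torch
from torch import nn

model = nn.Linear(4, 2)                      # pretend these weights are pre-trained
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)   # typically a small learning rate
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 4)                       # small, task-specific dataset
y = torch.randint(0, 2, (32,))

for _ in range(10):                          # a few passes is often enough
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

The same mechanics are what make the safety concern possible: nothing in the loop distinguishes benign task data from data chosen to undo safety training.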
Jailbreak
Refers to the act of designing clever, adversarial prompts to bypass an AI model's safety and content restrictions, causing it to generate content it is not supposed to (e.g., harmful advice).
Application in this Document:
OpenAI evaluated the GPT-OSS models to test their robustness against jailbreaking and found their performance to be comparable to OpenAI o4-mini.
Hallucination
Refers to when a language model generates information that seems plausible but is factually incorrect, unsubstantiated, or irrelevant to the context (for example, confidently answering "5" to "2+2=?"). It's like the model is "confidently spouting nonsense."
Application in this Document:
Due to their smaller scale, the GPT-OSS models are more prone to hallucination than larger, frontier models.
Chain-of-Thought
A technique that prompts an AI model to articulate its "thinking" or reasoning steps before providing a final answer. This makes the model's response more transparent and interpretable, and often leads to more accurate results.
Application in this Document:
The GPT-OSS models provide a complete chain-of-thought, but the document warns that these chains may themselves contain hallucinated content.
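A prompt-level illustration (the wording is a common generic pattern, not a GPT-OSS-specific format):

# A direct prompt vs. a chain-of-thought prompt (illustrative strings only).
direct = "Q: A train travels 60 km in 1.5 hours. What is its speed? A:"

cot = (
    "Q: A train travels 60 km in 1.5 hours. What is its speed?\n"
    "Let's think step by step:\n"
    "1. Speed = distance / time.\n"
    "2. 60 km / 1.5 h = 40 km/h.\n"
    "A: 40 km/h"
)
# Each intermediate step can be checked, which aids interpretability;
# it also means a chain-of-thought can itself contain hallucinated steps.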
Tool Use
Refers to the model's ability to not only generate text but also call external tools (like a code interpreter, search engine, or calculator) to complete tasks. This greatly expands the model's capabilities, allowing it to access real-time information or perform complex calculations.
Application in this Document:
The models are trained to use a browser tool and a Python tool to enhance their problem-solving abilities.
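A minimal sketch of the loop (the JSON message format here is a made-up simplification; real tool-calling protocols are richer):

import json

def calculator(expression):
    # Toy tool: arithmetic only, with builtins disabled for safety.
    return eval(expression, {"__builtins__": {}})

TOOLS = {"calculator": calculator}

# Pretend the model emitted this instead of plain text:
model_output = '{"tool": "calculator", "arguments": "1234 * 5678"}'

call = json.loads(model_output)
result = TOOLS[call["tool"]](call["arguments"])   # the tool runs outside the model
print(result)                                     # 7006652, fed back to the model as context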
Benchmark
A standardized set of tests or tasks used to measure and compare the performance of different AI models. It's like using the same practice exam to evaluate the knowledge levels of different students.
Application in this Document:
GPT-OSS was evaluated on several industry-standard benchmarks (such as MMLU and SWE-bench), and its scores are published.
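Benchmark scoring in miniature (the questions are invented for illustration; real benchmarks like MMLU contain thousands of items):

benchmark = [("2+2=?", "4"), ("Capital of France?", "Paris")]

def evaluate(model_fn):
    # Same fixed test for every model, so the scores are directly comparable.
    correct = sum(model_fn(q).strip() == a for q, a in benchmark)
    return correct / len(benchmark)

print(evaluate(lambda q: "4"))   # a "model" that always answers "4" scores 0.5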