OpenAI releases teen safety policy pack for open-weight models; provides developers with prompt-based safeguards

Teen Safety Policies for Open-Weight Models

OpenAI is releasing a collection of prompt-based safety policies to help developers operationalize teen-specific protections in AI systems. These policies are designed to work with OpenAI's open-weight safety model, gpt-oss-safeguard, and can be directly integrated into content filtering and moderation workflows.

The initial policy release covers six critical risk areas:

Graphic violent content
Graphic sexual content
Harmful body ideals and behaviors
Dangerous activities and challenges
Romantic or violent roleplay
Age-restricted goods and services

Addressing Developer Challenges

One of the biggest obstacles developers face is translating high-level safety requirements into precise, operational rules. Even experienced teams struggle to define policies that accurately capture teen-specific risks while avoiding inconsistent enforcement or overly broad filtering. These prompt-based policies address that gap by providing clear, tested foundations that developers can adapt to their specific use cases.

Developed with External Expertise

The policies were developed in collaboration with organizations including Common Sense Media and everyone.ai, incorporating research on teens' developmental differences and unique vulnerabilities. This external input helped shape the scope of coverage and strengthen the policy structure.

Open Source and Iterative

Released as open source through the ROOST Model Community on GitHub, these policies are positioned as a starting point rather than a comprehensive solution. Developers are encouraged to adapt, extend, and contribute improvements based on their specific product contexts and user needs. OpenAI emphasizes that these policies should be combined with additional safeguards including product design choices, user controls, and transparent communications.

Part of Broader Youth Safety Efforts

This release builds on OpenAI's existing teen protection work, including updates to the Model Spec with Under-18 principles, introduction of parental controls in ChatGPT, age prediction features, and the Teen Safety Blueprint. The policies represent a commitment to democratizing safety tools across the open-weights ecosystem.

Teen Safety Policies for Open-Weight Models

Addressing Developer Challenges

Developed with External Expertise

Open Source and Iterative

Part of Broader Youth Safety Efforts

Products

Tags

Published

Source

Related News