Overview
Discord is open-sourcing Osprey, its rule engine for trust and safety operations, in collaboration with ROOST and the internet.dev team. Osprey lets platforms investigate user activity in real time and deploy dynamic rules against emerging threats, reducing the engineering overhead typically required to build safety infrastructure from scratch.
How Osprey Works
Osprey processes Actions (events in JSON format) through a series of configurable Rules written in SML (Some Made-up Language), a Python-inspired rule syntax designed to be accessible to non-technical safety teams. The system leverages several core components:
- Rules: Written in SML, rules can reference other rules and external data sources. Example: detecting known spammers based on email patterns and applying automatic labels.
- UDFs (User Defined Functions): Python functions that extend Osprey's capabilities, enabling integrations with external services like spam detection models.
- Features & Entities: Named variables extracted from actions and stored for later analysis. Entities are persistent units (users, servers, emails) that can have effects applied to them.
- Effects: Operations triggered when rules evaluate to true, such as labeling an entity as a spammer or blocking activity. (These are distinct from Actions, which are the incoming events.)
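The way these components fit together can be sketched in plain Python. This is only an illustration of the concepts, not Osprey's API and not SML: the names `Entity`, `email_domain`, `SPAM_DOMAINS`, and `evaluate`, the action fields `user_id` and `email`, and the effect tuple layout are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A persistent unit (user, server, email) that effects can be applied to."""
    kind: str
    id: str
    labels: set = field(default_factory=set)


def email_domain(email: str) -> str:
    """Hypothetical UDF: return the lowercased domain part of an email address."""
    return email.rsplit("@", 1)[-1].lower()


# Hypothetical external data source: domains already tied to spam.
SPAM_DOMAINS = {"spam.example"}


def evaluate(action_json: str, entities: dict) -> list:
    """Run one rule against one action and return the effects it triggered."""
    action = json.loads(action_json)  # Actions arrive as JSON events

    # Features: named values extracted from the action.
    user_id = action["user_id"]
    domain = email_domain(action["email"])

    # Rule: a boolean expression over features and external data.
    known_spammer_email = domain in SPAM_DOMAINS

    # Effect: applied to an entity only when the rule evaluates to true.
    effects = []
    if known_spammer_email:
        user = entities.setdefault(("user", user_id), Entity("user", user_id))
        user.labels.add("spammer")
        effects.append(("label_add", "user", user_id, "spammer"))
    return effects
```

In this sketch, calling `evaluate` with an action whose email ends in `@spam.example` adds the `spammer` label to the matching user entity, while any other action returns an empty list of effects.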
Key Capabilities
Osprey is designed to handle the scale and speed requirements of large platforms:
- Real-time processing: Handles thousands of events per second via synchronous gRPC or asynchronous message queues
- Rapid rule deployment: New rules take effect in minutes, enabling quick response to emerging threats
- Full transparency: All rule executions and verdicts are logged and indexed in Apache Druid for investigation and debugging
- Extensibility: Support for custom UDFs and validation logic allows teams to adapt to new attack patterns
- Continuous feedback loop: Detection insights feed back into rule improvement
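The transparency point above depends on every evaluation leaving a record that can be queried later. The following is a rough sketch of that idea under stated assumptions: the function `evaluate_with_trace`, the record fields (`timestamp`, `action_name`, `verdicts`), and the JSON-line output are hypothetical, and it does not reflect Osprey's actual log schema or its Druid ingestion.

```python
import json
import time


def evaluate_with_trace(action: dict, rules: dict, now=time.time):
    """Evaluate every rule and return (verdicts, serialized execution record).

    `rules` maps a rule name to a callable that takes the action and returns
    a truthy or falsy value. `now` is injectable so records can be reproduced.
    """
    # Record a verdict for every rule, including those that evaluate to false,
    # so an investigator can see why nothing fired.
    verdicts = {name: bool(rule(action)) for name, rule in rules.items()}

    record = {
        "timestamp": now(),
        "action_name": action.get("name"),
        "verdicts": verdicts,
    }
    # One JSON object per evaluation; a line like this could be shipped to
    # whatever analytics store indexes the execution history.
    return verdicts, json.dumps(record, sort_keys=True)
```

Keeping false verdicts in the record is a deliberate choice in this sketch: debugging a missed detection usually requires knowing which rules were evaluated and did not match, not just which ones did.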
Getting Started
The engine is available on GitHub at github.com/roostorg/osprey. Teams can use Osprey to build stronger safety measures by defining custom rules, integrating external data sources, and analyzing execution outputs through the investigation UI.