OpenAI
OpenAI releases GABRIEL, an open-source toolkit for quantifying qualitative research data

What is GABRIEL?

GABRIEL is an open-source Python toolkit designed to help economists, social scientists, and data scientists analyze qualitative data systematically. It uses GPT to turn unstructured text and images into quantitative measurements, addressing a major bottleneck in social science research: the slow, manual work of converting rich qualitative data into a form that can be analyzed quantitatively.

How it works

Researchers describe what they want to measure using everyday language (e.g., "how family-friendly is this job listing?"), and GABRIEL automatically applies that same question across thousands or millions of documents, returning scores for each one. This shift from manual labeling to automated, consistent measurement lets researchers focus on higher-value work: choosing what to measure, validating results, and drawing conclusions.
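
The workflow described above, one plain-language question applied uniformly to every document with a score returned for each, can be sketched in a few lines of Python. This is only an illustration of the pattern, not GABRIEL's actual API: the names `rate_documents` and `keyword_scorer`, the 0-100 scale, and the keyword list are all assumptions made for the example, and a simple keyword counter stands in for the GPT call so that the snippet runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical keyword list used by the stand-in scorer below.
FAMILY_TERMS = ("parental leave", "flexible hours", "childcare", "remote")


def keyword_scorer(attribute: str, text: str) -> float:
    """Toy stand-in for a model call (assumption, not GABRIEL's method).

    Returns the share of family-related terms found in the text,
    scaled to 0-100. The attribute argument is unused here; a real
    scorer would pass it to a language model along with the text.
    """
    lowered = text.lower()
    hits = sum(term in lowered for term in FAMILY_TERMS)
    return 100.0 * hits / len(FAMILY_TERMS)


def rate_documents(
    attribute: str,
    documents: List[str],
    scorer: Callable[[str, str], float],
) -> List[Dict[str, object]]:
    """Apply the same question to every document and collect one score each."""
    return [
        {"text": doc, "attribute": attribute, "score": scorer(attribute, doc)}
        for doc in documents
    ]


listings = [
    "Flexible hours, remote work, and on-site childcare.",
    "Must be available nights and weekends; frequent travel.",
]

ratings = rate_documents(
    "how family-friendly is this job listing?", listings, keyword_scorer
)

for row in ratings:
    print(f"{row['score']:5.1f}  {row['text']}")
```

Because the scorer is passed in as a function, the loop that applies the question stays the same whether the score comes from a keyword count, as here, or from a model. That separation is what makes the measurement consistent across thousands of documents.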

Use cases and capabilities

GABRIEL can analyze scientific papers to track methodological trends over time, measure how course curricula allocate attention to different subjects, extract historical details from town records across Europe, and identify patterns in customer reviews. Beyond measurement, the toolkit provides practical research tools including dataset merging (even with mismatched columns), deduplication, passage coding, theory generation assistance, and text deidentification to preserve privacy.

Availability and next steps

The toolkit is now available as an open-source Python library on GitHub with a tutorial notebook. It's designed to require minimal technical background. Researchers can get started immediately and provide feedback to drive future improvements. The accompanying research paper benchmarks GPT's accuracy at labeling qualitative data across diverse use cases, demonstrating its effectiveness for research applications.