Expanding Global Benchmarking for Legal AI
Harvey has introduced BigLaw Bench: Global (BLB: Global), an extension of its existing BigLaw Bench framework that measures how well AI models perform on core legal tasks across different jurisdictions. This new dataset represents a significant expansion of Harvey's benchmarking capabilities, more than doubling the size of the public-facing BigLaw Bench.
Built for Global Legal Practice
BLB: Global was developed in collaboration with Mercor, a leading expert data company, to capture the nuances of legal work across different jurisdictions. The initial release focuses on three markets:
- United Kingdom
- Australia
- Spain
The framework is designed to evaluate whether foundation models can understand and execute core legal workflows—such as drafting, document analysis, and legal research—with accuracy and consistency while respecting local legal norms and practices.
Addressing Localization Challenges
While leading AI models now achieve around 90% performance on the original BigLaw Bench tasks, Harvey's research identified a critical gap: model performance often degrades when applied to localized legal contexts. BLB: Global addresses this by benchmarking six major task categories that customers depend on, including:
- Drafting: Production of legal documents, memos, and contracts
- Additional core workflows, such as document analysis and legal research
Implications for Legal Teams
This benchmark extension enables Harvey to measure and improve how well AI models localize across jurisdictions, helping ensure that customers worldwide can rely on accurate, consistent AI assistance aligned with their local legal requirements. The framework also provides a foundation for expanding coverage to additional jurisdictions over time.