Cloudflare adds approximate aggregation functions to R2 SQL for faster large-dataset queries

New Approximate Aggregation Functions

Cloudflare has expanded R2 SQL's query capabilities with five new approximate aggregation functions designed to speed up analysis of large datasets. These functions use probabilistic algorithms to deliver fast results while accepting minimal precision trade-offs—ideal for exploratory analysis and dashboards where exact values aren't critical.

Available Functions

APPROX_PERCENTILE_CONT — Calculate percentiles on numeric data (integer and decimal columns)
APPROX_PERCENTILE_CONT_WITH_WEIGHT — Weighted percentile calculations where rows contribute based on a weight column
APPROX_MEDIAN — Quick median calculation (shorthand for the 0.5 percentile)
APPROX_DISTINCT — Estimate unique value counts across any column type
APPROX_TOP_K — Identify the k most frequent values with counts as JSON

Flexible Filtering and Grouping

All functions support WHERE clause filtering for conditional analysis. Most support GROUP BY operations, enabling per-group analysis (e.g., median revenue by department). You can also combine approximate and standard aggregations in a single query for hybrid analysis workflows.

Use Cases

These functions are particularly valuable for analyzing large tables where cardinality is high—think estimating unique customers across regions, calculating percentile-based metrics on revenue data, or finding top-k patterns in user behavior. The performance improvements make them suitable for interactive queries and real-time dashboards that don't require exact precision.

New Approximate Aggregation Functions

Available Functions

Flexible Filtering and Grouping

Use Cases

Products

Tags

Published

Source

Related News