Weaviate 1.36 introduces HFresh disk-based vector index for billion-scale deployments
release · feature · performance · api · weaviate.io

HFresh Vector Index in Technical Preview

Weaviate v1.36 introduces HFresh, a new disk-based vector index inspired by the SPFresh algorithm and designed to scale to billions of vectors while keeping memory usage minimal. Unlike HNSW, which requires all vectors to be held in memory, HFresh divides vectors into small regions called postings and stores them on disk in an LSM store. Only a compact centroid index remains in memory.

HFresh uses a two-stage search approach. First, an in-memory HNSW index over the centroids identifies the relevant vector regions. Then the corresponding postings are fetched from disk and searched in detail. This architecture delivers significantly lower memory usage while maintaining predictable latency and I/O characteristics, which makes it well suited to cost-sensitive, large-scale deployments.
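The two-stage search can be illustrated with a small toy sketch. This is not Weaviate's implementation: the class and method names are hypothetical, the centroid lookup uses brute force instead of an HNSW index, and a plain dictionary stands in for the on-disk LSM store. It only shows the shape of the idea, in which a `search_probe` parameter controls how many postings are read per query.

```python
import heapq


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyPostingIndex:
    """Toy two-stage index: centroids in memory, postings fetched on demand.

    Hypothetical names throughout; this is a sketch, not Weaviate's API.
    """

    def __init__(self, centroids, postings):
        self.centroids = centroids  # list of centroid vectors (kept in memory)
        self.postings = postings    # centroid id -> [(vector_id, vector)]; stand-in for disk

    def search(self, query, k, search_probe):
        # Stage 1: choose the search_probe nearest centroids.
        # (Brute force here; the source describes an HNSW index for this step.)
        nearest = heapq.nsmallest(
            search_probe,
            range(len(self.centroids)),
            key=lambda i: l2_squared(query, self.centroids[i]),
        )
        # Stage 2: scan only the selected postings. The dict removes duplicates
        # in case the same vector id appears in more than one posting.
        candidates = {}
        for centroid_id in nearest:
            for vector_id, vector in self.postings[centroid_id]:
                candidates[vector_id] = l2_squared(query, vector)
        return heapq.nsmallest(k, candidates.items(), key=lambda item: item[1])


index = ToyPostingIndex(
    centroids=[[0.0, 0.0], [10.0, 10.0]],
    postings={
        0: [("a", [0.0, 1.0]), ("b", [1.0, 1.0])],
        1: [("c", [9.0, 9.0]), ("d", [10.0, 11.0])],
    },
)
results = index.search([9.0, 9.0], k=2, search_probe=1)
print([vector_id for vector_id, _ in results])
```

Raising `search_probe` reads more postings per query, which tends to improve recall at the cost of more I/O and latency.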

Key HFresh Features

  • Incremental updates without rebuilds — Most updates only affect small regions of vector space. HFresh maintains index quality through incremental rebalancing rather than periodic full rebuilds.
  • Rotational Quantization (RQ) — RQ-8 compresses the centroid index (4x savings) while RQ-1 compresses on-disk postings (32x savings), enabling more vectors per disk read.
  • Tunable performance parameters — searchProbe, replicas, and maxPostingSizeKB allow fine-tuning of recall vs. latency tradeoffs.
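The quoted 4x and 32x savings are consistent with simple bit arithmetic if the uncompressed baseline is 32-bit floats, which is an assumption here rather than something the release notes state. A minimal sketch, with hypothetical function names and ignoring any per-vector metadata a real quantizer may store:

```python
import math


def bytes_per_vector(dimensions, bits_per_dimension):
    """Raw storage for one vector, excluding any quantizer metadata."""
    return math.ceil(dimensions * bits_per_dimension / 8)


def compression_ratio(bits_per_dimension, baseline_bits=32):
    """Size reduction relative to an assumed 32-bit float baseline."""
    return baseline_bits / bits_per_dimension


dims = 1536  # example dimensionality, chosen only for illustration
for label, bits in [("float32", 32), ("RQ-8", 8), ("RQ-1", 1)]:
    print(label, bytes_per_vector(dims, bits), "bytes,", compression_ratio(bits), "x")
```

Under these assumptions a 1536-dimensional vector shrinks from 6144 bytes to 1536 bytes at 8 bits per dimension and to 192 bytes at 1 bit per dimension, which is why more compressed vectors fit into each disk read.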

HFresh currently supports only the cosine and l2-squared distance metrics. It is available as a technical preview, so the API may change in future releases.

Five Features Reach General Availability

Five features that were previously in preview are now generally available:

  • Server-side Batching — The server controls ingestion flow using persistent connections and dynamic backpressure through exponential moving average (EMA) calculations, eliminating manual batch size tuning.
  • Object TTL — Automatic expiration of objects based on configurable time-to-live policies.
  • Async Replication Improvements — Enhanced replication reliability and performance at scale.
  • Drop Inverted Indices — New schema alteration capability to remove inverted indices and reduce storage overhead.
  • Backup Restoration Cancellation — Ability to cancel ongoing backup restoration operations.
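To make the EMA-based backpressure in server-side batching more concrete, here is a minimal sketch of how an exponential moving average of observed throughput could drive the batch size a server asks for. The release notes do not describe Weaviate's actual formula, so the class, its parameters, and the sizing rule are all hypothetical.

```python
class EmaBatchSizer:
    """Hypothetical sizer: smooths throughput with an EMA and derives a batch size."""

    def __init__(self, alpha=0.2, target_seconds=1.0, initial_rate=100.0,
                 min_size=1, max_size=10_000):
        self.alpha = alpha                    # weight given to the newest sample
        self.target_seconds = target_seconds  # desired processing time per batch
        self.rate = initial_rate              # smoothed objects per second
        self.min_size = min_size
        self.max_size = max_size

    def observe(self, objects_processed, seconds_taken):
        """Fold one completed batch into the moving average."""
        if seconds_taken <= 0:
            raise ValueError("seconds_taken must be positive")
        sample = objects_processed / seconds_taken
        self.rate = self.alpha * sample + (1 - self.alpha) * self.rate

    def next_batch_size(self):
        """Batch size expected to take about target_seconds, clamped to bounds."""
        size = int(self.rate * self.target_seconds)
        return max(self.min_size, min(self.max_size, size))


sizer = EmaBatchSizer(alpha=0.5, initial_rate=100.0)
sizer.observe(objects_processed=300, seconds_taken=1.0)  # server is keeping up
print(sizer.next_batch_size())
sizer.observe(objects_processed=50, seconds_taken=1.0)   # server is slowing down
print(sizer.next_batch_size())
```

Because the average smooths individual samples, a single slow batch lowers the requested size gradually rather than abruptly, which is the general appeal of an EMA for flow control.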

Performance Improvements and Community Contributions

The release includes multiple performance optimizations and bug fixes across the platform, with contributions from the Weaviate community.