Knowledge Store for AI

Build a real-time knowledge store for generative AI.

Store vectors, full-text, and structured data in one database. Retrieve with hybrid search and fusion re-ranking. Serve fresh, multi-modal context to LLMs and agents in milliseconds.

Real-Time Fresh · Hybrid Search · One Unified Store
The Real-Time AI Context Stack: Data Sources flow through a Context Pipeline into VeloDB Knowledge Store which serves AI Applications. Key metrics: 94% Relevance, 65ms Latency, 20x Less RAM.
Trusted by AI-forward engineering teams
65ms hybrid search p99
94% retrieval relevance
20x memory reduction
10M/s real-time ingest

The knowledge store that keeps your AI accurate, fast, and current.

Real-time fresh context

CDC, Kafka streaming, and HTTP ingest at 10M rows/s. New data becomes searchable in ~1 second. No more stale embeddings producing confidently wrong answers.
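For a concrete picture, here is a minimal ingest sketch over HTTP using Apache Doris Stream Load, which VeloDB speaks; the endpoint, credentials, database, and table names are placeholders, and CDC or Kafka Routine Load are the equivalent paths for database and stream sources.

```python
# Minimal sketch: push fresh rows over HTTP Stream Load. All connection details
# below are placeholders, not real endpoints.
import json
import uuid
import requests

rows = [
    {"doc_id": "kb-1042",
     "chunk_text": "Refund window is 30 days from delivery.",
     "updated_at": "2024-06-01 09:15:00"},
]

resp = requests.put(
    "http://fe-host:8030/api/knowledge_db/doc_chunks/_stream_load",  # placeholder FE endpoint
    auth=("ingest_user", "ingest_password"),                          # placeholder credentials
    headers={
        "label": f"kb-ingest-{uuid.uuid4()}",  # unique label makes retries idempotent
        "format": "json",
        "read_json_by_line": "true",
        "Expect": "100-continue",
    },
    data="\n".join(json.dumps(r) for r in rows).encode("utf-8"),
    timeout=30,
)
resp.raise_for_status()
# Note: the frontend may redirect the load to a backend node; make sure credentials
# are re-sent on redirect (the equivalent of curl --location-trusted).
print(resp.json().get("Status"))  # "Success" once the rows are committed and searchable
```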

94% retrieval relevance

Hybrid Search fuses vector similarity, BM25 keyword matching, and SQL filters with Reciprocal Rank Fusion. 94% relevance vs 58% with pure vector search.
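Reciprocal Rank Fusion itself is simple: a document's fused score is the sum of 1/(k + rank) over every ranked list it appears in, so documents that rank well across signals rise to the top. A minimal sketch with illustrative candidate lists:

```python
# Minimal RRF sketch: fuse several ranked lists of doc ids into one ranking.
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Score each doc as the sum of 1 / (k + rank) across all rankings."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits  = ["doc7", "doc2", "doc9"]   # from ANN similarity
keyword_hits = ["doc2", "doc7", "doc4"]   # from BM25
print(rrf_fuse([vector_hits, keyword_hits]))
# doc2 and doc7 rank highest because both signals agree on them
```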

One store for every data type

Vectors, full-text, structured tables, and semi-structured JSON in one database. Replace Pinecone + Elasticsearch + PostgreSQL with one query path.
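As a sketch of what "one store" looks like in practice, a single table can hold structured columns, full text, JSON, and an embedding vector side by side. The DDL below follows Apache Doris conventions; the connection details, names, and types are illustrative assumptions and may need adjusting for your VeloDB version.

```python
# Minimal schema sketch executed over the MySQL protocol (placeholder endpoint).
import pymysql

DDL = """
CREATE TABLE IF NOT EXISTS knowledge_db.doc_chunks (
    doc_id      VARCHAR(64),
    category    VARCHAR(32),
    updated_at  DATETIME,
    chunk_text  STRING,            -- full text, can carry an inverted index for BM25
    metadata    JSON,              -- semi-structured attributes
    embedding   ARRAY<FLOAT>       -- dense vector from your embedding model
)
DUPLICATE KEY(doc_id)
DISTRIBUTED BY HASH(doc_id) BUCKETS 8
PROPERTIES ("replication_num" = "1")
"""

conn = pymysql.connect(host="fe-host", port=9030, user="admin", password="secret")
with conn.cursor() as cur:
    cur.execute("CREATE DATABASE IF NOT EXISTS knowledge_db")
    cur.execute(DDL)
conn.close()
```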

Use cases

Where teams deploy VeloDB knowledge stores.

RAG for enterprise knowledge

Ground LLM answers in fresh, internal documents

Policy docs, support tickets, product specs — chunked, embedded, and searchable via hybrid search. The LLM retrieves current context, not a 6-month-old PDF snapshot.

AI training data management

Manage billions of multimodal training samples

AISpeech manages 10B+ samples across 500TB with millisecond version switching, 80% storage reduction, and full data lineage for model reproducibility.

Multimodal search for autonomous systems

Search text, vectors, labels, and metadata in one query

Horizon Robotics queries nearly 1 trillion records across four search modes — text, vector embeddings, bitmap labels, and JSON metadata — on a single engine.

Agent-facing context retrieval

Serve real-time context to AI agents via MCP

Agents query VeloDB directly through the MCP Server protocol. Hybrid search returns structured + semantic context in sub-100ms for agent reasoning loops.
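A minimal sketch of what an agent-facing tool can look like: VeloDB ships its own MCP Server, so treat this only as an illustration that wires a custom retrieval tool with the MCP Python SDK and queries VeloDB over the MySQL protocol. Every identifier here (host, table, columns, credentials) is an assumption.

```python
# Hypothetical agent tool: keyword-filtered, freshness-ordered context retrieval.
import pymysql
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("velodb-context")

def _query(sql: str, params: tuple) -> list[dict]:
    conn = pymysql.connect(host="fe-host", port=9030, user="agent", password="secret",
                           database="knowledge_db",
                           cursorclass=pymysql.cursors.DictCursor)
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()
    finally:
        conn.close()

@mcp.tool()
def retrieve_context(question: str, category: str, limit: int = 5) -> list[dict]:
    """Return the freshest chunks in a category whose text matches the question."""
    sql = """
        SELECT doc_id, chunk_text, CAST(updated_at AS STRING) AS updated_at
        FROM doc_chunks
        WHERE category = %s
          AND chunk_text MATCH_ANY %s      -- keyword match via the inverted index
        ORDER BY updated_at DESC
        LIMIT %s
    """
    return _query(sql, (category, question, limit))

if __name__ == "__main__":
    mcp.run()  # stdio transport by default; point the agent's MCP client at this process
```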

Where VeloDB sits

The real-time AI context stack.

Raw data enters through CDC, streaming, or HTTP push. An embedding pipeline like CocoIndex transforms, chunks, and embeds your documents with incremental processing -- only changed documents are reprocessed. VeloDB stores everything in one engine and serves AI applications through hybrid search via MCP Server, MySQL protocol, or REST API.
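A minimal sketch of that pipeline stage (not CocoIndex's actual API): chunk a document, embed each chunk, and skip documents whose content hash has not changed. The model choice, chunk size, and hash-based change check are illustrative assumptions.

```python
# Incremental chunk-and-embed sketch: only changed documents are reprocessed.
import hashlib
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # placeholder embedding model
_seen_hashes: dict[str, str] = {}                 # doc_id -> last processed content hash

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size character chunks with a small overlap to preserve context."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def process(doc_id: str, text: str) -> list[dict]:
    digest = hashlib.sha256(text.encode()).hexdigest()
    if _seen_hashes.get(doc_id) == digest:
        return []                                 # unchanged document: nothing to re-embed
    _seen_hashes[doc_id] = digest
    chunks = chunk(text)
    vectors = model.encode(chunks)                # one embedding per chunk
    return [
        {"doc_id": doc_id, "chunk_id": i, "chunk_text": c, "embedding": v.tolist()}
        for i, (c, v) in enumerate(zip(chunks, vectors))
    ]

rows = process("kb-1042", "Refund window is 30 days from delivery. ...")
# Next step: stream `rows` into the doc_chunks table via Stream Load or an INSERT.
```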

The real-time AI context stack: data sources, context pipeline, VeloDB knowledge store, AI consumers
How VeloDB delivers

Progressive filtering. Hybrid search. Fusion re-ranking.

VeloDB doesn't run three searches in parallel and hope for the best. It progressively narrows the candidate set at each stage -- structured filtering first, keyword matching second, vector search third. Expensive operations run on a tiny, pre-filtered subset.

Progressive filtering pipeline: SQL filters narrow millions to thousands, BM25 scores keyword relevance, Vector ANN finds semantic matches, RRF combines rankings.
1. Structured Pre-Filtering (~50ms)

B-tree indexes, zone maps, and partition pruning apply hard constraints -- location, date, category, access permissions. The candidate set drops from millions to thousands before any search operation begins.

2. BM25 Full-Text (~200ms)

Inverted indexes with global BM25 scoring find exact keyword matches. Unlike per-segment scoring (which caused ranking instability at ByteDance), VeloDB calculates statistics across the entire table. Supports English, Chinese, and multilingual tokenizers.
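A minimal sketch of this stage, assuming Apache Doris inverted-index DDL; the index name, parser choice, table, and connection details are illustrative, and scoring behavior varies by version.

```python
# Build an inverted index for keyword matching, then query it (placeholder endpoint).
import pymysql

conn = pymysql.connect(host="fe-host", port=9030, user="admin", password="secret",
                       database="knowledge_db")
with conn.cursor() as cur:
    # One-time: tokenize chunk_text with the English parser for keyword retrieval.
    cur.execute("""
        CREATE INDEX idx_chunk_text ON doc_chunks (chunk_text)
        USING INVERTED PROPERTIES ("parser" = "english")
    """)
    # MATCH_ANY hits the inverted index instead of scanning every row.
    cur.execute("""
        SELECT doc_id, chunk_text
        FROM doc_chunks
        WHERE chunk_text MATCH_ANY 'refund policy'
        LIMIT 20
    """)
    for doc_id, text in cur.fetchall():
        print(doc_id, text[:80])
conn.close()
```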

3. Vector Similarity + Fusion (~150ms)

HNSW indexes run semantic search on the pre-filtered subset. IVPQ compresses 768-dim vectors from 3KB to 8 bytes (384x ratio). Reciprocal Rank Fusion combines rankings from all three stages: documents that score well across multiple signals rank highest.
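A minimal sketch of this stage: semantic ordering runs only on the rows that survive the structured filters, and the resulting ranking is then fused with the keyword ranking via RRF as sketched earlier. The cosine_distance call, the array literal, and all names are assumptions about the SQL surface; in 4.0 the HNSW index accelerates this ordering.

```python
# Vector ranking on a pre-filtered subset (placeholder endpoint and query vector).
import pymysql

query_vec = "array(0.12, -0.03, 0.88)"   # placeholder; use the real embedding of your query

conn = pymysql.connect(host="fe-host", port=9030, user="admin", password="secret",
                       database="knowledge_db")
with conn.cursor() as cur:
    cur.execute(f"""
        SELECT doc_id, chunk_text,
               cosine_distance(embedding, {query_vec}) AS dist
        FROM doc_chunks
        WHERE category = 'support'            -- step 1: hard structured constraints
          AND updated_at >= '2024-01-01'
        ORDER BY dist ASC                      -- step 3: nearest neighbors among the survivors
        LIMIT 20
    """)
    for doc_id, text, dist in cur.fetchall():
        print(f"{dist:.4f}  {doc_id}  {text[:60]}")
conn.close()
```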

Total: 65ms p99 on 28.7M records, ~400ms at billion scale (the per-step timings above reflect the billion-record workload). 94% relevance.
Customer proof

Production results. Not benchmarks.

ByteDance

ByteDance needed to search 1 billion+ vectors for talent matching. Pure vector search delivered only 58% relevance -- a recruiter searching for Python developers in San Francisco got candidates from Seattle. Rankings shuffled every time database segments merged because BM25 was calculated per-segment, not globally.

Apache Doris 4.0 with progressive filtering changed everything: structured constraints first (50ms), keyword matching second (200ms), vector search third (100ms), fusion ranking last (50ms). IVPQ compression reduced 768-dimension vectors from 3,072 bytes to 8 bytes -- a 384x compression ratio. Relevance jumped from 58% to 94%. Latency dropped from 2.8 seconds to 400 milliseconds. Memory shrank from 10TB across 20-30 servers to 500GB on one server.

7x faster: 2.8s down to 400ms
20x less memory
94% retrieval relevance
1 server replaced 20-30

AISpeech / 思必驰

Conversational AI company managing 10 billion+ multimodal training samples across 500TB. Before: training data was scattered across different storage systems and maintained manually by different teams, conflicting data versions undermined model consistency, and algorithm engineers wasted time searching for and re-annotating data. After building on Apache Doris: columnar storage compressed annotation data by 80%; version-based partitioning enables millisecond dataset switching, with active versions on SSD and history auto-migrating to HDD; point-query QPS hit 30,000 with row-store optimization, cutting CPU utilization from 80% to 10%. AISpeech is now planning an upgrade to Doris 4.0 for vector search to fully replace Elasticsearch.

10B+ samples · 80% storage reduction · 30K QPS · 200+ users

Horizon Robotics

Autonomous driving company processing petabytes daily. Replaced three separate systems -- Hive/Iceberg for analytics, Zilliz for vectors, Elasticsearch for search -- with one Doris engine. Four search modes unified: text, vector, label bitmap operations, and JSON metadata. Engineers stopped hopping between systems. Query times dropped from minutes to seconds on approximately 1 trillion records.

3 --> 1 systems · ~1T records · Minutes --> Seconds
Social proof

Trusted by teams building AI at scale.

Get started

Start building your knowledge store.

How to Chunk and Embed

Step-by-step guide to preparing your data for VeloDB's hybrid search. Chunking strategies, embedding models, and indexing best practices.

Read the guide -->

Working with Vectors in VeloDB

Create HNSW indexes, run vector similarity queries, configure IVPQ compression, and build hybrid search with RRF.

Read the guide -->

Try VeloDB Cloud

14-day free trial for SaaS. 30 days of free compute for BYOC. No credit card required.

Start Free Trial -->

Bring your retrieval workload.
We will map the path.

A 30-minute architecture review with a VeloDB solutions engineer. Bring your schema, your query patterns, and your freshness requirements. We will show you where a unified knowledge store eliminates complexity.

Data sources and embedding pipeline
Hybrid search requirements
Freshness target and ingest cadence
Current systems VeloDB can replace
Performance targets (QPS, latency, recall)