AI · Vector Databases
Cost, freshness, recall — pick two cleanly.
Embedding selection, index design, hybrid search, and ANN tuning across pgvector, FAISS, Qdrant, Milvus, and managed stores.
Overview
The right vector store depends on where you draw the line between cost, freshness, and recall — and on whether you want operational responsibility for the index.
What it is
Storage and search for embeddings.
A vector database stores embeddings — high-dimensional numeric representations of text, images, or other content — and answers nearest-neighbor queries against them. It is the substrate beneath any retrieval system that wants to match by meaning rather than exact tokens.
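To make "nearest-neighbor queries" concrete, here is a minimal sketch of exact, brute-force search by cosine similarity in plain Python. The document ids, the toy 3-dimensional vectors, and the function names are illustrative assumptions. Production stores replace the full scan with an approximate (ANN) index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, vectors, k=2):
    """Exact top-k search: score every stored vector, keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy "embeddings"; real ones have hundreds or thousands of dimensions.
store = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "return-window": [0.8, 0.2, 0.1],
}
results = nearest_neighbors([1.0, 0.0, 0.0], store, k=2)
print([doc_id for doc_id, _ in results])  # ['refund-policy', 'return-window']
```

The scan costs time proportional to the number of stored vectors, which is why larger corpora move to ANN indexes that trade a little recall for much lower latency.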
Choosing one is an engineering trade-off, not a brand decision. We pick based on measured recall, freshness needs, total cost of ownership, and whether you want a database to run or a service to consume.
Workflow
The trade-off triangle and the decision flow.
- Trade-off triangle: vertices Cost, Freshness, and Recall. The interior point shows the deliberate compromise.
- If the data already lives in Postgres and the vectors can sit beside it, pgvector is the default.
- Otherwise, decide whether to self-host. If you self-host and latency is critical, choose FAISS, which is an in-process library rather than a server. If you self-host and latency is not critical, choose Qdrant or Milvus.
- If not self-hosting, choose a managed service such as Pinecone or Weaviate Cloud.
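The decision flow above can be sketched as a small function. The function name and its three boolean parameters are assumptions made for illustration. The branches mirror the bullets and are a starting point, not a substitute for measuring recall on your own queries.

```python
def recommend_store(in_postgres: bool, self_host: bool, latency_critical: bool) -> str:
    """Walk the decision flow from the bullets above and return a recommendation."""
    if in_postgres:
        # Workload already sits in Postgres: keep vectors next to the data.
        return "pgvector"
    if self_host:
        # Self-hosting: split on whether latency is the dominant constraint.
        return "FAISS" if latency_critical else "Qdrant or Milvus"
    # Not self-hosting: consume a managed service.
    return "managed service such as Pinecone or Weaviate Cloud"

print(recommend_store(in_postgres=False, self_host=True, latency_critical=False))
```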
Deliverables
What you walk away with.
- Embedding-model selection grounded in domain evaluation, not benchmark folklore.
- Index design: dimensionality, metric, ANN parameters, and the recall/latency trade-off you chose.
- Hybrid search: dense plus sparse plus filters, with metadata schema and query patterns documented.
- Capacity and cost model: vectors, replicas, freshness window, and the unit economics of growth.
- Migration plan when the stack you started with is not the stack you should run at scale.
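As a rough illustration of the capacity model, the sketch below estimates raw vector storage only. It assumes uncompressed float32 values (4 bytes each) and ignores index overhead, metadata, and quantization. The function name and the example figures are hypothetical, not measurements.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4, replicas: int = 1) -> int:
    """Bytes needed to hold the raw vectors, before any index or metadata overhead."""
    return num_vectors * dims * bytes_per_value * replicas

GIB = 1024 ** 3

# Hypothetical corpus: 10 million vectors, two replicas, at two candidate dimensionalities.
for dims in (1536, 768):
    size = raw_vector_bytes(10_000_000, dims, replicas=2)
    print(f"{dims} dims: {size / GIB:.1f} GiB")
```

Halving the dimensionality halves this figure, which is one reason the embedding choice belongs in the cost model rather than being treated as a default.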
Pitfalls
How we don't do it.
- Picking a vector store based on a Twitter thread instead of measured recall on your own queries.
- Defaulting to a 1536-dim embedding without checking what it costs in storage and latency, or whether it actually improves recall on your data.
- Treating the vector index as the only retrieval signal — ignoring filters, BM25, and metadata.
- Indexing once, never re-embedding, and wondering why quality decays as the corpus shifts.
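Measuring recall on your own queries does not require much machinery. Below is a minimal sketch of recall@k, assuming you have, for each query, the ids the store returned and a ground-truth set (either labeled relevant documents or the results of an exact search). The function names and sample ids are illustrative.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the ground-truth ids that appear in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

def mean_recall_at_k(runs, k):
    """Average recall@k over (retrieved, relevant) pairs, one pair per query."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)

# Hypothetical results for two queries.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # both relevant ids are in the top 3
    (["x", "y", "z"], {"y", "q"}),       # one of the two relevant ids is in the top 3
]
print(mean_recall_at_k(runs, k=3))  # 0.75
```

Re-running the same harness after each re-embedding or index change shows whether quality is holding as the corpus shifts.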
Engagement
How we work with you.
- 01 Discover: Corpus volume, query patterns, freshness needs, and the budget envelope.
- 02 Architect: Embedding model, index, metric, and the trade-offs you accept by design.
- 03 Build: Ingest, hybrid search, filters, and the eval harness that measures recall.
- 04 Operate: Re-embedding cadence, capacity reviews, and a migration path when needs shift.
Pick the store that fits the workload.
Tell us your corpus size, freshness window, and budget. We'll come back with measured recall and a clear recommendation.