Interests

Areas I think about, research, and want to go deeper on.

Vector Search at Scale

The engineering challenges of building high-recall, low-latency vector search across hundreds of millions of vectors: indexing strategies, quantization tradeoffs, and how approximate and exact search methods complement each other.

hnsw, ann, quantization, dedicated-search-nodes

Hybrid & Multimodal Retrieval

Combining lexical and semantic signals for more robust retrieval, and extending that to multimodal inputs (text, images, charts, and structured metadata) within a unified query interface.

hybrid-search, rank-fusion, multimodal, embeddings
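
To make "combining lexical and semantic signals" concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common rank-fusion method. It is an illustration under assumptions, not a description of any particular engine: the function name, the document IDs, and the constant `k = 60` (a conventional default) are all hypothetical choices made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by multiple retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a lexical (e.g. BM25) and a semantic retriever.
lexical = ["doc_a", "doc_b", "doc_c"]
semantic = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

RRF uses only rank positions, so it sidesteps the problem of lexical and vector scores living on incomparable scales.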

Knowledge Distillation for Embedding Models

Techniques for compressing large, high-quality embedding models into smaller, faster ones without significant accuracy loss, with direct relevance to the LEAF framework and efficient retrieval in resource-constrained environments.

knowledge-distillation, embedding-models, llms, leaf

LEAF: Knowledge Distillation of Text Embedding Models

Multivector Retrieval

Late interaction models (ColBERT-style) and multi-vector representations that store multiple embeddings per document for richer, token-level matching, along with how retrieval engines are evolving to support them.

colbert, late-interaction, multivector, retrieval

Reflections on multivector retrieval engines
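
As a rough illustration of token-level matching, the sketch below computes a ColBERT-style MaxSim score: each query token embedding is matched to its most similar document token embedding, and those maxima are summed. The tiny 2-dimensional vectors and the function names are hypothetical, chosen only to keep the example readable; real models use learned, higher-dimensional token embeddings.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the best cosine match
    among the document's token vectors."""
    q = [_normalize(v) for v in query_vecs]
    d = [_normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


# Toy "token embeddings": two query tokens, two candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_close = [[1.0, 0.0], [0.0, 2.0]]  # one well-aligned vector per query token
doc_far = [[1.0, 1.0]]                # a single vector halfway between both

print(maxsim_score(query, doc_close))
print(maxsim_score(query, doc_far))
```

Because every document keeps one vector per token, storage and scoring cost grow with document length, which is part of why engine-level support matters.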

AI-Native Application Architecture

How developers build applications where LLMs and vector databases are first-class primitives, including RAG pipelines, agentic patterns, and the evolving role of the database in AI stacks.

rag, ai-agents, vector-databases, application-architecture

Disaggregated Storage & Compute

How separating storage from compute (as in Aurora, Snowflake, and other cloud-native databases) changes what's possible for scalability and cost. This is an area I'm currently exploring through hands-on projects.

distributed-systems, cloud-databases, storage-compute-separation