How it works - Hybrid retrieval, source-attributed, no summarising layer

Dense plus sparse plus re-ranked. Every chunk is returned with its source document, page, and snippet, so the AI client — and the human reading along — can always verify the answer.

Topic: Agentic retrieval
Pipeline: Dense + sparse + re-rank
Output: Source-attributed chunks

Three stages, in order

Hybrid retrieval is not a marketing phrase. It’s three concrete stages, each doing a job the others can’t.

Dense retrieval uses vector embeddings to find chunks whose meaning matches the query. It handles paraphrase, synonyms, and the case where the user has no idea what the document calls the thing they’re asking about.
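As a rough illustration of the dense stage, the sketch below ranks chunks by cosine similarity between a query vector and chunk vectors. The three-dimensional vectors and chunk IDs are made up for the example and stand in for a real embedding model; nothing here describes Laminae's actual implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def dense_search(query_vec, chunk_vecs, k=2):
    """Return the k chunk IDs whose vectors are closest in direction to the query vector."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy embeddings; a real system would get these from an embedding model.
CHUNK_VECS = {
    "contract.pdf#p4": [0.9, 0.1, 0.0],
    "handbook.pdf#p12": [0.1, 0.8, 0.2],
    "invoice.pdf#p1": [0.0, 0.2, 0.9],
}

top_hits = dense_search([1.0, 0.0, 0.0], CHUNK_VECS, k=1)
```

The query never has to share a word with the chunk. Only the direction of the vectors matters, which is why this stage copes with paraphrase.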

Sparse retrieval is the classic keyword search you already know. It finds chunks whose exact words match the query, and it catches the things dense retrieval misses: identifiers, product codes, legal citations, anything where the exact string matters more than the meaning.
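A minimal sketch of keyword scoring with the standard Okapi BM25 formula shows why exact identifiers survive this stage. The tokenizer, the sample chunks, and the parameter defaults (k1 = 1.5, b = 0.75) are assumptions chosen for the example, not a description of Laminae's index.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and keep letters, digits, and hyphens so codes like 'sku-4471-b' stay whole."""
    return re.findall(r"[a-z0-9-]+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every chunk in docs ({chunk_id: text}) against the query with Okapi BM25."""
    if not docs:
        return {}
    tokenized = {chunk_id: tokenize(text) for chunk_id, text in docs.items()}
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))  # count each term once per chunk
    query_terms = tokenize(query)
    scores = {}
    for chunk_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        scores[chunk_id] = score
    return scores

# Hypothetical chunks: only one contains the exact product code.
DOCS = {
    "spec.pdf#p3": "Replacement part SKU-4471-B ships within five days",
    "faq.pdf#p1": "Replacement parts usually ship within a week",
}

sparse_scores = bm25_scores("SKU-4471-B", DOCS)
```

The two chunks are about the same topic, but only the one containing the exact code gets a non-zero score.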

A re-ranker takes the combined candidate set from both stages and decides which of those chunks actually answer the question. This is where most “RAG demos” fall over — they skip the re-rank, ship the top-k vector results, and the answer quality collapses on anything more nuanced than a paraphrase.
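The hand-off from the two retrieval stages to the re-ranker can be sketched as follows. The function names are hypothetical, and overlap_score is a deliberately crude stand-in: a production re-ranker would typically be a learned model that scores each query and chunk pair, which this example does not attempt to reproduce.

```python
def combine_candidates(dense_hits, sparse_hits):
    """Union the two candidate lists, remembering which stage(s) produced each chunk."""
    stages = {}
    for stage, hits in (("dense", dense_hits), ("sparse", sparse_hits)):
        for chunk_id in hits:
            stages.setdefault(chunk_id, []).append(stage)
    return stages

def overlap_score(query, text):
    """Toy relevance score: fraction of query words that appear in the chunk text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0

def rerank(query, candidates, texts, score_fn, top_n=3):
    """Score every candidate against the query and keep the top_n highest-scoring chunks."""
    scored = [(chunk_id, score_fn(query, texts[chunk_id])) for chunk_id in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Hypothetical chunk texts and stage outputs.
TEXTS = {
    "a": "termination notice period is 30 days",
    "b": "notice board in the office kitchen",
    "c": "payment terms are net 60",
}
candidates = combine_candidates(dense_hits=["a", "b"], sparse_hits=["b", "c"])
ranked = rerank("termination notice period", candidates, TEXTS, overlap_score, top_n=2)
```

Chunk "b" was found by both stages and still ends up below "a". Being retrieved is not the same as answering the question, and that is the job of the re-rank step.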

Evidence, not narrative

There is no summarising layer in the middle of Laminae. When the MCP client asks for relevant context, what comes back is the chunks themselves — each one stamped with its source document, page, and snippet.

That is intentional. The moment retrieval starts paraphrasing the source, provenance quietly starts to die. A source attribution either maps to a specific span of text in a specific document or it means nothing, and once an LLM has summarised the text, that mapping is gone.
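One way to picture a source-attributed chunk is a small immutable record carrying the three fields named above, plus a check that the snippet really appears on the cited page. The class name, field names, and sample page text are assumptions for illustration and are not Laminae's actual response schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AttributedChunk:
    """A retrieved chunk with its provenance: source document, page, and verbatim snippet."""
    document: str
    page: int
    snippet: str

def verify(chunk, pages):
    """Return True only if the snippet appears verbatim on the cited page.

    pages maps (document, page) to the full text of that page.
    """
    page_text = pages.get((chunk.document, chunk.page))
    return page_text is not None and chunk.snippet in page_text

# Hypothetical source page.
PAGES = {
    ("contract.pdf", 4): "Either party may terminate with 30 days written notice.",
}

verbatim = AttributedChunk("contract.pdf", 4, "terminate with 30 days written notice")
paraphrased = AttributedChunk("contract.pdf", 4, "a month of notice is needed to cancel")
```

The verbatim chunk passes the check and the paraphrased one fails, even though both say roughly the same thing. That failure is what a summarising layer would introduce.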

What this enables

Because every chunk that reaches the AI client carries its provenance, the client can cite the source in its answer, the user can verify against the original document, and your legal team can audit the trail months later. None of that is possible with a black-box “ask my data” system.

It also means a bad answer can be debugged. Each query emits a retrieval trace: which chunks came from which stage, what scores the re-ranker assigned, what the AI client ultimately used. When something goes wrong, you can see exactly where. We can use this concrete evidence to make retrieval even better over time.
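A retrieval trace of the kind described above could be modelled roughly like this. The field names and the explain helper are hypothetical, chosen only to mirror the three things the trace is said to record: stage, re-ranker score, and what the client used.

```python
from dataclasses import dataclass, field

@dataclass
class RetrievalTrace:
    """What happened to each chunk during one query."""
    query: str
    stage_hits: dict = field(default_factory=dict)      # stage name -> list of chunk IDs
    rerank_scores: dict = field(default_factory=dict)   # chunk ID -> re-ranker score
    used_by_client: list = field(default_factory=list)  # chunk IDs the AI client cited

def explain(trace, chunk_id):
    """Describe how far a chunk got through the pipeline."""
    stages = [stage for stage, hits in trace.stage_hits.items() if chunk_id in hits]
    if not stages:
        return "never retrieved"
    found_by = "+".join(stages)
    if chunk_id not in trace.rerank_scores:
        return f"retrieved by {found_by} but not scored by the re-ranker"
    score = trace.rerank_scores[chunk_id]
    if chunk_id not in trace.used_by_client:
        return f"retrieved by {found_by}, scored {score:.2f}, not used by the client"
    return f"retrieved by {found_by}, scored {score:.2f}, used by the client"

# Hypothetical trace for a single query.
trace = RetrievalTrace(
    query="termination notice period",
    stage_hits={"dense": ["a", "b"], "sparse": ["b", "c"]},
    rerank_scores={"a": 0.91, "b": 0.34, "c": 0.05},
    used_by_client=["a"],
)
```

Calling explain(trace, "b") shows that the chunk was found by both stages and scored, but never reached the answer. That tells you which stage to look at when an answer is wrong.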

What this is not

Not a chat UI
Not a RAG wrapper
Not fine-tuning
Not summarising
Not real-time streaming
Client-agnostic

If you can't trace an answer back to a specific chunk in a specific document, it shouldn't have been part of the answer in the first place.

David Bunting, Founder, Laminae

Vector retrieval: Dense
BM25 keyword: Sparse
Final filter: Re-rank
Every query: Traced


