How it works - Hybrid retrieval, source-attributed, no summarising layer
Dense plus sparse plus re-ranked. Every chunk is returned with its source document, page, and snippet, so the AI client — and the human reading along — can always verify the answer.
- Topic: Agentic retrieval
- Pipeline: Dense + sparse + re-rank
- Output: Source-attributed chunks

Three stages, in order
Hybrid retrieval is not a marketing phrase. It’s three concrete stages, each doing a job the others can’t.
Dense retrieval uses vector embeddings to find chunks whose meaning matches the query. It handles paraphrase, synonyms, and the case where the user has no idea what the document calls the thing they’re asking about.
Sparse retrieval is the classic keyword search you already know: it finds chunks whose exact words match the query. It catches the things dense retrieval misses: identifiers, product codes, legal citations, anything that’s precise rather than meaningful.
A re-ranker takes the combined candidate set from both stages and decides which of those chunks actually answer the question. This is where most “RAG demos” fall over — they skip the re-rank, ship the top-k vector results, and the answer quality collapses on anything more nuanced than a paraphrase.
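The three stages above can be sketched in a few lines. This is a minimal illustration, not Laminae's implementation: `hybrid_retrieve`, the toy corpus, and the `toy_dense`, `toy_sparse`, and `toy_rerank` stand-ins are all hypothetical names. A real system would presumably use an embedding index for the dense stage, something like BM25 for the sparse stage, and a trained model for the re-rank.

```python
import re
from typing import Callable, Sequence

# Hypothetical toy corpus: chunk id -> chunk text.
TEXTS = {
    "c1": "Invoice INV-2041 was issued to Acme in March.",
    "c2": "The bill sent to the customer is overdue.",
    "c3": "Office plants need watering weekly.",
}
# Crude stand-in for the "similar meaning" an embedding model would capture.
SYNONYMS = {"invoice": "bill"}


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; hyphens are kept so identifiers stay whole."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def _rank_by_overlap(query_terms: set[str], k: int) -> list[str]:
    """Rank chunks by how many query terms they contain; drop zero matches."""
    hits = [(len(query_terms & tokens(t)), cid) for cid, t in TEXTS.items()]
    hits = [h for h in hits if h[0] > 0]
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [cid for _, cid in hits[:k]]


def toy_sparse(query: str, k: int) -> list[str]:
    """Stand-in for keyword search: exact token matches only."""
    return _rank_by_overlap(tokens(query), k)


def toy_dense(query: str, k: int) -> list[str]:
    """Stand-in for vector search: also matches listed synonyms."""
    q = tokens(query)
    expanded = q | {SYNONYMS[t] for t in q if t in SYNONYMS}
    return _rank_by_overlap(expanded, k)


def toy_rerank(query: str, text: str) -> float:
    """Stand-in for a re-ranker: fraction of query tokens found in the chunk."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def hybrid_retrieve(
    query: str,
    dense_search: Callable[[str, int], Sequence[str]],
    sparse_search: Callable[[str, int], Sequence[str]],
    rerank_score: Callable[[str, str], float],
    texts: dict[str, str],
    k: int = 20,
    top_n: int = 5,
) -> list[tuple[str, float]]:
    # Stages 1 and 2: each retriever proposes candidates independently.
    dense_ids = list(dense_search(query, k))
    sparse_ids = list(sparse_search(query, k))
    # Combine into one candidate set, dropping duplicates, keeping order.
    candidates = list(dict.fromkeys(dense_ids + sparse_ids))
    # Stage 3: the re-ranker scores every candidate against the query.
    scored = [(cid, rerank_score(query, texts[cid])) for cid in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


results = hybrid_retrieve("invoice INV-2041", toy_dense, toy_sparse, toy_rerank, TEXTS)
print(results)
```

In this toy run, only the dense stand-in proposes `c2` (it matches "bill" for "invoice"), and the re-rank step then places the chunk containing the exact identifier first. Skipping the final step would mean shipping whatever order the first stage happened to produce.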
Evidence, not narrative
There is no summarising layer in the middle of Laminae. When the MCP client asks for relevant context, what comes back is the chunks themselves — each one stamped with its source document, page, and snippet.
That is intentional. The moment retrieval starts paraphrasing the source, provenance starts dying quietly. Source attribution either lines up to a specific span of text in a specific document or it doesn’t mean anything — and once an LLM has summarised it, it doesn’t.
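One way to picture "lines up to a specific span of text in a specific document" is a check that the snippet appears verbatim on the page it claims to come from. The sketch below assumes a hypothetical `AttributedChunk` shape built from the three fields named above (source document, page, snippet); the field names, the `snippet_is_verbatim` helper, and the sample file name are illustrative, not Laminae's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributedChunk:
    """Hypothetical shape of a returned chunk: text plus its provenance."""
    source_document: str
    page: int
    snippet: str


def snippet_is_verbatim(
    chunk: AttributedChunk, pages: dict[tuple[str, int], str]
) -> bool:
    """True only if the snippet is an exact span of the cited page's text."""
    page_text = pages.get((chunk.source_document, chunk.page))
    return page_text is not None and chunk.snippet in page_text


# Hypothetical page store: (document name, page number) -> page text.
pages = {
    ("handbook.pdf", 12): "Refunds are issued within 14 days of a written request.",
}

verbatim = AttributedChunk("handbook.pdf", 12, "issued within 14 days")
paraphrased = AttributedChunk("handbook.pdf", 12, "refunds take about two weeks")

print(snippet_is_verbatim(verbatim, pages))
print(snippet_is_verbatim(paraphrased, pages))
```

The paraphrased snippet fails the check even though it says roughly the same thing, which is the point: once the text has been reworded, there is no longer an exact span to point at.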
What this enables
Because every chunk that reaches the AI client carries its provenance, the client can cite the source in its answer, the user can verify against the original document, and your legal team can audit the trail months later. None of that is possible with a black-box “ask my data” system.
It also means a bad answer can be debugged. Each query emits a retrieval trace: which chunks came from which stage, what scores the re-ranker assigned, what the AI client ultimately used. When something goes wrong, you can see exactly where. We can use this concrete evidence to make retrieval even better over time.
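A retrieval trace of this kind could be modelled as one record per chunk. The sketch below is an assumption about what such a record might hold, based only on the three things listed above (originating stage, re-ranker score, whether the client used the chunk); `TraceEntry` and both helper functions are hypothetical names, and the scores are made-up sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """Hypothetical trace record for one candidate chunk in one query."""
    chunk_id: str
    stages: tuple[str, ...]   # which retrievers proposed it: "dense", "sparse"
    rerank_score: float       # score assigned by the re-ranker
    used_by_client: bool      # whether the AI client used it in its answer


def only_from(trace: list[TraceEntry], stage: str) -> list[str]:
    """Chunks that a single stage found and the other missed."""
    return [e.chunk_id for e in trace if e.stages == (stage,)]


def used_despite_low_score(trace: list[TraceEntry], threshold: float) -> list[str]:
    """Chunks the client relied on even though the re-ranker scored them low."""
    return [
        e.chunk_id
        for e in trace
        if e.used_by_client and e.rerank_score < threshold
    ]


# Made-up sample trace for a single query.
trace = [
    TraceEntry("c1", ("dense", "sparse"), 0.92, True),
    TraceEntry("c2", ("dense",), 0.31, True),
    TraceEntry("c3", ("sparse",), 0.12, False),
]

print(only_from(trace, "dense"))
print(used_despite_low_score(trace, 0.5))
```

Queries like these are what make a bad answer debuggable: a chunk that only one stage found, scored low, and still ended up in the answer is an obvious place to start looking.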
What this is not
- Not a chat UI
- Not a RAG wrapper
- Not fine-tuning
- Not summarising
- Not real-time streaming
- Client-agnostic
If you can't trace an answer back to a specific chunk in a specific document, it shouldn't have been part of the answer in the first place.
- Vector retrieval: Dense
- BM25 keyword: Sparse
- Final filter: Re-rank
- Every query: Traced