How it works - Upload to a bucket, expose it as an MCP server, done
Buckets are managed, vectorised knowledge stores. MCP servers are how your AI client talks to them. Non-technical teams can spin up both in an afternoon — no data team required.
- Topic: Buckets and MCPs
- Interface: MCP (open standard)
- Clients: Any MCP-compatible

A bucket is a managed knowledge store
You create a bucket for each knowledge domain that should stay separate: a policy library, a claims handbook, a product catalogue, a year of board minutes. Each bucket has its own access list, its own ingestion pipeline, and its own MCP endpoint. If a question spans several buckets, a single MCP server can be assigned more than one of them whilst preserving the data separation between them.
Today buckets accept PDF, DOCX, CSV, JSON, plain text, Markdown, and Confluence exports, and can connect directly to Confluence, GitLab, and GitHub.
Audio and video are on the roadmap, but not in the product yet. Behind the scenes Laminae handles chunking, embedding, sparse indexing, and the hard reference back to the source document — the part you can’t bolt on later.
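To illustrate why the reference back to the source has to be captured at ingest, here is a minimal sketch of a chunking step that records where each chunk came from. It is not Laminae's pipeline: the `Chunk` type, the `chunk_document` function, and the window sizes are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source: str  # name of the document the text came from
    start: int   # character offset of the chunk within the source
    end: int     # exclusive end offset
    text: str


def chunk_document(source: str, text: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into overlapping fixed-size windows, keeping offsets into the source."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        end = min(start + size, len(text))
        chunks.append(Chunk(source, start, end, text[start:end]))
        if end == len(text):
            break  # the last window already reaches the end of the document
    return chunks
```

Because every chunk carries its source name and offsets from the moment it is created, anything built on top (embedding, indexing, retrieval) can hand that attribution back without having to reconstruct it.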
An MCP server is the front door
MCP, the Model Context Protocol, is an open standard for connecting AI clients to external context. Any bucket, or group of buckets, can be exposed through its own MCP server on a stable URL with per-tenant auth. Any MCP-compatible client can connect: Claude Desktop, Cursor, Continue, or an internal Copilot you’ve built in-house.
There’s no Laminae SDK to install and no wrapper to maintain. If the open standard evolves, your integration evolves with it. If Laminae goes away tomorrow, your team keeps using the same client against whichever MCP server comes next.
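For a sense of what "no SDK" means on the wire: MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the `tools/call` method. The sketch below builds such a request by hand. The tool name `search_bucket` and its `query` and `top_k` arguments are hypothetical; the tools a given server actually exposes are discovered by the client at connect time.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(1, "search_bucket", {"query": "notice period", "top_k": 5})
```

In practice the AI client constructs these messages itself; the point is that nothing in the request is specific to one vendor.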
You bring your own AI client
The AI client your team already uses is better than anything we would build — our job is to make it smarter on your own data. The user’s day-to-day looks like their existing tool, just suddenly informed about the company they actually work at.
When the client asks a Laminae MCP server a question, what comes back is auditable: the matching chunks, each with source attribution attached. The client generates the answer; the human reading it can always verify it against the source.
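As a sketch of how a client could carry that attribution through to the reader, the function below numbers each returned chunk and keeps its source beside the text. The payload shape is an assumption: the `text`, `source`, and `page` fields are hypothetical names, not a documented Laminae response format.

```python
def render_context(chunks: list[dict]) -> str:
    """Format retrieved chunks as numbered, source-attributed context lines."""
    lines = []
    for number, chunk in enumerate(chunks, start=1):
        # Keep the citation next to the text so an answer can point back to it.
        lines.append(f"[{number}] {chunk['text']} (source: {chunk['source']}, p. {chunk['page']})")
    return "\n".join(lines)


# Hypothetical retrieval result, for illustration only.
example_chunks = [
    {"text": "Claims must be filed within 30 days.", "source": "claims-handbook.pdf", "page": 12},
    {"text": "Late claims need manager approval.", "source": "claims-handbook.pdf", "page": 14},
]
context = render_context(example_chunks)
```

An answer that cites `[1]` or `[2]` can then be checked against the named document and page.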
Why repeat uploads quietly cost you money
Without persistent storage, the default pattern is the same one everyone falls into: drag the PDF into the chat, ask the question, get the answer. Then close the conversation. Tomorrow, the next person on the team needs something from the same document — they drag it in again. Same document, same embedding work, billed again.
Multiply that across a policy library, a claims handbook, a year of board minutes, and the entire customer-success team, and you’re paying to convert the same text into the same vectors thousands of times. Laminae embeds each document once, on ingest, and serves the relevant chunks on demand. The unit economics stop drifting upward as your team uses the system more.
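A back-of-the-envelope version of that arithmetic, as a sketch. Every figure here is hypothetical (document count, document length, team size, questions per person), and the comparison counts only document tokens processed, ignoring the much smaller cost of embedding each question.

```python
def tokens_repeat_upload(doc_tokens: int, questions: int) -> int:
    """Each question re-uploads a whole document, so the full text is processed again."""
    return doc_tokens * questions


def tokens_ingest_once(doc_tokens: int, documents: int) -> int:
    """Each document is processed exactly once, at ingest."""
    return doc_tokens * documents


# Hypothetical figures, for illustration only.
DOC_TOKENS = 60_000    # average tokens per document
DOCUMENTS = 40         # documents in the library
QUESTIONS = 25 * 20    # 25 people asking 20 questions each per month

repeat = tokens_repeat_upload(DOC_TOKENS, QUESTIONS)  # 30,000,000 tokens every month
once = tokens_ingest_once(DOC_TOKENS, DOCUMENTS)      # 2,400,000 tokens, one time
ratio = repeat / once                                 # 12.5
```

Under these assumptions the repeat-upload pattern processes 12.5 times as much text in the first month alone, and the gap widens every month after, because the ingest cost is not paid again.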
The second failure mode is quieter. Drop a 200-page contract into a chat and the model has to either truncate it or paginate over it. Either way, parts of the document quietly fall out of context, and the model starts hallucinating the parts it can’t see. Laminae sends only the chunks that match the question, so the context budget goes to reasoning, not to padding the prompt with text the model is going to drop anyway.
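One way to picture a context budget is as a greedy packing problem: keep the best-matching chunks until the budget is spent. The sketch below is an illustration of that idea, not a description of how Laminae or any particular client selects chunks. Counting words with `split()` is a crude stand-in for a real tokenizer, and the scores are assumed to come from a retrieval step.

```python
def pack_context(scored_chunks: list[tuple[float, str]], budget: int) -> list[str]:
    """Greedily keep the highest-scoring chunks whose combined word count fits the budget."""
    selected = []
    used = 0
    ranked = sorted(scored_chunks, key=lambda item: item[0], reverse=True)
    for _score, text in ranked:
        cost = len(text.split())  # rough proxy for token count
        if used + cost <= budget:
            selected.append(text)
            used += cost
    return selected
```

With a whole contract pasted into the prompt there is nothing to choose between; with scored chunks, whatever does not fit is at least the least relevant material.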
Why one bucket per concept beats one giant index
The instinct, when teams first hear “vector database”, is to pour everything in. One index, one search, one mental model. It feels tidy, but isn’t.
Embedding-space proximity is blunt. A paragraph about parental leave from your HR handbook lives near a paragraph about “extended absence” in your customer-support playbook, because those phrases share meaning even though they belong to entirely different domains. Search the giant index and both come back. Re-rankers can sometimes catch this, but they can’t fix the input set being polluted from the start — the noise is already in the candidate pool.
Worse, traversal goes wrong in large knowledge graphs. Ask about a single project, and the search wanders into adjacent projects with overlapping vocabulary. The answer comes back confidently mixing facts from three places, none of which you intended to ask about. That’s the failure mode that’s hardest to detect, because the answer reads like it should be right.
Scoping the retrieval surface up front avoids both. One bucket per concept or project, each with its own MCP endpoint, each searched independently. The customer-support assistant queries the support bucket; the HR assistant queries the HR bucket; the board minutes live in their own bucket and nobody accidentally surfaces them in an unrelated conversation. The retrieval surface is bounded by the question’s domain, and accuracy goes up because the input set was never polluted in the first place.
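The difference between one giant index and scoped buckets can be shown with a toy search. This is a sketch under loud assumptions: word-overlap cosine similarity stands in for real embeddings, and the `BUCKETS` contents and the `search` function are invented for the example.

```python
import math
from collections import Counter

# Hypothetical bucket contents, for illustration only.
BUCKETS = {
    "hr": ["parental leave covers extended absence for new parents"],
    "support": ["extended absence of a customer reply closes the ticket"],
}


def vectorise(text: str) -> Counter:
    """Bag-of-words vector; a crude stand-in for an embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, bucket_names: list[str], top_k: int = 2) -> list[tuple[str, str]]:
    """Rank documents from the named buckets only; other buckets are never candidates."""
    q = vectorise(query)
    scored = [
        (cosine(q, vectorise(doc)), name, doc)
        for name in bucket_names
        for doc in BUCKETS[name]
    ]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, doc) for _score, name, doc in scored[:top_k]]


giant = search("extended absence policy", ["hr", "support"])  # both domains match
scoped = search("extended absence policy", ["hr"])            # only the HR bucket is searched
```

Searching everything returns the support playbook alongside the HR handbook because both mention "extended absence". Scoping the search to the HR bucket removes the support document from the candidate pool before any ranking happens, which is the step a re-ranker cannot undo after the fact.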
What you get
- Per-bucket access control
- PDF, DOCX, CSV, JSON, MD
- Confluence connector
- MCP endpoint per bucket
- Claude Desktop · Cursor · Continue
- Source-attributed responses
Your AI client gets smarter on your own data. We don’t ship a chat UI; we make yours useful.
- Standard (MCP): Open
- SDK to install: No
- Your own AI client: Bring
- Chat UI shipped: No