Build vs. buy · Agent memory
Build your own retriever.
Or don't.
Our benchmark settled one thing: your agent needs memory. Answer accuracy went from 0% without it to ~75% with it. You could match us on simple lookups by building keyword RAG yourself. The question is whether you want to build, tune, host, and maintain that, or drop in a managed memory layer that also does what RAG can't.
Where MMPM sits: the L2 cache for AI
Your vector database is main memory. MMPM is the L2 cache in front of it: the fast, predictive, verifiable tier that keeps the right context warm before your agent asks.
Build your own RAG, or use MMPM
| Capability | Build-your-own RAG (you own it) | MMPM (managed) |
|---|---|---|
| Simple keyword lookup | Yes | Yes — 100% |
| Multi-hop recall (answer shares no words with the question) | No — 0% in test | Only arm that answered any |
| Verifiable provenance (Merkle proofs) | No | Every atom |
| Knowledge-graph edges (relationships, not chunks) | No | Built in |
| Conflict detection (stale facts flagged) | No | Built in |
| Cross-session persistence | You build & host it | Built in |
| MCP-native — drops into your agent | You wire it | One endpoint |
| Who builds, tunes, hosts & maintains it | You | Managed for you |
| Cost | Engineering time + infra | From $5/mo |
The honest row is the first one: on simple keyword lookups, a RAG you build can match MMPM. Every row below it is what you'd still be missing — or still be maintaining.
What you're paying for
Managed
The retriever you don't build
No chunking, embeddings, vector database, or ops to run. MMPM drops into your agent over a single MCP endpoint — the memory layer is someone else's problem to keep alive.
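As a rough illustration of what "a single MCP endpoint" means on the wire: MCP clients invoke server tools with JSON-RPC 2.0 `tools/call` requests. The sketch below only builds such a request. The tool name `memory_search` and its arguments are hypothetical placeholders, since this page does not list MMPM's actual tool names.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and arguments, used for illustration only.
request = build_tool_call(
    1, "memory_search", {"query": "where is the app deployed?", "limit": 5}
)
print(request)
```

In practice an MCP-capable agent framework sends this message for you; the point is that the integration surface is one request shape rather than a chunking and embedding pipeline.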
Verifiable
Every memory, provable
Each atom is sealed in an RFC 6962 Merkle tree — tamper-evident and auditable. You can prove what your agent knew, and when. Keyword retrieval can't offer that.
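To make "provable" concrete, here is a minimal sketch of RFC 6962-style Merkle hashing and inclusion-proof checking. It follows the RFC's hashing rules (a `0x00` prefix for leaves, `0x01` for interior nodes, and a split at the largest power of two below the tree size). The atom contents and function names are assumptions for illustration; this is not MMPM's implementation or API.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962: leaf hashes are domain-separated with a 0x00 prefix.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962: interior node hashes use a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    # Largest power of two strictly less than n (requires n >= 2).
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: list[bytes]) -> bytes:
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    k = _split(n)
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(index: int, leaves: list[bytes]) -> list[bytes]:
    # Sibling hashes from the leaf up to the root (lowest level first).
    n = len(leaves)
    if n <= 1:
        return []
    k = _split(n)
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def _root_from_path(index: int, size: int, leaf: bytes, path: list[bytes]) -> bytes:
    if size == 1:
        if path:
            raise ValueError("path too long")
        return leaf_hash(leaf)
    if not path:
        raise ValueError("path too short")
    k = _split(size)
    sibling, rest = path[-1], path[:-1]
    if index < k:
        return node_hash(_root_from_path(index, k, leaf, rest), sibling)
    return node_hash(sibling, _root_from_path(index - k, size - k, leaf, rest))


def verify_inclusion(index: int, size: int, leaf: bytes,
                     path: list[bytes], root: bytes) -> bool:
    # Recompute the root from the leaf and its audit path, then compare.
    if not 0 <= index < size:
        return False
    try:
        return _root_from_path(index, size, leaf, list(path)) == root
    except ValueError:
        return False


# Hypothetical atoms, for illustration only.
atoms = [b"atom-0", b"atom-1", b"atom-2", b"atom-3", b"atom-4"]
root = merkle_root(atoms)
proof = audit_path(2, atoms)
print(verify_inclusion(2, len(atoms), atoms[2], proof, root))
print(verify_inclusion(2, len(atoms), b"tampered", proof, root))
```

A verifier needs only the atom, its index, the tree size, the audit path, and a trusted root. It never needs the other atoms, which is what makes independent verification practical.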
Predictive
The right context before you ask
Markov spreading activation surfaces facts that share no words with your query — the one capability that beat keyword retrieval in our benchmark, and the reason memory is more than search.
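The page does not describe MMPM's algorithm in detail, so the following is only a generic sketch of spreading activation over a weighted graph, with edge weights normalized per node like Markov transition probabilities. It shows how a fact that shares no words with the query can still be reached through edges. The graph, atom names, `decay`, and `steps` parameters are all hypothetical.

```python
def spread_activation(graph: dict[str, dict[str, float]],
                      seeds: dict[str, float],
                      steps: int = 2,
                      decay: float = 0.5) -> dict[str, float]:
    """Propagate activation from seed nodes along weighted edges."""
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier: dict[str, float] = {}
        for node, energy in frontier.items():
            edges = graph.get(node, {})
            total = sum(edges.values())
            if total == 0:
                continue
            for neighbor, weight in edges.items():
                # Each neighbor receives a share proportional to its
                # normalized edge weight, damped by the decay factor.
                share = decay * energy * weight / total
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + share
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Hypothetical atoms: the query matches only the first one by keyword.
graph = {
    "app_deploys_to_blue": {"blue_runs_in_eu_west": 1.0},
    "blue_runs_in_eu_west": {"eu_west_is_dublin": 1.0},
}
scores = spread_activation(graph, {"app_deploys_to_blue": 1.0})
for atom, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(atom, score)
```

Here a query about the app's deployment seeds only the first atom, yet the two-hop fact `eu_west_is_dublin` still receives activation and can be surfaced. A pure keyword match would never rank it.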
One managed layer, priced to scale
Every tier ships Merkle proofs, Markov prediction, knowledge-graph edges, and MCP-native access.
The evidence behind the claims
A controlled retrieval + answer benchmark (Opus 4.8) on our real 3,716-fact production substrate. Deterministic and reproducible.
Answer accuracy with no memory (or a recency prompt) vs. with MMPM-retrieved context. The no-context control scored 0%, proving answers come from retrieval, not the model.
On questions whose answer shares no words with the query, keyword RAG scored 0/18. MMPM was the only method to answer any (directional; small sample).
On direct keyword lookups (n=48), MMPM and keyword RAG both answered 100%. We report where the baseline wins — it's what makes the rest credible.
Retrieval-side: a recency-maintained prompt surfaced the needed fact 0 of 48 times even at a 32,000-token budget; MMPM surfaced it using about 500 tokens, roughly 1.6% of that budget.
Questions people ask
Can't I just build this with a vector database?
For simple keyword lookups, yes. In our benchmark, keyword retrieval tied MMPM at 100%. But that's a retriever you build, tune, host, and keep alive, and it still can't do multi-hop recall or give you Merkle-verifiable provenance, a knowledge graph, or conflict detection. MMPM is all of that, managed, from $5/mo.
So does MMPM actually beat RAG?
On simple keyword lookups, no — it's a tie (both answered 100%). We say that plainly. MMPM's edge is threefold: multi-hop recall (it was the only method to answer any multi-hop question in our test), cryptographic verifiability, and the fact that it isn't your team's problem to operate.
Is the benchmark run on a real system?
Yes — on our own production substrate, the same one we run our SaaS on, hardened across many revisions. Not a toy corpus. The numbers are deterministic and reproducible; the harness, probes, and seeds are in the repo.
Why did the no-memory baseline score 0%?
The facts are private to the substrate, so the model can't know them from training. With no retrieval it correctly refuses rather than guessing — which is exactly why any score above zero is attributable to the memory layer, not the model.
What do I actually get at each price?
Every tier ships the differentiators (Merkle proofs, Markov prediction, knowledge-graph edges, and MCP-native access) and scales on atoms and infrastructure: Starter ($5) and Solo ($9) on shared infrastructure, Professional ($29, most popular) and Team ($79) on dedicated infrastructure, and Enterprise on custom or self-hosted infrastructure.
Skip the retriever. Keep the memory.
Verifiable, connected, and predictive memory behind one MCP endpoint — from $5/mo.