Building Zero-Cost Semantic Search for OpenClaw with QMD
The Problem
OpenClaw has a killer feature called memory_search - semantic search across your agent's memory files. The catch used to be pretty simple: if you wanted better-than-keyword recall without paying for embeddings, you had to bolt something on yourself.
That's not acceptable for a self-hosted AI agent. If I'm running local, I want everything local. The good news in April 2026: OpenClaw now supports both a builtin memory engine with a local provider and a full QMD backend.
The Solution: QMD
Enter QMD - a semantic search tool that runs entirely on your machine:
- ✅ BM25 keyword search (instant, traditional search)
- ✅ Vector embeddings (semantic search, understands meaning)
- ✅ Hybrid reranking + query expansion (when you use qmd query)
- ✅ Zero API costs (all computation local)
- ✅ Works as OpenClaw's memory sidecar (not just a standalone CLI anymore)
It's exactly what we needed.
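To make "hybrid" concrete: the general idea is to run a keyword ranking and a vector ranking, then fuse them. Below is a minimal, illustrative Python sketch using reciprocal rank fusion (RRF) - this is not QMD's actual implementation (which also involves a reranker model and query expansion), just the core fusion idea:

```python
# Illustrative only: fusing a keyword (BM25-style) ranking and a
# vector-similarity ranking with reciprocal rank fusion (RRF).
# Doc IDs and rankings here are made up for the example.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked doc-id lists into one combined ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Earlier positions contribute larger scores; k damps the
            # difference between adjacent ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["notes/search.md", "notes/index.md", "notes/cli.md"]
vector_hits = ["notes/meaning.md", "notes/search.md", "notes/index.md"]

fused = rrf([bm25_hits, vector_hits])
print(fused[0])  # prints "notes/search.md" - the doc both rankings agree on
```

Documents that appear high in both rankings float to the top, which is why hybrid search tends to beat either mode alone.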
Installation Journey
1. Install QMD
QMD is now published as @tobilu/qmd and runs on Node or Bun:
npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

On first real use, QMD auto-downloads its GGUF models. If you use the full hybrid stack, budget roughly 2 GB total, not a tiny embeddings-only download.
Models installed:
- embeddinggemma-300M-Q8_0 (~300 MB, embeddings)
- qwen3-reranker-0.6b-q8_0 (~640 MB, reranking)
- qmd-query-expansion-1.7B-q4_k_m (~1.1 GB, query expansion)
2. Point OpenClaw at QMD
QMD is an official memory backend now. Enable it in openclaw.json:
{
  "memory": {
    "backend": "qmd",
    "qmd": {
      "searchMode": "query",
      "paths": [{ "name": "docs", "path": "~/notes", "pattern": "**/*.md" }]
    }
  }
}

The key names that matter today live under memory.qmd: searchMode, includeDefaultMemory, paths, and sessions.enabled.
3. Let OpenClaw Build the Index
You no longer need to hand-wire collections just to make memory_search useful. OpenClaw creates a managed QMD home under ~/.openclaw/agents/<agentId>/qmd/.
openclaw memory status
openclaw memory index --force

First boot and first search can be slow because QMD may still be downloading models and building embeddings. After that, OpenClaw keeps the index refreshed in the background.
Testing & Reality Check
✅ What Works
Keyword search (instant):
qmd search "semantic search" -n 5

Semantic search (understands meaning):

qmd vsearch "finding information by meaning" -n 5

Both work beautifully. Fast, relevant, local.
⚠️ The Real Gotcha
QMD still needs indexing work somewhere. If you run QMD standalone, you still care about qmd update and qmd embed. But when OpenClaw owns QMD as the backend, it already does that for you on boot and on a periodic refresh loop.
qmd update && qmd embed

Translation: manual reindexing is still the right move for ad-hoc CLI collections, but it's no longer the whole OpenClaw story.
🤔 OpenClaw Integration Status
This is the biggest thing that changed since the original setup notes: OpenClaw now officially supports QMD as a memory backend. Set memory.backend = "qmd" and memory_search uses the QMD sidecar.
This means:
- QMD works perfectly via CLI
- OpenClaw's memory_search can use QMD with no embedding API key
- Session transcripts can be indexed too with memory.qmd.sessions.enabled
- If QMD fails, OpenClaw falls back to the builtin memory engine
So the old "QMD works, but OpenClaw doesn't actually use it" warning is stale now.
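For transcript indexing, the snippet below shows one plausible shape. The source only names the memory.qmd.sessions.enabled key, so the exact nesting beyond that is an assumption:

```json
{
  "memory": {
    "backend": "qmd",
    "qmd": {
      "sessions": { "enabled": true }
    }
  }
}
```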
Daily Workflow
There are really two workflows now:
- Standalone QMD CLI: run qmd update and qmd embed when your collections change
- OpenClaw + QMD backend: check health with openclaw memory status and force a rebuild with openclaw memory index --force when needed
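If you're on the standalone path and don't want to remember the refresh step, one option is a crontab entry. This is just a sketch (it assumes qmd is on cron's PATH and that your collections live where the cron user can see them):

```
# Refresh standalone QMD collections every 30 minutes
*/30 * * * * qmd update && qmd embed
```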
Commands Cheat Sheet
# Update index (run after file changes!)
qmd update && qmd embed
# Keyword search
qmd search "query" -n 5
# Semantic search
qmd vsearch "query" -n 5
# Best-quality hybrid search
qmd query "query"
# Check status
qmd status
# OpenClaw-managed memory health
openclaw memory status
openclaw memory index --force
# View collection info
qmd collection list

Why This Matters
Self-hosted AI should be self-sufficient. Depending on external APIs for core features (like memory search) defeats the purpose. QMD gives you:
- Privacy - your data never leaves your machine
- Cost - zero API fees, ever
- Control - no rate limits, no downtime, no ToS changes
- Speed - local inference is fast
What's Next
- Benchmark search vs vsearch vs query for your actual corpus
- Turn on transcript indexing if you want past chats in recall
- Try a multilingual embedding override with QMD_EMBED_MODEL if your notes aren't English-heavy
- Use per-collection models or AST chunking if you're indexing code-heavy docs
TL;DR
Want semantic search for your OpenClaw agent without API keys?
- Install QMD: npm install -g @tobilu/qmd
- Set memory.backend = "qmd" in openclaw.json
- Optionally add extra collections with memory.qmd.paths
- Check indexing with openclaw memory status
- Search with memory_search in-agent or qmd query "your query" on the CLI
If you're using QMD standalone, remember qmd update && qmd embed. If OpenClaw owns the backend, it handles refreshes for you.