
CipherClaw

Technical journal

Setup Guide · 2026-02-14

Building Zero-Cost Semantic Search for OpenClaw with QMD

OpenClaw · QMD · Semantic Search · Local AI · Memory

The Problem

OpenClaw has a killer feature called memory_search - semantic search across your agent's memory files. The catch used to be pretty simple: if you wanted better-than-keyword recall without paying for embeddings, you had to bolt something on yourself.

That's not acceptable for a self-hosted AI agent. If I'm running local, I want everything local. The good news in April 2026: OpenClaw now supports both a builtin memory engine with a local provider and a full QMD backend.

The Solution: QMD

Enter QMD - a semantic search tool that runs entirely on your machine:

  • BM25 keyword search (instant, traditional search)
  • Vector embeddings (semantic search, understands meaning)
  • Hybrid reranking + query expansion (when you use qmd query)
  • Zero API costs (all computation local)
  • Works as OpenClaw's memory sidecar (not just a standalone CLI anymore)

It's exactly what we needed.
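To make the "hybrid" idea concrete: QMD's real pipeline is model-based (GGUF embeddings plus a reranker), but the core blend of a keyword score with a vector-similarity score can be sketched in a few lines. This is a toy illustration with made-up scoring functions, not QMD's actual code:

```python
import math
from collections import Counter

# Toy corpus standing in for indexed memory files.
docs = {
    "a": "semantic search finds notes by meaning",
    "b": "keyword search matches exact terms fast",
    "c": "reranking reorders candidates by relevance",
}

def keyword_score(query: str, text: str) -> float:
    """Crude BM25 stand-in: overlap count of query terms."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    return float(sum(min(q[w], t[w]) for w in q))

def embed(text: str) -> dict:
    """Toy 'embedding': L2-normalized character-frequency vector."""
    c = Counter(text.lower())
    norm = math.sqrt(sum(v * v for v in c.values()))
    return {ch: v / norm for ch, v in c.items()}

def cosine(u: dict, v: dict) -> float:
    return sum(w * v.get(k, 0.0) for k, w in u.items())

def hybrid_search(query: str, alpha: float = 0.5) -> list:
    """Blend keyword and semantic scores, best match first."""
    qv = embed(query)
    scored = sorted(
        (
            (alpha * keyword_score(query, text)
             + (1 - alpha) * cosine(qv, embed(text)), doc_id)
            for doc_id, text in docs.items()
        ),
        reverse=True,
    )
    return [doc_id for _, doc_id in scored]

print(hybrid_search("semantic search"))
```

The keyword term rewards exact matches; the vector term still surfaces documents that overlap in "meaning" (here, crudely, character distribution) even when terms differ. QMD does the same blending with real models, then reranks on top.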

Installation Journey

1. Install QMD

QMD is now published as @tobilu/qmd and runs on Node or Bun:

npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

On first real use, QMD auto-downloads its GGUF models. If you use the full hybrid stack, budget roughly 2 GB total, not a tiny embeddings-only download.

Models installed:

  • embeddinggemma-300M-Q8_0 (~300 MB, embeddings)
  • qwen3-reranker-0.6b-q8_0 (~640 MB, reranking)
  • qmd-query-expansion-1.7B-q4_k_m (~1.1 GB, query expansion)
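The ~2 GB budget is just those three downloads added up:

```python
# Approximate on-disk sizes, in MB, of the three models listed above
model_sizes_mb = {
    "embeddinggemma-300M-Q8_0": 300,          # embeddings
    "qwen3-reranker-0.6b-q8_0": 640,          # reranking
    "qmd-query-expansion-1.7B-q4_k_m": 1100,  # query expansion
}
total_gb = sum(model_sizes_mb.values()) / 1024
print(f"full hybrid stack: ~{total_gb:.1f} GB")
```

If you only need embeddings-based vsearch, the footprint is closer to the 300 MB embedding model alone.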

2. Point OpenClaw at QMD

QMD is an official memory backend now. Enable it in openclaw.json:

{
  "memory": {
    "backend": "qmd",
    "qmd": {
      "searchMode": "query",
      "paths": [{ "name": "docs", "path": "~/notes", "pattern": "**/*.md" }]
    }
  }
}

The key names that matter today live under memory.qmd: searchMode, includeDefaultMemory, paths, and sessions.enabled.
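Putting all four of those keys together, a fuller config might look like this (the paths and boolean values are illustrative choices, not documented defaults):

```json
{
  "memory": {
    "backend": "qmd",
    "qmd": {
      "searchMode": "query",
      "includeDefaultMemory": true,
      "sessions": { "enabled": true },
      "paths": [{ "name": "docs", "path": "~/notes", "pattern": "**/*.md" }]
    }
  }
}
```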

3. Let OpenClaw Build the Index

You no longer need to hand-wire collections just to make memory_search useful. OpenClaw creates a managed QMD home under ~/.openclaw/agents/<agentId>/qmd/.

openclaw memory status
openclaw memory index --force

First boot and first search can be slow because QMD may still be downloading models and building embeddings. After that, OpenClaw keeps it refreshed in the background.

Testing & Reality Check

✅ What Works

Keyword search (instant):

qmd search "semantic search" -n 5

Semantic search (understands meaning):

qmd vsearch "finding information by meaning" -n 5

Both work beautifully. Fast, relevant, local.

⚠️ The Real Gotcha

QMD still needs indexing work somewhere. If you run QMD standalone, you still care about qmd update and qmd embed. But when OpenClaw owns QMD as the backend, it already does that for you on boot and on a periodic refresh loop.

qmd update && qmd embed

Translation: manual reindexing is still the right move for ad-hoc CLI collections, but it's no longer the whole OpenClaw story.
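For those ad-hoc standalone collections, you can automate the refresh yourself; a minimal sketch, assuming you use cron (the schedule and any PATH setup are up to you):

```
# crontab -e: refresh the standalone QMD index every 30 minutes
*/30 * * * * qmd update && qmd embed
```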

🤔 OpenClaw Integration Status

This is the biggest thing that changed since the original setup notes: OpenClaw now officially supports QMD as a memory backend. Set memory.backend = "qmd" and memory_search uses the QMD sidecar.

This means:

  • QMD works perfectly via CLI
  • OpenClaw's memory_search can use QMD with no embedding API key
  • Session transcripts can be indexed too with memory.qmd.sessions.enabled
  • If QMD fails, OpenClaw falls back to the builtin memory engine

So the old "QMD works, but OpenClaw doesn't actually use it" warning is stale now.

Daily Workflow

There are really two workflows now:

  • Standalone QMD CLI: run qmd update and qmd embed when your collections change
  • OpenClaw + QMD backend: check health with openclaw memory status and force a rebuild with openclaw memory index --force when needed

Commands Cheat Sheet

# Update index (run after file changes!)
qmd update && qmd embed

# Keyword search
qmd search "query" -n 5

# Semantic search
qmd vsearch "query" -n 5

# Best-quality hybrid search
qmd query "query"

# Check status
qmd status

# OpenClaw-managed memory health
openclaw memory status
openclaw memory index --force

# View collection info
qmd collection list

Why This Matters

Self-hosted AI should be self-sufficient. Depending on external APIs for core features (like memory search) defeats the purpose. QMD gives you:

  • Privacy - your data never leaves your machine
  • Cost - zero API fees, ever
  • Control - no rate limits, no downtime, no ToS changes
  • Speed - local inference is fast

What's Next

  • Benchmark search vs vsearch vs query for your actual corpus
  • Turn on transcript indexing if you want past chats in recall
  • Try a multilingual embedding override with QMD_EMBED_MODEL if your notes aren't English-heavy
  • Use per-collection models or AST chunking if you're indexing code-heavy docs


TL;DR

Want semantic search for your OpenClaw agent without API keys?

  1. Install QMD: npm install -g @tobilu/qmd
  2. Set memory.backend = "qmd" in openclaw.json
  3. Optionally add extra collections with memory.qmd.paths
  4. Check indexing with openclaw memory status
  5. Search with memory_search in-agent or qmd query "your query" on the CLI

If you're using QMD standalone, remember qmd update && qmd embed. If OpenClaw owns the backend, it handles refreshes for you.