ClawMem is an open-source memory engine for Claude Code and AI agents. It runs on-device, giving agents persistent and searchable memory that survives across sessions, compactions, and runtime boundaries. The system integrates via Claude Code hooks, an MCP server (works with any MCP-compatible client including OpenClaw), or a native OpenClaw memory plugin (kind: memory, v0.10.0+).
Agent (Claude Code / OpenClaw / any MCP client)
│
├── Hooks (automatic, ~90% of retrieval)
│ ├── context-surfacing → injects <vault-context> on every prompt
│ ├── decision-extractor → extracts observations after each response
│ ├── handoff-generator → summarizes sessions for continuity
│ ├── feedback-loop → reinforces referenced memories
│ ├── precompact-extract → preserves state before context compaction
│ ├── curator-nudge → surfaces maintenance suggestions
│ └── postcompact-inject → re-injects authoritative context after compaction
│
├── MCP Tools (agent-initiated, ~10%)
│ ├── memory_retrieve → auto-routing entry point
│ ├── query / search / vsearch / intent_search / query_plan
│ ├── get / multi_get / find_similar / find_causal_links / timeline
│ ├── memory_pin / memory_snooze / memory_forget
│ └── lifecycle / vault / maintenance tools
│
└── REST API (optional, for non-MCP clients)
├── POST /retrieve → mirrors memory_retrieve
├── POST /search → direct mode selection
└── GET/POST endpoints → documents, lifecycle, graphs
│
▼
SQLite Vault (WAL mode)
├── documents + content (FTS5 index)
├── vectors (vec0 extension)
├── memory_relations (causal, semantic, temporal, supporting edges)
├── sessions + usage tracking
└── hook_dedupe (heartbeat suppression)
│
▼
GPU Services (llama-server)
├── :8088 — Embedding (EmbeddingGemma-300M default, zembed-1 SOTA upgrade)
├── :8089 — LLM (qmd-query-expansion-1.7B)
└── :8090 — Reranker (qwen3-reranker-0.6B default, zerank-2 SOTA upgrade)
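For non-MCP clients, the REST layer mirrors memory_retrieve. Below is a minimal sketch of building the request body; the field names (query, compact, limit) and the port are assumptions for illustration, not the documented ClawMem schema:

```typescript
// Hypothetical payload shape for POST /retrieve — field names are assumptions,
// not taken from the ClawMem API reference.
interface RetrieveRequest {
  query: string;     // natural-language query
  compact?: boolean; // ask for summaries first, full content on demand
  limit?: number;    // maximum number of results
}

function buildRetrieveRequest(query: string): RetrieveRequest {
  return { query, compact: true, limit: 10 };
}

const body = JSON.stringify(buildRetrieveRequest("auth architecture decisions"));
// Then POST it to a local `clawmem serve` instance, e.g.:
// await fetch("http://localhost:PORT/retrieve", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body,
// });
console.log(body);
```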
compact=true), then fetch full content only when needed. This minimizes context window usage.

clawmem bootstrap gets you running. Multi-vault, cloud embedding, and profiles are opt-in.

A filesystem-based context engine is only as useful as what you put into the filesystem. If your vault contains a handful of sparse memory files, the retrieval pipeline has little to work with: BM25 won't find relevant terms, vector search has few neighbors to compare, and graph traversal has no edges to follow.
The agents that get the most from ClawMem are the ones with rich, diverse collections. Index broadly:
| Content type | Example | Why it helps |
|---|---|---|
| Memory files | MEMORY.md, session logs | Captures what happened and what was decided |
| Research outputs | Analysis notes, comparison docs | Gives the agent domain knowledge it would otherwise lack |
| Decision records | Architecture decisions, tradeoffs | Lets the agent understand why, not just what |
| Learnings and antipatterns | Post-mortems, things to avoid | Prevents repeated mistakes across sessions |
| Domain expertise | Reference docs, runbooks, SOPs | Provides stable context that rarely changes |
| Project notes | Status updates, meeting notes, specs | Keeps the agent current on project state |
A practical starting point: configure each project collection to index every .md file in the project (pattern: "**/*.md"). The composite scoring system handles the rest — decisions and hubs never decay, progress notes fade after 45 days, and the quality multiplier rewards well-structured documents over flat text dumps.
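The decay behavior described above can be sketched as a simple age-based factor. The linear fade and the type names are assumptions for illustration, not ClawMem's actual scoring formula:

```typescript
// Sketch of score decay under stated assumptions: decisions and hubs never
// decay; progress notes fade over 45 days (the horizon the docs mention).
// The linear shape of the fade is an assumption, not ClawMem's formula.
type DocKind = "decision" | "hub" | "progress";

function decayFactor(kind: DocKind, ageDays: number): number {
  if (kind === "decision" || kind === "hub") return 1; // never decay
  const horizonDays = 45;                              // progress notes fade after 45 days
  return Math.max(0, 1 - ageDays / horizonDays);       // assumed linear fade to zero
}
```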
How you write documents affects how well they score. The quality multiplier ranges from 0.7x (penalty) to 1.3x (boost) based on headings, lists, decision keywords, and frontmatter. Five well-structured decision documents with clear headings will consistently outscore fifty single-paragraph notes.
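A rough sketch of how such a multiplier could be computed from the four signals the docs name (headings, lists, decision keywords, frontmatter). The individual weights and checks are assumptions; only the 0.7x–1.3x range comes from the docs:

```typescript
// Hedged sketch: a quality multiplier in [0.7, 1.3] driven by the structural
// signals described above. Weights are illustrative, not ClawMem's internals.
function qualityMultiplier(md: string): number {
  let score = 0.7;                                                 // baseline penalty
  if (/^#{1,6}\s/m.test(md)) score += 0.2;                         // has markdown headings
  if (/^\s*[-*]\s/m.test(md)) score += 0.15;                       // has bullet lists
  if (/\b(decided|decision|tradeoff)\b/i.test(md)) score += 0.15;  // decision keywords
  if (/^---\n/.test(md)) score += 0.1;                             // YAML frontmatter
  return Math.min(1.3, score);                                     // cap at the boost ceiling
}
```

A flat single-paragraph note scores the 0.7x floor, while a document with frontmatter, headings, lists, and decision language reaches the 1.3x ceiling.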
ClawMem indexes prose, not code. Source files (.ts, .py, .go, etc.) are excluded by design — BM25 and vector models trained on natural language perform poorly on code syntax, and code retrieval needs AST-aware, symbol-level tools (call graphs, definitions, references). Document your technical decisions and architecture in markdown. Let the code live in version control and use a dedicated code search tool for code retrieval.
| Runtime | Integration | How | Services needed |
|---|---|---|---|
| Claude Code | Hooks + MCP stdio | clawmem setup hooks + clawmem setup mcp | watcher + embed timer |
| OpenClaw | Memory plugin + REST API | clawmem setup openclaw (requires OpenClaw v2026.4.11+) | watcher + embed timer + clawmem serve |
| Any MCP client | MCP stdio | Add to MCP config | watcher + embed timer |
| Web / scripts | REST API | clawmem serve | watcher + embed timer + clawmem serve |
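For the "Any MCP client" row, a stdio server entry might look like the following. The command name and arguments here are assumptions — consult your ClawMem install for the exact invocation — and the surrounding JSON shape follows the common mcpServers convention:

```json
{
  "mcpServers": {
    "clawmem": {
      "command": "clawmem",
      "args": ["mcp"]
    }
  }
}
```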
All integrations share the same SQLite vault. The watcher keeps the index fresh, the embed timer maintains vector embeddings, and the REST API serves OpenClaw agent tools. GPU servers are optional: node-llama-cpp provides an in-process fallback (Metal on Apple Silicon, Vulkan where available, CPU as a last resort). Retrieval is fast with GPU acceleration and significantly slower on CPU-only machines. The curator agent handles periodic maintenance on demand.
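The fallback order above can be sketched as a simple priority chain. The function and type names are illustrative, not ClawMem's internal API:

```typescript
// Hedged sketch of the documented fallback order: external llama-server first,
// then in-process Metal, then Vulkan, then CPU as a last resort.
type Backend = "llama-server" | "metal" | "vulkan" | "cpu";

function pickBackend(opts: { serverUp: boolean; metal: boolean; vulkan: boolean }): Backend {
  if (opts.serverUp) return "llama-server"; // prefer the dedicated GPU services
  if (opts.metal) return "metal";           // Apple Silicon in-process acceleration
  if (opts.vulkan) return "vulkan";         // Vulkan where available
  return "cpu";                             // slowest, always available
}
```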