
mcp-memory-service

by: doobidoo

MCP server providing semantic memory and persistent storage capabilities for Claude using ChromaDB and sentence transformers.

167 Β· created 26/12/2024

πŸ“ŒOverview

Purpose: The MCP Memory Service aims to provide a robust framework for semantic memory and persistent storage, enhancing conversation contexts and maintaining continuity in applications like Claude Desktop.

Overview: This service integrates with ChromaDB and sentence transformers to enable users to manage long-term memory effectively. It offers advanced capabilities such as semantic search, time-based memory recall, and optimized database management, catering to diverse platform requirements.

Key Features:

  • Semantic Search: Utilizes sentence transformers for efficient and contextual memory retrieval.

  • Natural Language Time-Based Recall: Allows users to retrieve memories using intuitive time references (e.g., "last week").

  • Persistent Storage: Employs ChromaDB to ensure reliable long-term memory preservation, integrated with automatic backup functionality.

  • Cross-Platform Compatibility: Supports various operating systems including macOS, Windows, and Linux with tailored optimizations.

  • Memory Management Tools: Equipped with features like duplicate detection, health monitoring, and memory optimization to maintain storage efficiency.


MCP Memory Service

An MCP server providing semantic memory and persistent storage capabilities for Claude Desktop using ChromaDB and sentence transformers. This service enables long-term memory storage with semantic search capabilities, ideal for maintaining context across conversations and instances.

Features

  • Semantic search using sentence transformers
  • Natural language time-based recall (e.g., "last week", "yesterday morning")
  • Tag-based memory retrieval system
  • Persistent storage using ChromaDB
  • Automatic database backups
  • Memory optimization tools
  • Exact match retrieval
  • Debug mode for similarity analysis
  • Database health monitoring
  • Duplicate detection and cleanup
  • Customizable embedding model
  • Cross-platform compatibility (Apple Silicon, Intel, Windows, Linux)
  • Hardware-aware optimizations for different environments
  • Graceful fallbacks for limited hardware resources

Quick Start

For the fastest way to get started:

# Install UV if not already installed
pip install uv

# Clone and install
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
uv pip install -e .

# Run the service
uv run memory

Docker and Smithery Integration

Docker Usage

Run the service in a Docker container for isolation and deployment:

# Build the Docker image
docker build -t mcp-memory-service .

# Run the container (example for macOS with proper paths)
docker run -it \
  -v $HOME/mcp-memory/chroma_db:/app/chroma_db \
  -v $HOME/mcp-memory/backups:/app/backups \
  mcp-memory-service

# For production (detached mode)
docker run -d \
  -v $HOME/mcp-memory/chroma_db:/app/chroma_db \
  -v $HOME/mcp-memory/backups:/app/backups \
  --name mcp-memory \
  mcp-memory-service

To configure Docker's file sharing on macOS:

  1. Open Docker Desktop
  2. Go to Settings (Preferences)
  3. Navigate to Resources -> File Sharing
  4. Add any additional paths you need to share
  5. Click "Apply & Restart"

Smithery Integration

The service is configured for Smithery integration through smithery.yaml, enabling stdio-based communication with MCP clients like Claude Desktop.

Use with Smithery by ensuring your claude_desktop_config.json points to the correct Docker command and environment variables.
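
A minimal sketch of such an entry, assuming the mcp-memory-service image built above and the storage directories created under $HOME/mcp-memory; the paths and flags below are placeholders to adapt to your setup:

{
  "memory": {
    "command": "docker",
    "args": [
      "run", "-i", "--rm",
      "-v", "/absolute/path/to/mcp-memory/chroma_db:/app/chroma_db",
      "-v", "/absolute/path/to/mcp-memory/backups:/app/backups",
      "mcp-memory-service"
    ]
  }
}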

Testing with Claude Desktop

  1. Build the Docker image:
    docker build -t mcp-memory-service .
    
  2. Create directories for persistent storage:
    mkdir -p $HOME/mcp-memory/chroma_db $HOME/mcp-memory/backups
    
  3. Update your Claude Desktop configuration file (claude_desktop_config.json in the appropriate OS location).
  4. Restart Claude Desktop.
  5. Verify initialization message:
    MCP Memory Service initialization completed
    
  6. Test memory by asking Claude to remember and recall information.

If issues occur:

  • Check Claude Desktop console logs.
  • Verify Docker permissions and running container parameters.
  • Try running the container manually to observe errors.

For detailed instructions, see documentation in the docs/ directory.

Configuration

Standard Configuration (Recommended)

Example claude_desktop_config.json for using UV:

{
  "memory": {
    "command": "uv",
    "args": [
      "--directory",
      "your_mcp_memory_service_directory",
      "run",
      "memory"
    ],
    "env": {
      "MCP_MEMORY_CHROMA_PATH": "your_chroma_db_path",
      "MCP_MEMORY_BACKUPS_PATH": "your_backups_path"
    }
  }
}

Windows-Specific Configuration

Use the wrapper script to ensure PyTorch is installed correctly:

{
  "memory": {
    "command": "python",
    "args": [
      "C:\\path\\to\\mcp-memory-service\\memory_wrapper.py"
    ],
    "env": {
      "MCP_MEMORY_CHROMA_PATH": "C:\\Users\\YourUsername\\AppData\\Local\\mcp-memory\\chroma_db",
      "MCP_MEMORY_BACKUPS_PATH": "C:\\Users\\YourUsername\\AppData\\Local\\mcp-memory\\backups"
    }
  }
}

The wrapper script checks and installs PyTorch as needed and runs the memory server.
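
A simplified sketch of that flow (illustrative only; the module path and install command below are assumptions, not the actual memory_wrapper.py):

# Illustrative only - not the real memory_wrapper.py
import importlib.util
import subprocess
import sys

def ensure_pytorch() -> None:
    """Install PyTorch with pip if it is not already importable."""
    if importlib.util.find_spec("torch") is None:
        subprocess.check_call([sys.executable, "-m", "pip", "install", "torch"])

def main() -> None:
    ensure_pytorch()
    # Hand off to the memory server; the module path here is assumed for illustration.
    subprocess.check_call([sys.executable, "-m", "mcp_memory_service.server"])

if __name__ == "__main__":
    main()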

Hardware Compatibility

| Platform | Architecture | Accelerator | Status |
|----------|--------------|-------------|--------|
| macOS | Apple Silicon (M1/M2/M3) | MPS | βœ… Fully supported |
| macOS | Apple Silicon under Rosetta 2 | CPU | βœ… Supported with fallbacks |
| macOS | Intel | CPU | βœ… Fully supported |
| Windows | x86_64 | CUDA | βœ… Fully supported |
| Windows | x86_64 | DirectML | βœ… Supported |
| Windows | x86_64 | CPU | βœ… Supported with fallbacks |
| Linux | x86_64 | CUDA | βœ… Fully supported |
| Linux | x86_64 | ROCm | βœ… Supported |
| Linux | x86_64 | CPU | βœ… Supported with fallbacks |
| Linux | ARM64 | CPU | βœ… Supported with fallbacks |
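
The hardware-aware selection can be sketched roughly as below; this illustrates the idea (prefer CUDA, then Apple MPS, then CPU) rather than the project's actual detection code, and it omits DirectML and ROCm specifics:

# Illustrative device selection; the project's real logic may differ.
import torch

def pick_device() -> str:
    """Prefer CUDA, then Apple MPS, and fall back to CPU."""
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

# sentence-transformers accepts a device string at load time, e.g.:
# SentenceTransformer("<embedding-model-name>", device=pick_device())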

Memory Operations

Core Memory Operations

  • store_memory - Store information with optional tags
  • retrieve_memory - Semantic search for relevant memories
  • recall_memory - Retrieve memories using natural language time expressions
  • search_by_tag - Find memories by specific tags
  • exact_match_retrieve - Find memories with exact content match
  • debug_retrieve - Retrieve memories with similarity scores

For tag storage and management details, see Tag Storage Documentation.
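
As a rough sketch of how these operations can be exercised from code, the example below drives the server over stdio with the MCP Python SDK; the tool names and the store_memory/retrieve_memory argument shapes mirror the curl examples later in this document, but treat the exact schemas and paths as assumptions:

# Hypothetical client example using the MCP Python SDK ("pip install mcp").
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

SERVER = StdioServerParameters(
    command="uv",
    args=["--directory", "/path/to/mcp-memory-service", "run", "memory"],
    env={"MCP_MEMORY_CHROMA_PATH": "/path/to/chroma_db",
         "MCP_MEMORY_BACKUPS_PATH": "/path/to/backups"},
)

async def main() -> None:
    async with stdio_client(SERVER) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Store a memory with a tag, then search for it semantically.
            await session.call_tool(
                "store_memory",
                {"content": "Team stand-up moved to 9:30", "metadata": {"tags": ["meetings"]}},
            )
            result = await session.call_tool(
                "retrieve_memory", {"query": "when is stand-up?", "n_results": 3}
            )
            print(result)

asyncio.run(main())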

Database Management

  • create_backup - Create database backup
  • get_stats - Get memory statistics
  • optimize_db - Optimize database performance
  • check_database_health - Get health metrics
  • check_embedding_model - Verify model status

Memory Management

  • delete_memory - Delete specific memory by hash
  • delete_by_tag - Delete all memories with a specific tag
  • cleanup_duplicates - Remove duplicate entries

Configuration Options

Environment variables to configure the service:

  β€’ CHROMA_DB_PATH: Path to ChromaDB storage
  β€’ BACKUP_PATH: Path for backups
  β€’ AUTO_BACKUP_INTERVAL: Backup interval in hours (default: 24)
  β€’ MAX_MEMORIES_BEFORE_OPTIMIZE: Threshold for auto-optimization (default: 10000)
  β€’ SIMILARITY_THRESHOLD: Similarity threshold (default: 0.7)
  β€’ MAX_RESULTS_PER_QUERY: Max results per query (default: 10)
  β€’ BACKUP_RETENTION_DAYS: Days to keep backups (default: 7)
  β€’ LOG_LEVEL: Logging level (default: INFO)

Hardware-specific:

  β€’ PYTORCH_ENABLE_MPS_FALLBACK: Enable MPS fallback for Apple Silicon (default: 1)
  β€’ MCP_MEMORY_USE_ONNX: Use ONNX Runtime for CPU deployments (default: 0)
  β€’ MCP_MEMORY_USE_DIRECTML: Use DirectML for Windows acceleration (default: 0)
  β€’ MCP_MEMORY_MODEL_NAME: Override the default embedding model
  β€’ MCP_MEMORY_BATCH_SIZE: Override default batch size
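
For example, a local run with custom storage paths and more verbose logging might look like this (variable names as listed above; note that the Claude Desktop examples earlier use the MCP_MEMORY_* prefix for the storage paths):

# Example only - adjust paths to your environment
export CHROMA_DB_PATH="$HOME/mcp-memory/chroma_db"
export BACKUP_PATH="$HOME/mcp-memory/backups"
export LOG_LEVEL=DEBUG
uv run memory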

Getting Help

If you encounter issues:

  1. Check the Troubleshooting Guide.
  2. Review the Installation Guide.
  3. For Windows-specific issues, see the Windows Setup Guide.
  4. Contact the developer via Telegram: t.me/doobeedoo

Project Structure

mcp-memory-service/
β”œβ”€β”€ src/mcp_memory_service/      # Core package code
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ config.py                # Configuration utilities
β”‚   β”œβ”€β”€ models/                  # Data models
β”‚   β”œβ”€β”€ storage/                 # Storage implementations
β”‚   β”œβ”€β”€ utils/                   # Utility functions
β”‚   └── server.py                # Main MCP server
β”œβ”€β”€ scripts/                     # Helper scripts
β”‚   β”œβ”€β”€ convert_to_uv.py         # Script to migrate to UV
β”‚   └── install_uv.py            # UV installation helper
β”œβ”€β”€ .uv/                         # UV configuration
β”œβ”€β”€ memory_wrapper.py            # Windows wrapper script
β”œβ”€β”€ memory_wrapper_uv.py         # UV-based wrapper script
β”œβ”€β”€ uv_wrapper.py                # UV wrapper script
β”œβ”€β”€ install.py                   # Enhanced installation script
└── tests/                       # Test suite

Development Guidelines

  • Use Python 3.10+ with type hints
  • Use dataclasses for models
  • Triple-quoted docstrings for modules and functions
  • Async/await pattern for all I/O operations
  • Follow PEP 8 style guidelines
  • Include tests for new features

License

MIT License - See LICENSE file for details

Acknowledgments

  • ChromaDB team for the vector database
  • Sentence Transformers project for embedding models
  • MCP project for protocol specification

Contact

t.me/doobidoo


Cloudflare Worker Implementation

A serverless implementation of the MCP Memory Service is available using Cloudflare Workers, featuring:

  • Cloudflare D1 (serverless SQLite) for storage
  • Workers AI for embedding generation
  • Server-Sent Events (SSE) for MCP protocol communication
  • No local installation or dependencies required
  • Automatic scaling

Benefits

  • Zero local installation
  • Cross-platform compatibility over the internet
  • Automatic scaling to handle multiple users
  • Global distributed low latency access
  • No maintenance overhead

Available Tools

The Cloudflare Worker supports all tools from the Python implementation:

| Tool | Description |
|------|-------------|
| store_memory | Store new information with optional tags |
| retrieve_memory | Find relevant memories based on query |
| recall_memory | Retrieve memories using natural language time expressions |
| search_by_tag | Search memories by tags |
| delete_memory | Delete a specific memory by its hash |
| delete_by_tag | Delete all memories with a specific tag |
| cleanup_duplicates | Find and remove duplicate entries |
| get_embedding | Get raw embedding vector for content |
| check_embedding_model | Check if embedding model is loaded and working |
| debug_retrieve | Retrieve memories with debug information |
| exact_match_retrieve | Retrieve memories with an exact content match |
| check_database_health | Check database health and get stats |
| recall_by_timeframe | Retrieve memories within a timeframe |
| delete_by_timeframe | Delete memories within a timeframe |
| delete_before_date | Delete memories before a specific date |

Configuring Claude to Use Cloudflare Memory Service

Add to your Claude configuration:

{
  "mcpServers": [
    {
      "name": "cloudflare-memory",
      "url": "https://your-worker-subdomain.workers.dev/mcp",
      "type": "sse"
    }
  ]
}

Replace your-worker-subdomain with your Cloudflare Worker subdomain.

Deploying Your Own Cloudflare Memory Service

  1. Clone the repository and navigate to the Cloudflare Worker directory:

    git clone https://github.com/doobidoo/mcp-memory-service.git
    cd mcp-memory-service/cloudflare_worker
    
  2. Install Wrangler (Cloudflare CLI tool):

    npm install -g wrangler
    
  3. Log in to your Cloudflare account:

    wrangler login
    
  4. Create a D1 database:

    wrangler d1 create mcp_memory_service
    
  5. Update wrangler.toml with your database ID.

  6. Initialize the database schema:

    wrangler d1 execute mcp_memory_service --local --file=./schema.sql
    

    Schema content:

    CREATE TABLE IF NOT EXISTS memories (
      id TEXT PRIMARY KEY,
      content TEXT NOT NULL,
      embedding TEXT NOT NULL,
      tags TEXT,
      memory_type TEXT,
      metadata TEXT,
      created_at INTEGER
    );
    CREATE INDEX IF NOT EXISTS idx_created_at ON memories(created_at);
    
  7. Deploy the worker:

    wrangler deploy
    
  8. Update your Claude configuration with your worker URL.

Testing Your Cloudflare Memory Service

Test with curl, replacing your-worker-subdomain with your actual subdomain in each command.

List available tools:

curl https://your-worker-subdomain.workers.dev/list_tools

Store a memory:

curl -X POST https://your-worker-subdomain.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -d '{"method":"store_memory","arguments":{"content":"This is a test memory","metadata":{"tags":["test"]}}}'

Retrieve memories:

curl -X POST https://your-worker-subdomain.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -d '{"method":"retrieve_memory","arguments":{"query":"test memory","n_results":5}}'

Limitations

  • Free tier limits on Cloudflare Workers and D1 apply
  • Workers AI embeddings may differ slightly from local sentence-transformers models
  • No manual database access
  • 30 seconds max execution time on free plans