mcp-rag-local MCP Server
A local, semantic memory MCP server for FlowHunt, built with ChromaDB and Ollama. Enables AI agents to memorize and retrieve text, documents, and PDFs by meaning, supporting powerful RAG and knowledge workflows.

What does “mcp-rag-local” MCP Server do?
The mcp-rag-local MCP Server is designed as a memory server that allows AI assistants to store and retrieve text passages based on their semantic meaning, not just keywords. Leveraging Ollama for generating text embeddings and ChromaDB for vector storage and similarity search, it enables seamless storage (“memorization”) and retrieval of relevant texts for a given query. This empowers AI-driven workflows such as knowledge management, contextual recall, and semantic search. Developers can interact with the server to store individual texts, multiple texts, or even contents of PDF files, and later retrieve the most contextually relevant information, enhancing productivity and contextual awareness in applications.
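Under the hood, the cycle is simple: embed a text with Ollama, store the vector in ChromaDB, then embed the query and ask ChromaDB for the nearest stored passages. A minimal sketch of that cycle using the standard ollama and chromadb Python clients (the collection name, IDs, and ports here are illustrative, not taken from the repository):

import chromadb
import ollama

# Connect to the services docker-compose exposes (ports are assumptions).
chroma = chromadb.HttpClient(host="localhost", port=8321)
collection = chroma.get_or_create_collection("memories")

def embed(text: str) -> list[float]:
    # Ollama turns the text into an embedding vector.
    return ollama.embeddings(model="all-minilm:l6-v2", prompt=text)["embedding"]

# "Memorize": store the passage alongside its embedding.
passage = "ChromaDB stores vectors and answers similarity queries."
collection.add(ids=["doc-1"], documents=[passage], embeddings=[embed(passage)])

# "Retrieve": embed the query and fetch the closest stored passages.
hits = collection.query(query_embeddings=[embed("how do I search by meaning?")], n_results=3)
print(hits["documents"])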
List of Prompts
- No explicit prompt templates are mentioned in the repository or documentation.
List of Resources
- No explicit MCP resources are documented in the repository or README.
List of Tools
- memorize_text: Allows the server to store a single text passage for future semantic retrieval.
- memorize_multiple_texts: Enables batch storage of several texts at once, facilitating bulk knowledge ingestion.
- memorize_pdf_file: Reads and extracts up to 20 pages at a time from a PDF file, chunks the content, and memorizes it for semantic retrieval.
- retrieve_similar_texts: Retrieves the most relevant stored text passages based on a user’s query, using semantic similarity.
(Tool names inferred from documented usage patterns; exact names may vary in code.)
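The repository does not publish exact tool signatures, but if the server is built on the MCP Python SDK's FastMCP (a common pattern for `uv run main.py` servers), registration would look roughly like the sketch below; the collection wiring and ID scheme are assumptions, not the repo's actual code:

from mcp.server.fastmcp import FastMCP
import chromadb
import ollama

mcp = FastMCP("mcp-rag-local")
chroma = chromadb.HttpClient(host="localhost", port=8321)
collection = chroma.get_or_create_collection("memories")

def _embed(text: str) -> list[float]:
    return ollama.embeddings(model="all-minilm:l6-v2", prompt=text)["embedding"]

@mcp.tool()
def memorize_text(text: str) -> str:
    """Embed the text with Ollama and persist it in ChromaDB."""
    collection.add(ids=[f"doc-{collection.count() + 1}"], documents=[text], embeddings=[_embed(text)])
    return "memorized"

@mcp.tool()
def retrieve_similar_texts(query: str, n_results: int = 3) -> list[str]:
    """Return the stored passages closest in meaning to the query."""
    hits = collection.query(query_embeddings=[_embed(query)], n_results=n_results)
    return hits["documents"][0]

if __name__ == "__main__":
    mcp.run()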
Use Cases of this MCP Server
- Personal Knowledge Base: Developers and users can build a persistent, searchable knowledge base by memorizing articles, notes, or research papers for semantic recall.
- Document and PDF Summarization: By memorizing entire PDF documents, users can later query and retrieve relevant sections or summaries, streamlining research and review. (A chunking sketch follows this list.)
- Conversational Memory for Chatbots: Enhance AI assistants or chatbots with long-term, context-aware memory to provide more coherent and contextually relevant responses over time.
- Semantic Search Engine: Implement a semantic search feature in applications, allowing users to find relevant information based on meaning, not just keywords.
- Research and Data Exploration: Store and query technical documents, code snippets, or scientific literature for rapid, meaning-based retrieval during investigation or development.
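For the PDF use case, the key step is splitting a document into chunks small enough to embed. A rough sketch using pypdf (the library, chunk size, and helper name are assumptions; the server's actual extraction code is not documented):

from pypdf import PdfReader

def pdf_chunks(path: str, max_pages: int = 20, chunk_chars: int = 1000) -> list[str]:
    # Extract text page by page, capped at max_pages per pass (matching
    # the 20-page limit described for memorize_pdf_file).
    reader = PdfReader(path)
    pages = []
    for i in range(min(len(reader.pages), max_pages)):
        pages.append(reader.pages[i].extract_text() or "")
    text = "\n".join(pages)
    # Naive fixed-size chunking; real implementations often split on sentences.
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

# Each chunk would then be passed to memorize_text / memorize_multiple_texts.
print(len(pdf_chunks("example.pdf")))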
How to set it up
Windsurf
- Prerequisites:
  - Install uv as your Python package manager.
  - Ensure Docker is installed and running.
- Clone and Install:
  - Clone the repository:
    git clone <repository-url>
    cd mcp-rag-local
  - Install dependencies using uv.
- Start Services:
  - Run docker-compose up to start ChromaDB and Ollama.
  - Pull the embedding model:
    docker exec -it ollama ollama pull all-minilm:l6-v2
- Configure MCP Server:
  - Add to your Windsurf MCP server configuration (e.g., in mcpServers):
    "mcp-rag-local": {
      "command": "uv",
      "args": ["--directory", "path\\to\\mcp-rag-local", "run", "main.py"],
      "env": {
        "CHROMADB_PORT": "8321",
        "OLLAMA_PORT": "11434"
      }
    }
- Save and Restart:
  - Save your configuration and restart Windsurf.
- Verify Setup:
  - Confirm the server is running and accessible.
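A quick sanity check against both services helps here, assuming the ports from the configuration above (8321 for ChromaDB, 11434 for Ollama's default); both calls fail loudly if the respective container is not reachable:

import chromadb
import ollama

chroma = chromadb.HttpClient(host="localhost", port=8321)
print("ChromaDB heartbeat:", chroma.heartbeat())

vector = ollama.embeddings(model="all-minilm:l6-v2", prompt="hello")["embedding"]
print("Ollama embedding dimensions:", len(vector))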
Claude
- Follow steps 1–3 above (prerequisites, clone/install, start services).
- Add the following to your Claude MCP configuration:
  "mcpServers": {
    "mcp-rag-local": {
      "command": "uv",
      "args": ["--directory", "path\\to\\mcp-rag-local", "run", "main.py"],
      "env": {
        "CHROMADB_PORT": "8321",
        "OLLAMA_PORT": "11434"
      }
    }
  }
- Save and restart Claude.
- Verify the server is listed and running.
Cursor
- Complete steps 1–3 (as above).
- Add to your Cursor configuration:
  "mcpServers": {
    "mcp-rag-local": {
      "command": "uv",
      "args": ["--directory", "path\\to\\mcp-rag-local", "run", "main.py"],
      "env": {
        "CHROMADB_PORT": "8321",
        "OLLAMA_PORT": "11434"
      }
    }
  }
- Save and restart Cursor.
- Check that the MCP server is operational.
Cline
- Repeat steps 1–3 (prerequisites, clone/install, start services).
- In Cline configuration, add:
  "mcpServers": {
    "mcp-rag-local": {
      "command": "uv",
      "args": ["--directory", "path\\to\\mcp-rag-local", "run", "main.py"],
      "env": {
        "CHROMADB_PORT": "8321",
        "OLLAMA_PORT": "11434"
      }
    }
  }
- Save, restart Cline, and verify the setup.
Securing API Keys
- Use environment variables in the env section of your configuration.
- Example:
  "env": {
    "CHROMADB_PORT": "8321",
    "OLLAMA_PORT": "11434",
    "MY_API_KEY": "${MY_API_KEY}"
  }
- Ensure sensitive keys are not hardcoded but referenced from your environment.
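On the server side, those variables would typically be read at startup rather than hardcoded; a minimal sketch (variable names match the example above, but the actual main.py may read them differently):

import os
import chromadb

# Fall back to the documented defaults when the variables are unset.
chromadb_port = int(os.environ.get("CHROMADB_PORT", "8321"))
ollama_port = int(os.environ.get("OLLAMA_PORT", "11434"))
api_key = os.environ.get("MY_API_KEY")  # hypothetical secret, never hardcoded

chroma = chromadb.HttpClient(host="localhost", port=chromadb_port)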
How to use this MCP inside flows
Using MCP in FlowHunt
To integrate MCP servers into your FlowHunt workflow, start by adding the MCP component to your flow and connecting it to your AI agent:

Click on the MCP component to open the configuration panel. In the system MCP configuration section, insert your MCP server details using this JSON format:
{
  "mcp-rag-local": {
    "transport": "streamable_http",
    "url": "https://yourmcpserver.example/pathtothemcp/url"
  }
}
Once configured, the AI agent can use this MCP as a tool, with access to all its functions and capabilities. Remember to change “mcp-rag-local” to the actual name of your MCP server and to replace the URL with your own MCP server’s URL.
Overview
Section | Availability | Details/Notes
---|---|---
Overview | ✅ |
List of Prompts | ⛔ | No prompts/templates documented
List of Resources | ⛔ | No resources documented
List of Tools | ✅ | memorize_text, memorize_multiple_texts, memorize_pdf_file, retrieve_similar_texts
Securing API Keys | ✅ | Via env in config, example shown
Sampling Support (less important in evaluation) | ⛔ | Not mentioned
Our opinion
This MCP is straightforward and well-focused on semantic memory, but lacks advanced features like prompt templates, explicit resources, or sampling/roots support. Tooling and setup are clear. Best for simple RAG/local knowledge workflows.
MCP Score
Has a LICENSE | ✅ (MIT)
---|---
Has at least one tool | ✅
Number of Forks | 1
Number of Stars | 5
Frequently asked questions
- What is the mcp-rag-local MCP Server?
It is a local MCP server that gives AI agents the ability to store and retrieve text, documents, and PDFs by semantic meaning. Powered by Ollama and ChromaDB, it enables knowledge management, contextual memory, and semantic search for your applications.
- What tools does mcp-rag-local provide?
It provides tools for storing single or multiple text passages, ingesting PDF files, and retrieving similar texts using semantic search. This enables workflows like building personal knowledge bases, document summarization, and conversational memory for chatbots.
- How do I set up mcp-rag-local?
Install uv and Docker, clone the repository, start Ollama and ChromaDB, and configure the MCP server in your client’s configuration file with the specified ports. Environment variables are used for secure configuration.
- What are the main use cases?
Use cases include building a semantic knowledge base, document/PDF summarization, enhancing chatbot memory, semantic search, and research data exploration.
- How do I secure API keys or ports?
Always use environment variables in your configuration’s env section so that sensitive values are referenced from the environment rather than hardcoded in config files.
Try mcp-rag-local with FlowHunt
Supercharge your AI workflows with semantic memory and local document search using mcp-rag-local. Set up in minutes and transform how your agents recall and reason over knowledge.