
FlowHunt provides an additional security layer between your internal systems and AI tools, giving you granular control over which tools are accessible from your MCP servers. MCP servers hosted in our infrastructure can be seamlessly integrated with FlowHunt's chatbot as well as popular AI platforms like ChatGPT, Claude, and various AI editors.
The mcp-rag-local MCP Server is designed as a memory server that allows AI assistants to store and retrieve text passages based on their semantic meaning, not just keywords. Leveraging Ollama for generating text embeddings and ChromaDB for vector storage and similarity search, it enables seamless storage (“memorization”) and retrieval of relevant texts for a given query. This empowers AI-driven workflows such as knowledge management, contextual recall, and semantic search. Developers can interact with the server to store individual texts, multiple texts, or even contents of PDF files, and later retrieve the most contextually relevant information, enhancing productivity and contextual awareness in applications.
- **memorize_text**: Stores a single text passage for future semantic retrieval.
- **memorize_multiple_texts**: Stores several texts at once, enabling bulk knowledge ingestion.
- **memorize_pdf_file**: Reads and extracts up to 20 pages at a time from a PDF file, chunks the content, and memorizes it for semantic retrieval.
- **retrieve_similar_texts**: Returns the stored text passages most relevant to a query, ranked by semantic similarity.
(Tool names inferred from documented usage patterns; exact names may vary in code.)
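The storage-and-retrieval flow these tools describe can be illustrated with a toy in-memory analogue. A real deployment uses Ollama's `all-minilm:l6-v2` embeddings and ChromaDB for vector storage; the bag-of-words "embedding" and `MemoryStore` class below are purely illustrative stand-ins, not the server's actual code:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model (the server uses
    # Ollama's all-minilm:l6-v2); here just a bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Minimal in-memory analogue of memorize_text / retrieve_similar_texts."""
    def __init__(self):
        self._texts = []

    def memorize_text(self, text: str) -> None:
        self._texts.append((text, embed(text)))

    def memorize_multiple_texts(self, texts) -> None:
        for t in texts:
            self.memorize_text(t)

    def retrieve_similar_texts(self, query: str, k: int = 1):
        q = embed(query)
        ranked = sorted(self._texts, key=lambda p: cosine(q, p[1]), reverse=True)
        return [t for t, _ in ranked[:k]]

store = MemoryStore()
store.memorize_multiple_texts([
    "ChromaDB stores vectors for similarity search",
    "Ollama generates text embeddings locally",
])
print(store.retrieve_similar_texts("local text embeddings", k=1))
# → ['Ollama generates text embeddings locally']
```

The key idea is the same as in the real server: queries and stored passages are compared in embedding space, so retrieval matches meaning rather than exact keywords.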
- **Personal Knowledge Base**: Build a persistent, searchable knowledge base by memorizing articles, notes, or research papers for semantic recall.
- **Document and PDF Summarization**: Memorize entire PDF documents, then query and retrieve relevant sections or summaries, streamlining research and review.
- **Conversational Memory for Chatbots**: Give AI assistants or chatbots long-term, context-aware memory for more coherent and contextually relevant responses over time.
- **Semantic Search Engine**: Implement search based on meaning rather than keywords, letting users find relevant information even when their wording differs.
- **Research and Data Exploration**: Store and query technical documents, code snippets, or scientific literature for rapid, meaning-based retrieval during investigation or development.
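The PDF tool described above chunks page text before memorizing it, so that each stored passage stays small enough to embed and retrieve usefully. The server's actual chunking parameters are not documented; a minimal chunker with assumed size and overlap values might look like:

```python
def chunk_text(text: str, max_chars: int = 500, overlap: int = 50):
    """Split text into overlapping chunks for embedding.

    max_chars and overlap are illustrative defaults; the real
    server's chunking strategy is not documented.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap  # step back by `overlap` chars
    return chunks

page = "x" * 1200
parts = chunk_text(page)  # three chunks: 500 + 500 + 300 chars
```

Overlapping chunks help retrieval when a relevant sentence straddles a chunk boundary, at the cost of storing some text twice.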
Clone the repository and enter it:

```
git clone <repository-url>
cd mcp-rag-local
```

Start ChromaDB and Ollama:

```
docker-compose up
```

Pull the embedding model:

```
docker exec -it ollama ollama pull all-minilm:l6-v2
```

Add the server to your MCP client configuration (under `mcpServers`):

```json
{
  "mcpServers": {
    "mcp-rag-local": {
      "command": "uv",
      "args": [
        "--directory",
        "path\\to\\mcp-rag-local",
        "run",
        "main.py"
      ],
      "env": {
        "CHROMADB_PORT": "8321",
        "OLLAMA_PORT": "11434"
      }
    }
  }
}
```
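The `env` block tells the server where to reach ChromaDB and Ollama. How `main.py` actually consumes these variables is not documented; a plausible sketch, assuming localhost services and the default ports from the config above, is:

```python
import os

# CHROMADB_PORT and OLLAMA_PORT come from the example config; the
# defaults and URL construction here are assumptions, not main.py's code.
chromadb_port = int(os.environ.get("CHROMADB_PORT", "8321"))
ollama_port = int(os.environ.get("OLLAMA_PORT", "11434"))

chromadb_url = f"http://localhost:{chromadb_port}"
ollama_url = f"http://localhost:{ollama_port}"
```

Reading ports from the environment like this lets the same code run against the docker-compose services or any other ChromaDB/Ollama deployment without edits.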
Keep secrets such as API keys out of the config file itself by referencing host environment variables in the `env` section of your configuration:

```json
"env": {
  "CHROMADB_PORT": "8321",
  "OLLAMA_PORT": "11434",
  "MY_API_KEY": "${MY_API_KEY}"
}
```
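The `${MY_API_KEY}` placeholder is expanded from the host environment by the MCP client before the server starts, so the secret never appears in the file. The substitution can be sketched as follows; `expand_env` is a hypothetical helper, not part of the server or any client:

```python
import os
import re

def expand_env(value: str, env=os.environ) -> str:
    # Replace ${VAR} placeholders with values from the environment,
    # as an MCP client might do before launching the server.
    return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), ""), value)

cfg = {"MY_API_KEY": "${MY_API_KEY}"}
resolved = {k: expand_env(v, {"MY_API_KEY": "secret123"}) for k, v in cfg.items()}
print(resolved)  # → {'MY_API_KEY': 'secret123'}
```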
Using MCP in FlowHunt
To integrate MCP servers into your FlowHunt workflow, start by adding the MCP component to your flow and connecting it to your AI agent:

Click on the MCP component to open the configuration panel. In the system MCP configuration section, insert your MCP server details using this JSON format:
```json
{
  "mcp-rag-local": {
    "transport": "streamable_http",
    "url": "https://yourmcpserver.example/pathtothemcp/url"
  }
}
```
Once configured, the AI agent can use this MCP as a tool, with access to all of its functions and capabilities. Remember to change "mcp-rag-local" to the actual name of your MCP server and to replace the URL with your own MCP server's URL.
| Section | Availability | Details/Notes |
|---|---|---|
| Overview | ✅ | |
| List of Prompts | ⛔ | No prompts/templates documented |
| List of Resources | ⛔ | No resources documented |
| List of Tools | ✅ | memorize_text, memorize_multiple_texts, etc. |
| Securing API Keys | ✅ | via env in config, example shown |
| Sampling Support (less important in evaluation) | ⛔ | Not mentioned |
This MCP is straightforward and well-focused on semantic memory, but lacks advanced features like prompt templates, explicit resources, or sampling/roots support. Tooling and setup are clear. Best for simple RAG/local knowledge workflows.
| Repository check | Status |
|---|---|
| Has a LICENSE | ✅ (MIT) |
| Has at least one tool | ✅ |
| Number of Forks | 1 |
| Number of Stars | 5 |
Supercharge your AI workflows with semantic memory and local document search using mcp-rag-local. Set up in minutes and transform how your agents recall and reason over knowledge.
