mcp-rag-local MCP Server

A local, semantic memory MCP server for FlowHunt, built with ChromaDB and Ollama. Enables AI agents to memorize and retrieve text, documents, and PDFs by meaning, supporting powerful RAG and knowledge workflows.

What does “mcp-rag-local” MCP Server do?

The mcp-rag-local MCP Server is a memory server that lets AI assistants store and retrieve text passages by semantic meaning, not just keywords. It uses Ollama to generate text embeddings and ChromaDB for vector storage and similarity search, enabling seamless storage (“memorization”) and retrieval of texts relevant to a given query. This supports AI-driven workflows such as knowledge management, contextual recall, and semantic search. Developers can use the server to store individual texts, multiple texts, or even the contents of PDF files, and later retrieve the most contextually relevant information, enhancing productivity and contextual awareness in applications.
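
Conceptually, the cycle is simple: embed a text with Ollama, store the vector and the text in ChromaDB, then embed each query and search by vector similarity. Below is a minimal standalone sketch of that cycle, assuming the default ports used later in this guide; it is illustrative (the collection name, IDs, and texts are invented) and is not the server's actual main.py.

# Minimal sketch of the memorize/retrieve cycle (illustrative only; the
# server's real implementation lives in main.py). Assumes ChromaDB on
# port 8321 and Ollama on port 11434, matching the configuration below.
import requests
import chromadb

def embed(text: str) -> list[float]:
    # Ollama's embeddings endpoint turns text into a vector
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": "all-minilm:l6-v2", "prompt": text},
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

chroma = chromadb.HttpClient(host="localhost", port=8321)
memories = chroma.get_or_create_collection("memories")  # collection name is invented

# "Memorize": store the text together with its embedding
text = "ChromaDB keeps all vectors on the local machine."
memories.add(ids=["note-1"], documents=[text], embeddings=[embed(text)])

# "Retrieve": embed the query and return the nearest stored passages
hits = memories.query(query_embeddings=[embed("Where is vector data stored?")], n_results=3)
print(hits["documents"])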

List of Prompts

  • No explicit prompt templates are mentioned in the repository or documentation.

List of Resources

  • No explicit MCP resources are documented in the repository or README.

List of Tools

  • memorize_text
    Allows the server to store a single text passage for future semantic retrieval.

  • memorize_multiple_texts
    Enables batch storage of several texts at once, facilitating bulk knowledge ingestion.

  • memorize_pdf_file
    Reads and extracts up to 20 pages at a time from a PDF file, chunks the content, and memorizes it for semantic retrieval.

  • retrieve_similar_texts
    Retrieves the most relevant stored text passages based on a user’s query, using semantic similarity.

(Tool names inferred from documented usage patterns; exact names may vary in code.)
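
For orientation, MCP clients invoke tools like these over JSON-RPC using the standard tools/call method. A hedged sketch of what a memorize_text invocation might look like (the argument name "text" is an assumption, since the exact input schema is not documented):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memorize_text",
    "arguments": { "text": "Ollama generates embeddings entirely on the local machine." }
  }
}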

Use Cases of this MCP Server

  • Personal Knowledge Base
    Developers and users can build a persistent, searchable knowledge base by memorizing articles, notes, or research papers for semantic recall.

  • Document and PDF Summarization
    By memorizing entire PDF documents, users can later query and retrieve relevant sections or summaries, streamlining research and review (see the chunked-ingestion sketch after this list).

  • Conversational Memory for Chatbots
    Enhance AI assistants or chatbots with long-term, context-aware memory to provide more coherent and contextually relevant responses over time.

  • Semantic Search Engine
    Implement a semantic search feature in applications, allowing users to find relevant information based on meaning, not just keywords.

  • Research and Data Exploration
    Store and query technical documents, code snippets, or scientific literature for rapid, meaning-based retrieval during investigation or development.
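
To make the PDF ingestion concrete, here is a hedged sketch of 20-page chunked extraction in the spirit of memorize_pdf_file (pypdf and the downstream memorize call are assumptions for illustration; the repository may use different libraries and names):

# Hedged sketch of chunked PDF ingestion: read a PDF 20 pages at a time,
# then hand each chunk to the memorization pipeline shown earlier.
from pypdf import PdfReader

def pdf_chunks(path: str, pages_per_chunk: int = 20):
    """Yield the text of a PDF in batches of up to 20 pages."""
    reader = PdfReader(path)
    for start in range(0, len(reader.pages), pages_per_chunk):
        batch = reader.pages[start:start + pages_per_chunk]
        yield "\n".join(page.extract_text() or "" for page in batch)

for chunk in pdf_chunks("research-paper.pdf"):  # file name is illustrative
    if chunk.strip():
        print(f"memorizing {len(chunk)} characters")  # replace with a memorize_text call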

How to set it up

Windsurf

  1. Prerequisites:
    • Install uv as your Python package manager.
    • Ensure Docker is installed and running.
  2. Clone and Install:
    • Clone the repository:
      git clone <repository-url>
      cd mcp-rag-local
    • Install dependencies using uv.
  3. Start Services:
    • Run docker-compose up to start ChromaDB and Ollama.
    • Pull the embedding model:
      docker exec -it ollama ollama pull all-minilm:l6-v2
  4. Configure MCP Server:
    • Add to your Windsurf MCP server configuration (e.g., in mcpServers):
      "mcp-rag-local": {
        "command": "uv",
        "args": [
          "--directory",
          "path\\to\\mcp-rag-local",
          "run",
          "main.py"
        ],
        "env": {
          "CHROMADB_PORT": "8321",
          "OLLAMA_PORT": "11434"
        }
      }
      
  5. Save and Restart:
    • Save your configuration and restart Windsurf.
  6. Verify Setup:
    • Confirm the server is running and accessible.
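
To confirm the backing services themselves are reachable, a quick sanity check from the shell (a hedged sketch, assuming the default ports from the configuration above):

curl http://localhost:8321/api/v1/heartbeat    # ChromaDB heartbeat (newer versions serve /api/v2/heartbeat instead)
curl http://localhost:11434/api/tags           # Ollama lists pulled models; all-minilm:l6-v2 should appear
docker ps                                      # the ChromaDB and Ollama containers should both be running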

Claude

  1. Follow steps 1–3 above (prerequisites, clone/install, start services).
  2. Add the following to your Claude MCP configuration:
    "mcpServers": {
      "mcp-rag-local": {
        "command": "uv",
        "args": [
          "--directory",
          "path\\to\\mcp-rag-local",
          "run",
          "main.py"
        ],
        "env": {
          "CHROMADB_PORT": "8321",
          "OLLAMA_PORT": "11434"
        }
      }
    }
    
  3. Save and restart Claude.
  4. Verify the server is listed and running.

Cursor

  1. Complete steps 1–3 (as above).
  2. Add to your Cursor configuration:
    "mcpServers": {
      "mcp-rag-local": {
        "command": "uv",
        "args": [
          "--directory",
          "path\\to\\mcp-rag-local",
          "run",
          "main.py"
        ],
        "env": {
          "CHROMADB_PORT": "8321",
          "OLLAMA_PORT": "11434"
        }
      }
    }
    
  3. Save and restart Cursor.
  4. Check that the MCP server is operational.

Cline

  1. Repeat steps 1–3 (prerequisites, clone/install, start services).
  2. In Cline configuration, add:
    "mcpServers": {
      "mcp-rag-local": {
        "command": "uv",
        "args": [
          "--directory",
          "path\\to\\mcp-rag-local",
          "run",
          "main.py"
        ],
        "env": {
          "CHROMADB_PORT": "8321",
          "OLLAMA_PORT": "11434"
        }
      }
    }
    
  3. Save, restart Cline, and verify the setup.

Securing API Keys

  • Use environment variables in the env section of your configuration.
  • Example:
    "env": {
      "CHROMADB_PORT": "8321",
      "OLLAMA_PORT": "11434",
      "MY_API_KEY": "${MY_API_KEY}"
    }
    
  • Ensure sensitive keys are not hardcoded but referenced from your environment.
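
For instance, assuming your MCP client expands ${VAR} references, set the variable in the shell that launches the client (MY_API_KEY is an illustrative name):

export MY_API_KEY="your-secret-value"   # lives in your shell or secrets manager, never in the config file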

How to use this MCP inside flows

Using MCP in FlowHunt

To integrate MCP servers into your FlowHunt workflow, start by adding the MCP component to your flow and connecting it to your AI agent:

(Image: FlowHunt MCP flow)

Click on the MCP component to open the configuration panel. In the system MCP configuration section, insert your MCP server details using this JSON format:

{
  "mcp-rag-local": {
    "transport": "streamable_http",
    "url": "https://yourmcpserver.example/pathtothemcp/url"
  }
}

Once configured, the AI agent can use this MCP as a tool with access to all of its functions and capabilities. Remember to change “mcp-rag-local” to the actual name of your MCP server and to replace the URL with your own MCP server URL.


Overview

Section                  Availability   Details/Notes
Overview                 ✅
List of Prompts          ⛔             No prompts/templates documented
List of Resources        ⛔             No resources documented
List of Tools            ✅             memorize_text, memorize_multiple_texts, etc.
Securing API Keys        ✅             Via env in config, example shown
Sampling Support         ⛔             Not mentioned (less important in evaluation)

Our opinion

This MCP is straightforward and well-focused on semantic memory, but lacks advanced features like prompt templates, explicit resources, or sampling/roots support. Tooling and setup are clear. Best for simple RAG/local knowledge workflows.

MCP Score

Has a LICENSE            ✅ (MIT)
Has at least one tool    ✅
Number of Forks          1
Number of Stars          5

Frequently asked questions

What is the mcp-rag-local MCP Server?

It is a local MCP server that gives AI agents the ability to store and retrieve text, documents, and PDFs by semantic meaning. Powered by Ollama and ChromaDB, it enables knowledge management, contextual memory, and semantic search for your applications.

What tools does mcp-rag-local provide?

It provides tools for storing single or multiple text passages, ingesting PDF files, and retrieving similar texts using semantic search. This enables workflows like building personal knowledge bases, document summarization, and conversational memory for chatbots.

How do I set up mcp-rag-local?

Install uv and Docker, clone the repository, start Ollama and ChromaDB, and configure the MCP server in your client’s configuration file with the specified ports. Environment variables are used for secure configuration.

What are the main use cases?

Use cases include building a semantic knowledge base, document/PDF summarization, enhancing chatbot memory, semantic search, and research data exploration.

How do I secure API keys or ports?

Always use environment variables in your configuration’s env section to avoid hardcoding sensitive information; this keeps credentials out of version control and follows security best practices.

Try mcp-rag-local with FlowHunt

Supercharge your AI workflows with semantic memory and local document search using mcp-rag-local. Set up in minutes and transform how your agents recall and reason over knowledge.

Learn more