Kokoro TTS MCP Server

Contact us to host your MCP Server in FlowHunt

FlowHunt provides an additional security layer between your internal systems and AI tools, giving you granular control over which tools are accessible from your MCP servers. MCP servers hosted in our infrastructure can be seamlessly integrated with FlowHunt's chatbot as well as popular AI platforms like ChatGPT, Claude, and various AI editors.

What does “Kokoro TTS” MCP Server do?

The Kokoro Text-to-Speech (TTS) MCP Server is a Model Context Protocol (MCP) server that enables AI assistants and clients to generate high-quality speech audio from text input. By connecting AI workflows with this server, users can convert text to .mp3 files and optionally upload them to Amazon S3 or compatible storage. Kokoro TTS uses models served via Hugging Face Spaces and ONNX weights to provide customizable voices, speeds, and languages, so text-to-speech can be added to development environments, chatbots, or automation pipelines. This MCP server is especially valuable where synthesized speech is needed for accessibility, notifications, or content creation.

List of Prompts

No explicit prompt templates are documented in the repository.

List of Resources

No explicit resources are documented in the repository files or README.

List of Tools

  • Text-to-Speech Generation
    Converts input text into an .mp3 audio file using Kokoro TTS models. Offers configuration for voice, speed, and language.
  • S3 Upload
    Optionally uploads generated .mp3 files to a specified Amazon S3 bucket/folder if enabled in configuration.
  • Local MP3 Management
    Stores generated .mp3 files in a designated local folder and can automatically delete them after upload or a retention period.
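
The retention behaviour described for local MP3 management can be pictured with a short sketch. This is an illustration only, not the server's actual code: the function name `sweep_old_mp3s` and the `max_age_seconds` parameter are hypothetical, and the sketch assumes file age is judged by modification time.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional


def sweep_old_mp3s(folder: Path, max_age_seconds: float,
                   now: Optional[float] = None) -> List[Path]:
    """Delete .mp3 files in `folder` older than `max_age_seconds`.

    Hypothetical helper: returns the paths that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(folder.glob("*.mp3")):
        # Age is measured from the file's last modification time (an assumption).
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        old, fresh = folder / "old.mp3", folder / "fresh.mp3"
        old.write_bytes(b"")
        fresh.write_bytes(b"")
        # Backdate one file by two hours so it exceeds a one-hour retention window.
        two_hours_ago = time.time() - 7200
        os.utime(old, (two_hours_ago, two_hours_ago))
        print([p.name for p in sweep_old_mp3s(folder, max_age_seconds=3600)])
```

Only files ending in .mp3 are considered, so anything else stored in the same folder is left alone.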

Use Cases of this MCP Server

  • Accessibility Solutions:
    Integrate Kokoro TTS into applications to provide speech feedback for visually impaired users or to read content aloud.
  • Voice Notifications:
    Automate voice alerts in monitoring or IoT systems by converting event messages to speech audio.
  • Content Creation:
    Generate voiceovers for videos, podcasts, or interactive media directly from written scripts.
  • Conversational AI/Chatbots:
    Enable chatbots to respond with spoken output, enhancing user engagement in customer support or virtual assistant scenarios.
  • Audio Archiving & Compliance:
    Create audio records of text-based communications for compliance or archival purposes.

How to set it up

Windsurf

  1. Ensure uv is installed and all Kokoro model files are downloaded.
  2. Clone the Kokoro TTS MCP repository to your local machine.
  3. Edit your Windsurf configuration file to add the Kokoro TTS MCP server.
  4. Add the following JSON snippet to your mcpServers object:
    {
      "kokoro-tts-mcp": {
        "command": "uv",
        "args": [
          "--directory",
          "/path/toyourlocal/kokoro-tts-mcp",
          "run",
          "mcp-tts.py"
        ],
        "env": {
          "TTS_VOICE": "af_heart",
          "TTS_SPEED": "1.0",
          "TTS_LANGUAGE": "en-us",
          "AWS_ACCESS_KEY_ID": "",
          "AWS_SECRET_ACCESS_KEY": "",
          "AWS_REGION": "us-east-1",
          "AWS_S3_FOLDER": "mp3",
          "S3_ENABLED": "true",
          "MP3_FOLDER": "/path/to/mp3"
        }
      }
    }
    
  5. Save your configuration and restart Windsurf.
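
If you prefer to script the edit rather than paste the snippet by hand, the merge could look roughly like the sketch below. It assumes the client configuration is a JSON file with a top-level "mcpServers" object, as step 4 describes; the helper name `add_mcp_server` is hypothetical, and the demo writes to a temporary file rather than guessing where your client keeps its configuration.

```python
import json
import tempfile
from pathlib import Path

# Values copied from the snippet above; fill in the paths and credentials yourself.
KOKORO_ENTRY = {
    "command": "uv",
    "args": ["--directory", "/path/toyourlocal/kokoro-tts-mcp", "run", "mcp-tts.py"],
    "env": {
        "TTS_VOICE": "af_heart",
        "TTS_SPEED": "1.0",
        "TTS_LANGUAGE": "en-us",
        "AWS_ACCESS_KEY_ID": "",
        "AWS_SECRET_ACCESS_KEY": "",
        "AWS_REGION": "us-east-1",
        "AWS_S3_FOLDER": "mp3",
        "S3_ENABLED": "true",
        "MP3_FOLDER": "/path/to/mp3",
    },
}


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Insert or replace `name` under "mcpServers" and write the file back.

    Hypothetical helper: other top-level keys and other servers are preserved.
    """
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = Path(tmp) / "mcp_config.json"  # placeholder file name
        add_mcp_server(demo_path, "kokoro-tts-mcp", KOKORO_ENTRY)
        print(demo_path.read_text(encoding="utf-8"))
```

Because the helper only touches the one named entry, running it twice is harmless: the second run overwrites the same key.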

Claude

  1. Install prerequisites (Node.js, uv, Kokoro models).
  2. Add the Kokoro TTS MCP server in Claude’s mcpServers section.
  3. Insert the JSON configuration shown in the Windsurf section above.
  4. Save and restart the Claude environment.

Cursor

  1. Download the repository and required model files.
  2. Update the cursor.json or equivalent config to include the Kokoro TTS MCP server.
  3. Copy the provided JSON snippet, updating paths as needed.
  4. Save changes and restart Cursor.

Cline

  1. Clone the repository and configure environment variables.
  2. Edit the Cline configuration, adding the Kokoro TTS MCP server as shown.
  3. Save and restart the Cline client.
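
Every value in the env block of the configuration is passed as a string, so a consumer has to turn "1.0" into a number and "true" into a flag. How mcp-tts.py handles this is not documented here; the sketch below is one plausible way to do it. The `TtsSettings` class, the `load_settings` function, and all fallback defaults are assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TtsSettings:
    """Hypothetical typed view of the environment variables in the config."""
    voice: str
    speed: float
    language: str
    s3_enabled: bool
    mp3_folder: str


def load_settings(environ=None) -> TtsSettings:
    """Parse settings from a mapping (defaults to os.environ).

    The fallback values are illustrative assumptions, not the server's defaults.
    """
    environ = os.environ if environ is None else environ
    speed = float(environ.get("TTS_SPEED", "1.0"))
    if speed <= 0:
        raise ValueError(f"TTS_SPEED must be positive, got {speed}")
    return TtsSettings(
        voice=environ.get("TTS_VOICE", "af_heart"),
        speed=speed,
        language=environ.get("TTS_LANGUAGE", "en-us"),
        # Treat only the literal string "true" (any case) as enabled.
        s3_enabled=environ.get("S3_ENABLED", "false").strip().lower() == "true",
        mp3_folder=environ.get("MP3_FOLDER", "mp3"),
    )


if __name__ == "__main__":
    print(load_settings({"TTS_SPEED": "1.2", "S3_ENABLED": "true"}))
```

Validating at load time means a typo such as a non-numeric speed fails at startup rather than on the first synthesis request.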

Securing API Keys

Always use environment variables to store sensitive information like AWS credentials. Example:

"env": {
  "AWS_ACCESS_KEY_ID": "${AWS_ACCESS_KEY_ID}",
  "AWS_SECRET_ACCESS_KEY": "${AWS_SECRET_ACCESS_KEY}",
  ...
}

Set these variables in your system or CI environment, and never hard-code secrets in your configuration files.
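
Whether a given client expands `${VAR}` placeholders by itself is not covered here. If you generate the configuration yourself, a small resolver along the following lines can substitute the values and fail loudly when a secret is missing. The function name `resolve_env_placeholders` is hypothetical, and the sketch assumes placeholders use exactly the `${NAME}` form shown above.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_placeholders(env_block: dict, environ=None) -> dict:
    """Return a copy of `env_block` with ${NAME} replaced from `environ`.

    Hypothetical helper: raises KeyError if a referenced variable is unset or empty.
    """
    environ = os.environ if environ is None else environ
    resolved = {}
    for key, value in env_block.items():
        def substitute(match, key=key):
            name = match.group(1)
            if not environ.get(name):
                raise KeyError(f"{name} is not set (needed for {key})")
            return environ[name]
        resolved[key] = _PLACEHOLDER.sub(substitute, value)
    return resolved


if __name__ == "__main__":
    block = {"AWS_ACCESS_KEY_ID": "${AWS_ACCESS_KEY_ID}", "AWS_REGION": "us-east-1"}
    fake_environ = {"AWS_ACCESS_KEY_ID": "example-key-id"}  # stand-in for os.environ
    print(resolve_env_placeholders(block, fake_environ))
```

Values without a placeholder pass through unchanged, so the same function can be applied to the whole env block.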

How to use this MCP inside flows

Using MCP in FlowHunt

To integrate MCP servers into your FlowHunt workflow, start by adding the MCP component to your flow and connecting it to your AI agent.

Click on the MCP component to open the configuration panel. In the system MCP configuration section, insert your MCP server details using this JSON format:

{
  "kokoro-tts-mcp": {
    "transport": "streamable_http",
    "url": "https://yourmcpserver.example/pathtothemcp/url"
  }
}

Once configured, the AI agent can use this MCP as a tool with access to all its functions and capabilities. Remember to change “kokoro-tts-mcp” to the actual name of your MCP server and replace the URL with your own MCP server URL.
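
When several servers need the same treatment, the JSON above can be generated instead of typed. The sketch below assumes only the two fields shown ("transport" and "url"); the function name `flowhunt_mcp_config` is hypothetical, and the URL check is a basic sanity test rather than anything FlowHunt requires.

```python
import json
from urllib.parse import urlparse


def flowhunt_mcp_config(server_name: str, url: str) -> str:
    """Build the JSON text for one MCP server entry.

    Hypothetical helper: rejects anything that is not an absolute HTTP(S) URL.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute HTTP(S) URL: {url!r}")
    entry = {server_name: {"transport": "streamable_http", "url": url}}
    return json.dumps(entry, indent=2)


if __name__ == "__main__":
    print(flowhunt_mcp_config(
        "kokoro-tts-mcp",
        "https://yourmcpserver.example/pathtothemcp/url",
    ))
```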


Overview

Section                                          Availability  Details/Notes
Overview                                         ✅            Text-to-speech server for AI workflows
List of Prompts                                  ⛔            No prompt templates found
List of Resources                                ⛔            No explicit MCP resources documented
List of Tools                                    ✅            TTS, S3 upload, local file management
Securing API Keys                                ✅            Documented use of env vars for AWS and config
Sampling Support (less important in evaluation)  ⛔            No mention of LLM sampling feature

Our opinion

Kokoro TTS MCP Server is focused and practical, offering a specialized tool for text-to-speech tasks with cloud integration. It lacks prompt and resource primitives, but is open source, well-configured, and supports secure key management. Sampling and Roots support are not mentioned, limiting advanced agentic capabilities. For TTS use cases, it is robust and useful, though not as feature-rich as more generalized MCP servers.

MCP Score

Has a LICENSE          ✅ (Apache-2.0)
Has at least one tool  ✅
Number of Forks        7
Number of Stars        39

Integrate Kokoro TTS into Your AI Workflow

Add natural, high-quality speech synthesis to your chatbots and automation with Kokoro TTS MCP Server. Try it in FlowHunt or connect with your own infrastructure.
