
Automate robust, AI-powered web scraping and Markdown conversion—even on interactive or protected sites—using the Puppeteer Vision MCP Server.
The Puppeteer Vision MCP Server enables AI assistants to scrape and convert web pages into Markdown format using Puppeteer, Readability, and Turndown. It offers advanced AI-driven interaction to automatically handle web elements like cookie banners, CAPTCHAs, paywalls, and more, ensuring robust content extraction even from interactive or protected sites. The server exposes this capability via the Model Context Protocol (MCP), making it easy to integrate into AI development workflows. This allows tasks such as automated web scraping, content summarization, and data ingestion to be performed seamlessly by LLMs. The server is easily deployable via `npx`, requires minimal configuration, and supports both stdio and SSE communication for flexible integration.
No prompt templates are mentioned in the repository or documentation.
No specific MCP resources are listed or described in the repository or documentation.
Parameters of the `scrape-webpage` tool:
url (string, required): The webpage to scrape.
autoInteract (boolean, optional, default: true): Whether to automatically handle interactive elements.
maxInteractionAttempts (number, optional, default: 3): Maximum AI interaction attempts.
waitForNetworkIdle (boolean, optional, default: true): Wait for network to be idle before scraping.
Prerequisites: Install Node.js and npm.
Environment Setup: Create a `.env` file or export the required environment variables, including `OPENAI_API_KEY`.
Edit Configuration: Locate Windsurf’s configuration file.
Add Puppeteer Vision MCP: Insert the following JSON snippet:
{
  "mcpServers": {
    "web-scraper": {
      "command": "npx",
      "args": ["-y", "puppeteer-vision-mcp-server"],
      "env": {
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY_HERE"
      }
    }
  }
}
Save/Restart: Save the file and restart Windsurf.
Verify: Check logs or UI to confirm the MCP server is running.
Securing API Keys:
Store secrets in environment variables (e.g., in a `.env` file):
"env": {
  "OPENAI_API_KEY": "${OPENAI_API_KEY}"
}
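A quick audit can catch keys that slipped into a config file as literal values. The sketch below is an illustration under stated assumptions, not part of the server: `audit_env` is a hypothetical helper, and it assumes the client expands `${NAME}` placeholders in the `env` block, which should be verified against the client's own documentation.

```python
import os
import re

# Matches a value that is exactly one ${NAME} placeholder.
PLACEHOLDER = re.compile(r"^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$")


def audit_env(config, environ):
    """Return (hardcoded, missing) findings for every server's env block.

    hardcoded: (server, variable) pairs whose value is a literal, not ${NAME}.
    missing:   (server, name) pairs whose placeholder is not set in environ.
    `audit_env` is a hypothetical helper for illustration only.
    """
    hardcoded, missing = [], []
    for server, entry in config.get("mcpServers", {}).items():
        for name, value in entry.get("env", {}).items():
            match = PLACEHOLDER.match(value)
            if match is None:
                hardcoded.append((server, name))
            elif match.group(1) not in environ:
                missing.append((server, match.group(1)))
    return hardcoded, missing


if __name__ == "__main__":
    config = {
        "mcpServers": {
            "web-scraper": {
                "command": "npx",
                "args": ["-y", "puppeteer-vision-mcp-server"],
                "env": {"OPENAI_API_KEY": "${OPENAI_API_KEY}"},
            }
        }
    }
    hardcoded, missing = audit_env(config, os.environ)
    print("hard-coded values:", hardcoded)
    print("unset variables:", missing)
```

Running a check like this before committing a config file keeps literal keys out of version control and flags placeholders that would resolve to nothing at startup.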
Prerequisites: Ensure Node.js and npm are installed.
Set Environment: Prepare `.env` or export `OPENAI_API_KEY` and other variables.
Edit Configuration: Open Claude’s MCP configuration.
Add the MCP Server:
{
  "mcpServers": {
    "web-scraper": {
      "command": "npx",
      "args": ["-y", "puppeteer-vision-mcp-server"],
      "env": {
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY_HERE"
      }
    }
  }
}
Restart Claude: Apply changes and restart the platform.
Verify: Confirm successful startup.
Prerequisites: Install Node.js and npm.
Environment: Set up `.env` with the OpenAI API key.
Edit Cursor Config: Add the MCP server as below:
{
  "mcpServers": {
    "web-scraper": {
      "command": "npx",
      "args": ["-y", "puppeteer-vision-mcp-server"],
      "env": {
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY_HERE"
      }
    }
  }
}
Save & Restart: Save changes and restart Cursor.
Check Logs: Ensure the server is running.
Prerequisites: Install Node.js and npm.
Environment: Set or export `OPENAI_API_KEY`.
Configuration: Add to Cline’s MCP config:
{
  "mcpServers": {
    "web-scraper": {
      "command": "npx",
      "args": ["-y", "puppeteer-vision-mcp-server"],
      "env": {
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY_HERE"
      }
    }
  }
}
Restart Cline: Apply and restart.
Confirm: Validate that the server is accessible.
Note: Secure API keys via environment variables and never hard-code secrets in config files.
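Because the same `web-scraper` entry is used for every client above, registering it can be sketched as a small merge step that leaves any other configured servers untouched. This is a minimal sketch, not the clients' own tooling: `add_server` is a hypothetical helper, the config is handled as an already-parsed dictionary (no file locations are assumed), and the `${OPENAI_API_KEY}` placeholder assumes the client expands environment references.

```python
import copy
import json

# The server entry from the snippets above, with the key as a placeholder.
WEB_SCRAPER_ENTRY = {
    "command": "npx",
    "args": ["-y", "puppeteer-vision-mcp-server"],
    "env": {"OPENAI_API_KEY": "${OPENAI_API_KEY}"},
}


def add_server(config, name, entry):
    """Return a copy of config with entry registered under mcpServers[name].

    The input dictionary is not modified. A different entry already stored
    under the same name raises ValueError instead of being overwritten.
    `add_server` is a hypothetical helper for illustration only.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    if name in servers and servers[name] != entry:
        raise ValueError(f"server {name!r} is already configured differently")
    servers[name] = copy.deepcopy(entry)
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "node"}}}
    updated = add_server(existing, "web-scraper", WEB_SCRAPER_ENTRY)
    print(json.dumps(updated, indent=2))
```

Refusing to overwrite a conflicting entry is a deliberate choice here: silently replacing a server that another workflow depends on is harder to diagnose than an explicit error.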
Using MCP in FlowHunt
To integrate MCP servers into your FlowHunt workflow, start by adding the MCP component to your flow and connecting it to your AI agent.
Click on the MCP component to open the configuration panel. In the system MCP configuration section, insert your MCP server details using this JSON format:
{
  "puppeteer-vision": {
    "transport": "streamable_http",
    "url": "https://yourmcpserver.example/pathtothemcp/url"
  }
}
Once configured, the AI agent can use this MCP server as a tool, with access to all its functions and capabilities. Remember to change “puppeteer-vision” to the actual name of your MCP server and replace the URL with your own MCP server URL.
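The two values to substitute, the server name and the URL, can be filled in programmatically. The sketch below only mirrors the JSON shape shown above; `flowhunt_mcp_config` is a hypothetical helper, and the URL check is an added safeguard rather than a FlowHunt requirement.

```python
import json
from urllib.parse import urlparse


def flowhunt_mcp_config(server_name, url):
    """Build the system MCP configuration for one server.

    `flowhunt_mcp_config` is a hypothetical helper for illustration only.
    """
    if not server_name:
        raise ValueError("server_name must not be empty")
    parsed = urlparse(url)
    # Reject relative or non-HTTP URLs before they reach the configuration.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute HTTP(S) URL: {url!r}")
    return {server_name: {"transport": "streamable_http", "url": url}}


if __name__ == "__main__":
    config = flowhunt_mcp_config(
        "puppeteer-vision",
        "https://yourmcpserver.example/pathtothemcp/url",
    )
    print(json.dumps(config, indent=2))
```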
Section | Availability | Details/Notes |
---|---|---|
Overview | ✅ | Provided in README. |
List of Prompts | ⛔ | No prompt templates found. |
List of Resources | ⛔ | No explicit MCP resources described. |
List of Tools | ✅ | scrape-webpage tool, detailed in README. |
Securing API Keys | ✅ | Instructions for .env and environment variables given. |
Sampling Support (less important in evaluation) | ⛔ | No mention of sampling support. |
Roots Support | ⛔ | No mention of Roots. |
Based on the above, the Puppeteer Vision MCP Server offers a robust and focused web scraping tool with strong documentation and security guidance, but lacks multiple tools, prompt templates, resources, and advanced MCP features like roots or sampling. Its one-tool, one-purpose design gives it high reliability for its use case, but limits extensibility.
MCP Score: 5/10
This MCP server is well-documented, useful for its specific purpose, and easy to set up, but its lack of prompt templates, explicit resources, and advanced MCP features (roots, sampling) limits its versatility and ecosystem integration.
Has a LICENSE | ⛔ |
---|---|
Has at least one tool | ✅ |
Number of Forks | 5 |
Number of Stars | 12 |
The Puppeteer Vision MCP Server is an MCP server that lets AI agents scrape web pages and convert them to Markdown using Puppeteer, Readability, and Turndown. It can automatically interact with and bypass common web barriers (like CAPTCHAs and cookie banners), enabling robust content extraction for ingestion into AI workflows.
Typical use cases include automated web scraping for knowledge ingestion, bypassing interactive barriers, summarization and content analysis, real-time browser automation, and integration into LLM orchestration pipelines.
To set it up, configure it in your orchestrator’s MCP server config, specifying the command and environment variables (including your OpenAI API key). Detailed instructions are provided for Windsurf, Claude, Cursor, and Cline above.
The server uses AI-powered automation to interact with, dismiss, or bypass web elements such as cookie banners, CAPTCHAs, and paywalls, ensuring content extraction even from protected or interactive sites.
Always store API keys in environment variables or `.env` files. Never hard-code secrets in configuration files.
The main tool is `scrape-webpage`, which scrapes a given URL, interacts with web elements as needed, and outputs the main content as Markdown.
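In MCP, a tool is invoked with a JSON-RPC `tools/call` request, so a call to `scrape-webpage` can be sketched as below. The argument names and defaults are the ones listed for the tool earlier on this page; `build_scrape_request` and the request id are illustrative assumptions, and in practice an MCP client library would normally construct and send this message.

```python
import json


def build_scrape_request(url, request_id=1, auto_interact=True,
                         max_interaction_attempts=3,
                         wait_for_network_idle=True):
    """Build a JSON-RPC `tools/call` message for the scrape-webpage tool.

    `build_scrape_request` is a hypothetical helper for illustration only.
    """
    if not isinstance(url, str) or not url:
        raise ValueError("url is required")
    if max_interaction_attempts < 0:
        raise ValueError("max_interaction_attempts must not be negative")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "scrape-webpage",
            "arguments": {
                "url": url,
                "autoInteract": auto_interact,
                "maxInteractionAttempts": max_interaction_attempts,
                "waitForNetworkIdle": wait_for_network_idle,
            },
        },
    }


if __name__ == "__main__":
    request = build_scrape_request("https://example.com/article")
    print(json.dumps(request, indent=2))
```

Only `url` has to be supplied; the other three arguments fall back to the documented defaults.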
Supercharge your AI workflows with advanced web scraping and content extraction. Set up Puppeteer Vision MCP Server in minutes and start ingesting the live web into your AI pipelines.