Loomal

TheCrawler

MCP server by github.com/manchittlab/thecrawler

Universal web scraper with LLM-ready markdown, RAG chunking, PDF/DOCX support.

0 stars · npm: thecrawler

About TheCrawler

TheCrawler is an MCP (Model Context Protocol) server published by manchittlab in the official MCP registry and listed under Web Scraping on Loomal. It is a universal web scraper with LLM-ready markdown output, RAG chunking, and PDF/DOCX support.

It ships as an npm package (thecrawler), so any MCP client that can launch a local process can run it.

Development happens in the open at github.com/manchittlab/thecrawler.

Use TheCrawler with your agent

Claude Code · one command
claude mcp add thecrawler -- npx -y thecrawler
Claude Desktop, Cursor & other MCP clients · config
{
  "mcpServers": {
    "thecrawler": {
      "command": "npx",
      "args": [
        "-y",
        "thecrawler"
      ]
    }
  }
}
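
If your client already has other servers configured, pasting the block above over the file would drop them. The following is a minimal Python sketch, not part of TheCrawler or any MCP client, that merges the same entry into an existing JSON config while leaving other keys untouched. The helper name register_server and the example path are hypothetical; the real location of the config file depends on your client and operating system.

```python
import json
from pathlib import Path

# Entry copied from the config shown above.
THECRAWLER_ENTRY = {"command": "npx", "args": ["-y", "thecrawler"]}


def register_server(config_path, name="thecrawler", entry=None):
    """Add or update one server under "mcpServers", keeping other entries.

    Hypothetical helper for illustration; assumes the file, if present,
    contains a valid JSON object.
    """
    path = Path(config_path)
    # Start from the existing config if there is one, else an empty object.
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the "mcpServers" map if it is missing, then set our entry.
    servers = config.setdefault("mcpServers", {})
    servers[name] = dict(entry or THECRAWLER_ENTRY)
    # Write the merged config back, creating parent folders if needed.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Placeholder path: substitute your client's actual config file.
    register_server("path/to/claude_desktop_config.json")
```

Back up the config file before running a script like this, and restart the client afterwards so it picks up the new entry.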

Frequently asked questions

What is TheCrawler?
TheCrawler is an MCP (Model Context Protocol) server by manchittlab in the Web Scraping category. It is a universal web scraper with LLM-ready markdown output, RAG chunking, and PDF/DOCX support.
How do I connect TheCrawler to Claude, Cursor, or another MCP client?
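Register the npm package (thecrawler) under "mcpServers" in your client's MCP configuration, for example claude_desktop_config.json or Cursor's mcp.json, using the npx command and arguments shown in the config above, then restart the client. Because the entry launches the server with npx -y, the package is fetched on first run and no separate install step is needed. In Claude Code, run claude mcp add thecrawler -- npx -y thecrawler instead.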
Is TheCrawler open source?
Yes — the source code is public at github.com/manchittlab/thecrawler.
Can AI agents pay to use TheCrawler?
Not yet through Loomal — TheCrawler is listed as a free directory entry. If its maintainer verifies ownership, they can set per-call USDC pricing that agents pay over x402, with settlement on Base.

Listing data from the official MCP registry and GitHub, refreshed periodically. Not affiliated with the maintainer unless claimed.