Loomal

Free vs Paid Web Scraping APIs for AI Agents free scrapers work until the web fights back.

Self-hosted scrapers handle cooperative sites for nothing. Paid scraping APIs exist because much of the web isn't cooperative: proxies, JavaScript rendering, and anti-bot evasion are the product, and they're billed per page for a reason.

Web scraping has the clearest free/paid boundary of any MCP category, and it's drawn by the target site. A cooperative page — static HTML, no bot defenses — falls to a free, self-hosted scraper in milliseconds. A defended one — JavaScript-heavy, fingerprinting, CAPTCHAs, IP reputation checks — is why an entire industry of scraping APIs exists.

The 31 live servers here span that whole spectrum, from fully open-source extractors like CRW Web Scraper to bridges into commercial scraping APIs like zenrows-mcp and scrapegraph-mcp.

What free scraping handles well

Open-source servers cover an honest majority of everyday extraction. CRW Web Scraper ships scrape, crawl, and map tools with no upstream dependency; webclaw converts any URL to clean markdown with scrape, crawl, extract, and summarize operations; ShadowCrawl takes the adversarial route in Rust, with anti-bot tactics, CDP fallback, and human-in-the-loop handoff for the hardest pages.

Run these on your own infrastructure and documentation sites, blogs, and most public pages cost you nothing but compute. Tools like Librecrawl even cover specialized jobs — self-hosted technical SEO audits with 50+ checks — entirely free.

Where free hits the wall

Three things break self-hosted scraping at scale: IP reputation (your server's address gets flagged after enough requests), JavaScript rendering (headless browsers per page are expensive to operate in fleets), and active anti-bot systems that adapt faster than a side project can. Rotating residential proxies and evasion engineering are recurring costs, not one-time setup.

That's the gap commercial scraping APIs fill. Servers like zenrows-mcp and scrapegraph-mcp front exactly such services — the MCP server is free, but the upstream API meters usage; check each provider's docs for current pricing. The page-fetch infrastructure is the paid product.

Per-page pricing and the x402 fit

Scraping is naturally denominated in pages, which makes it an ideal x402 workload: each fetch is a discrete unit with real marginal cost (proxy bandwidth, rendering compute). An agent hits the endpoint, receives an HTTP 402 quoting the price, pays in USDC, and gets the extracted content — settled on Base in about two seconds, from $0.01 per page, with a signed receipt.

That removes the worst part of scraping APIs for agents: account signup and credit pre-purchase. An agent that needs 40 defended pages for one research task can pay for exactly 40 pages and never touch a pricing page.

Choosing for this category

Self-host free servers for cooperative targets and high steady volume — your compute is cheaper than anyone's API at scale, and tools like webclaw or PageMap (which pitches token-efficient structured extraction for agents) are mature. Pay per page when targets are defended, when you need someone else's proxy fleet, or when volume is too bursty to justify infrastructure.

Loomal's Web Scraping category lists all 31 live servers with descriptions and x402 pricing where maintainers have configured it — the quickest way to see which extraction endpoints an agent can pay autonomously.

Frequently asked questions

Should my agent use a free or paid web scraping MCP server?

Match the tool to the target. Cooperative sites: free self-hosted servers like CRW Web Scraper or webclaw cost only compute. Defended sites need proxy fleets and rendering infrastructure someone has to operate — that's what paid, metered scraping access buys.

Why do scraping APIs charge per page?

Because each page has marginal cost: proxy bandwidth, headless browser compute, and anti-bot engineering amortized per request. Per-page billing maps the provider's cost to your usage — and x402 takes it further by letting an agent pay per page in USDC from $0.01 with no account at all.

How does pay-per-call compare to a subscription for scraping?

Agent scraping is bursty — a research task might need 200 pages today and none for two weeks. x402 charges per fetch, settling on Base in about two seconds, so cost follows the burst. Credit packs and subscriptions only win at sustained, predictable page volume.

Where can I compare web scraping MCP server options?

Loomal's Web Scraping category shows all live listings with descriptions, package types, and per-call pricing where configured — open-source extractors and commercial API bridges on one page.

Run a Web Scraping MCP server?

Claim your listing, set a per-call USDC price, and let AI agents pay for every call over x402.

List it on Loomal