Cloudflare launches Browser Rendering crawl endpoint for website-wide scraping with a single API call

New Browser Rendering Crawl Endpoint

Cloudflare has released a new /crawl endpoint for its Browser Rendering service, allowing developers to crawl entire websites with a single API call. The endpoint automatically discovers pages, renders them in a headless browser, and returns content in multiple formats including HTML, Markdown, and structured JSON powered by Workers AI.

Key Features

The crawl endpoint provides the following capabilities:

  • Multiple output formats: Extract content as HTML, Markdown, or structured JSON
  • Automatic page discovery: Discovers URLs from sitemaps, page links, or both
  • Crawl scope controls: Configure depth, page limits, and URL path patterns to include or exclude specific sections
  • Incremental crawling: Use modifiedSince and maxAge parameters to skip unchanged pages, reducing time and cost on repeated crawls
  • Static mode: Set render: false to fetch static HTML without spinning up a browser for faster crawling of non-dynamic sites
  • Respectful crawling: Honors robots.txt directives and AI Crawl Control by default, identifying itself as a signed agent
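As a rough illustration of how the options above might be combined, the sketch below assembles a request body as a plain dictionary. Only render, modifiedSince, and maxAge are parameter names taken from this article; every other key (url, formats, depth, limit), the helper name build_crawl_request, and the value types are hypothetical placeholders, so check the crawl endpoint documentation for the actual schema.

```python
from typing import Any, Optional


def build_crawl_request(
    start_url: str,
    *,
    formats: tuple[str, ...] = ("markdown",),
    max_depth: Optional[int] = None,
    page_limit: Optional[int] = None,
    render: bool = True,
    modified_since: Optional[int] = None,
    max_age: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble a JSON-serializable body for a crawl request.

    Only `render`, `modifiedSince`, and `maxAge` are named in the article;
    every other key is a hypothetical placeholder, not the documented schema.
    """
    body: dict[str, Any] = {
        "url": start_url,          # hypothetical key
        "formats": list(formats),  # hypothetical key
        "render": render,          # from the article: False selects static mode
    }
    # Unset options are left out entirely so that the service's own
    # defaults apply instead of an explicit null.
    optional = {
        "depth": max_depth,               # hypothetical key
        "limit": page_limit,              # hypothetical key
        "modifiedSince": modified_since,  # named in the article; value type assumed
        "maxAge": max_age,                # named in the article; value type assumed
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

For example, build_crawl_request("https://example.com", render=False, page_limit=50) would describe a static-mode crawl capped at 50 pages, with the incremental-crawl keys omitted.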

How It Works

Crawl jobs run asynchronously. Developers submit a starting URL and receive a job ID, then use that ID to check back for results as pages are processed. The endpoint is available on both Workers Free and Paid plans.
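The submit-then-check-back pattern can be sketched as a small polling loop. This is a minimal sketch under stated assumptions, not Cloudflare's client code: the function name wait_for_crawl, the status values "completed" and "failed", and the shape of the status response are all hypothetical. The HTTP call itself is passed in as fetch_status, so the loop makes no assumption about the real URL or authentication.

```python
import time
from typing import Any, Callable


def wait_for_crawl(
    job_id: str,
    fetch_status: Callable[[str], dict[str, Any]],
    *,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll a crawl job until it reports a terminal status.

    `fetch_status` is a caller-supplied function that takes the job ID and
    returns the decoded status response. Raises TimeoutError if the job is
    still unfinished after `max_polls` checks.
    """
    terminal = {"completed", "failed"}  # hypothetical status values
    for attempt in range(max_polls):
        result = fetch_status(job_id)
        if result.get("status") in terminal:
            return result
        # Wait between checks, but not after the final one.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"crawl job {job_id} did not finish after {max_polls} polls")
```

Injecting fetch_status and sleep keeps the loop testable without network access, and the max_polls cap prevents a stalled job from blocking the caller indefinitely.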

Important Limitations

The /crawl endpoint cannot bypass Cloudflare bot detection or CAPTCHAs, and it self-identifies as a bot, so it may be unable to crawl some protected sites.

For detailed setup instructions, refer to the crawl endpoint documentation.