Universal Scraper

Scrape any public website using the Crawlbyte universal task engine. Designed for flexibility, it supports multiple HTTP methods (GET, POST, PUT, PATCH), custom CSS selectors, JavaScript rendering, custom headers, cookies, proxies, and more.

Endpoint

POST https://api.crawlbyte.ai/api/tasks

Basic Configuration (Required)

{
  "type": "universal",
  "input": [
    "https://example.com/page"
  ],
  "multithread": false
}

Parameters

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always "universal" |
| input | array | Array of valid URLs |
| multithread | boolean | Use true for faster processing with multiple threads |
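
As a rough illustration, the basic configuration can be submitted from Python with only the standard library. This is a sketch, not an official client: the helper names `build_basic_task` and `build_request` are made up for this example, and `extra_headers` is a placeholder for whatever credentials your account requires, since authentication is not described on this page.

```python
import json
import urllib.request

API_URL = "https://api.crawlbyte.ai/api/tasks"


def build_basic_task(urls, multithread=False):
    """Return the three required fields for a universal task."""
    urls = list(urls)
    if not urls:
        raise ValueError("input must contain at least one URL")
    return {"type": "universal", "input": urls, "multithread": bool(multithread)}


def build_request(payload, extra_headers=None):
    """Wrap the payload in a POST request object (nothing is sent here).

    extra_headers is a hypothetical hook for credentials; the
    authentication scheme is an assumption left to the caller.
    """
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = build_basic_task(["https://example.com/page"])
request = build_request(payload)
```

The request is only constructed here. To send it, pass `request` to `urllib.request.urlopen` and decode the JSON body of the response.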

Advanced Configuration (Optional)

{
  "type": "universal",
  "input": [
    "https://example.com/page"
  ],
  "multithread": true,
  "jsRendering": true,
  "customSelector": "#main-content",
  "method": "GET",
  "headers": "{\"X-Test\":\"abc\"}",
  "cookie": "session=xyz",
  "proxy": "username:password@ip:port"
}

Optional Parameters

| Field | Type | Description |
| --- | --- | --- |
| jsRendering | boolean | Enable full-page JavaScript rendering. |
| customSelector | string | CSS selector (e.g., #main-content) to extract specific HTML. |
| method | string | HTTP method: GET, POST, PUT, or PATCH. |
| user_agent_preset | string | Preset user agent. Options: chrome, firefox, edge, opera, safari, ios-safari, android-chrome, custom. |
| user_agent_custom | string | Custom user-agent string. Used only if user_agent_preset is custom. |
| headers | string | JSON-formatted string of headers, e.g. "{\"X-Test\":\"abc\"}". |
| cookie | string | Cookie string in key=value; format, e.g. session=xyz. |
| proxy | string | Proxy in username:password@ip:port format. |
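
A minimal Python sketch of assembling the optional fields. `build_advanced_task` is a hypothetical helper, not part of any Crawlbyte SDK: it serializes a headers dict into the JSON-formatted string that the `headers` field expects, validates `method` and `user_agent_preset` against the values listed above, and leaves out optional string fields that were not set. Joining several cookies with `; ` is an assumption based on the usual Cookie header format; this page only shows a single key=value pair.

```python
import json

ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH"}
UA_PRESETS = {"chrome", "firefox", "edge", "opera", "safari",
              "ios-safari", "android-chrome", "custom"}


def build_advanced_task(urls, *, multithread=False, js_rendering=False,
                        custom_selector=None, method="GET", headers=None,
                        cookies=None, proxy=None, user_agent_preset=None,
                        user_agent_custom=None):
    """Hypothetical helper: build a universal task with optional fields."""
    method = method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")

    task = {
        "type": "universal",
        "input": list(urls),
        "multithread": multithread,
        "jsRendering": js_rendering,
        "method": method,
    }
    if custom_selector:
        task["customSelector"] = custom_selector
    if headers:
        # The headers field is a JSON-formatted string, not a nested object.
        task["headers"] = json.dumps(headers, separators=(",", ":"))
    if cookies:
        # Assumption: several cookies are joined like a Cookie header.
        task["cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    if proxy:
        task["proxy"] = proxy
    if user_agent_preset:
        if user_agent_preset not in UA_PRESETS:
            raise ValueError(f"unknown user agent preset: {user_agent_preset}")
        task["user_agent_preset"] = user_agent_preset
        if user_agent_preset == "custom":
            if not user_agent_custom:
                raise ValueError("user_agent_custom is required for the custom preset")
            task["user_agent_custom"] = user_agent_custom
    return task


task = build_advanced_task(
    ["https://example.com/page"],
    multithread=True,
    js_rendering=True,
    custom_selector="#main-content",
    headers={"X-Test": "abc"},
    cookies={"session": "xyz"},
    proxy="username:password@ip:port",
)
print(json.dumps(task, indent=2))
```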

Pricing

  • $0.005 per successful task. This is a pay-as-you-go pricing model: you are only charged when a Universal task successfully returns HTML or JSON content.

You can view your current credit balance and usage history in the Crawlbyte Dashboard.
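
For budgeting, the per-task price can be turned into a quick estimate. The sketch below assumes you already know, or can guess, how many tasks will succeed; `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

# USD per successful task, taken from the pricing stated above.
PRICE_PER_SUCCESSFUL_TASK = Decimal("0.005")


def estimate_cost(successful_tasks):
    """Estimated charge in USD for a number of successful tasks."""
    if successful_tasks < 0:
        raise ValueError("successful_tasks must be non-negative")
    return PRICE_PER_SUCCESSFUL_TASK * successful_tasks


cost = estimate_cost(10_000)
print(f"${cost:.2f}")  # $50.00
```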

Response

The response contains metadata about the task. For universal type, the most important fields are status and result.

{
  "id": "af3e12f2-8f45-43b0-8a7b-cabbbb94c1e9",
  "status": "completed",
  "result": "HTML or JSON RESULT_HERE"
}
  • If result is raw HTML or a JSON object, no further polling is needed — this is the final data.

Status Types

| Status | Meaning |
| --- | --- |
| queued | Task was accepted and added to the processing queue. |
| processing | Task is currently running. |
| completed | Task finished successfully; result contains the scraped HTML or JSON. |
| failed | Task encountered an error (e.g., bad proxy or invalid URL). |
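
One way to act on these statuses in client code is sketched below. `handle_task` is a hypothetical helper: it returns the result for a completed task, returns `None` while the task is still queued or processing, and raises for a failed task. Treating any other status as an error is an assumption of this sketch.

```python
PENDING = {"queued", "processing"}


def handle_task(task):
    """Return the result if finished, None if still pending; raise on failure."""
    status = task.get("status")
    if status == "completed":
        return task["result"]
    if status in PENDING:
        return None
    if status == "failed":
        raise RuntimeError(f"task {task.get('id')} failed")
    # Assumption: statuses not listed in the table are treated as errors.
    raise ValueError(f"unexpected status: {status!r}")


response = {
    "id": "af3e12f2-8f45-43b0-8a7b-cabbbb94c1e9",
    "status": "completed",
    "result": "<html>...</html>",
}
result = handle_task(response)
```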

Polling

If the initial status is queued or processing, you should poll for task completion.

GET https://api.crawlbyte.ai/api/tasks/:id
  • You’ll receive the same structure with an updated status.

  • Only poll until you receive completed or failed.

  • Rendered pages (with jsRendering: true) may take slightly longer.
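
The polling rules above can be sketched as a small loop. The 2-second interval and 120-second timeout are arbitrary assumptions, not documented limits, and `extra_headers` is a placeholder for whatever credentials your account requires. The fetch function is injectable so the loop can be exercised offline with canned responses, as the demonstration at the bottom does.

```python
import json
import time
import urllib.request

API_URL = "https://api.crawlbyte.ai/api/tasks"
TERMINAL = {"completed", "failed"}


def fetch_task_http(task_id, extra_headers=None):
    """GET /api/tasks/:id and decode the JSON body."""
    request = urllib.request.Request(
        f"{API_URL}/{task_id}", headers=extra_headers or {}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_task(task_id, fetch=fetch_task_http, interval=2.0, timeout=120.0,
              sleep=time.sleep):
    """Poll until the task is completed or failed, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        if task.get("status") in TERMINAL:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"task {task_id} still {task.get('status')} after {timeout}s"
            )
        sleep(interval)


# Offline demonstration: canned responses stand in for real HTTP calls.
canned = iter([
    {"id": "t1", "status": "queued"},
    {"id": "t1", "status": "processing"},
    {"id": "t1", "status": "completed", "result": "<html></html>"},
])
final = poll_task("t1", fetch=lambda _id: next(canned), interval=0,
                  sleep=lambda _seconds: None)
print(final["status"])  # completed
```

For tasks with jsRendering enabled, a longer timeout may be appropriate since rendered pages can take longer to finish.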

SDK Usage

You can run this task using any official Crawlbyte SDK.

Each SDK provides a simple way to:

  • Create the task

  • Poll for status

  • Handle the final result

Refer to the SDKs section for installation, examples, and setup instructions.

Notes

  • The universal scraper is ideal for public webpages that don’t require a login.

  • Crawlbyte handles fingerprinting, headers, JS execution, bot detection, and more automatically.

  • For complex or highly protected pages, the task may fail. If this happens, book a free setup call and we'll configure a custom scraper template for your target site at no cost.

  • jsRendering is optional but may be required for dynamic sites (e.g., React, Vue, Angular).

  • If a cookie string is provided, it’s used as-is — you are fully responsible for its usage in accordance with our Terms of Service.
