Yelp Scraper

Scrapes public data from Yelp business profiles or review listings. Crawlbyte handles fingerprinting, anti-bot protections, and pagination, so you can collect structured Yelp data without configuring any of these yourself.

Endpoint

POST https://api.crawlbyte.ai/api/tasks

Basic Configuration (Required)

Business Profile

{
  "type": "yelp",
  "input": [
    "https://www.yelp.com/biz/some-restaurant"
  ],
  "dataType": "profiles",
  "multithread": false
}

Reviews

{
  "type": "yelp",
  "input": [
    "https://www.yelp.com/biz/some-restaurant"
  ],
  "dataType": "reviews",
  "sortBy": "NEWEST_FIRST",
  "multithread": false
}
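The two request bodies above can be built and sent from Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_yelp_task` and `create_task` are hypothetical, and the `Authorization: Bearer` header is an assumption, since this page does not describe how requests are authenticated.

```python
import json
import urllib.request

API_URL = "https://api.crawlbyte.ai/api/tasks"


def build_yelp_task(urls, data_type="profiles", sort_by=None, multithread=False):
    """Build the JSON body for a Yelp task from the documented fields."""
    payload = {
        "type": "yelp",
        "input": list(urls),
        "dataType": data_type,
        "multithread": multithread,
    }
    # sortBy is only meaningful for review tasks, so it is added on request.
    if sort_by is not None:
        payload["sortBy"] = sort_by
    return payload


def create_task(payload, api_key):
    """POST the payload to the tasks endpoint and return the decoded JSON reply.

    ASSUMPTION: the Bearer-token header is a placeholder for whatever
    authentication scheme your Crawlbyte account actually uses.
    """
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `build_yelp_task(["https://www.yelp.com/biz/some-restaurant"], data_type="reviews", sort_by="NEWEST_FIRST")` reproduces the Reviews body shown above.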

Parameters

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always "yelp" |
| input | array | Array of Yelp business URLs |
| dataType | string | "profiles" or "reviews" |
| sortBy | string | Only for reviews. Controls review order: NEWEST_FIRST, OLDEST_FIRST, HIGHEST_RATED, LOWEST_RATED |
| multithread | boolean | Use true for faster processing with multiple threads |
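Checking a payload against these field rules before sending it avoids spending a request on an obviously malformed task. The sketch below is a hypothetical client-side check, not part of the API: the function name `validate_yelp_task` is invented here, and the rule that `input` must be non-empty is an assumption. The rule that `sortBy` is needed for review tasks follows the Notes section of this page.

```python
ALLOWED_DATA_TYPES = {"profiles", "reviews"}
ALLOWED_SORT_ORDERS = {"NEWEST_FIRST", "OLDEST_FIRST", "HIGHEST_RATED", "LOWEST_RATED"}


def validate_yelp_task(payload):
    """Return a list of problems found in a Yelp task payload (empty if none)."""
    problems = []
    if payload.get("type") != "yelp":
        problems.append('type must be "yelp"')

    urls = payload.get("input")
    # ASSUMPTION: an empty input array is treated as invalid.
    if not isinstance(urls, list) or not urls:
        problems.append("input must be a non-empty array of Yelp business URLs")

    data_type = payload.get("dataType")
    if data_type not in ALLOWED_DATA_TYPES:
        problems.append('dataType must be "profiles" or "reviews"')

    # sortBy is only checked for review tasks.
    if data_type == "reviews" and payload.get("sortBy") not in ALLOWED_SORT_ORDERS:
        problems.append("sortBy must be one of " + ", ".join(sorted(ALLOWED_SORT_ORDERS)))

    if not isinstance(payload.get("multithread", False), bool):
        problems.append("multithread must be a boolean")
    return problems
```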

Advanced Configuration (Optional)

{
  "type": "yelp",
  "input": [
    "https://www.yelp.com/biz/some-restaurant"
  ],
  "dataType": "reviews",
  "sortBy": "NEWEST_FIRST",
  "user_agent_preset": "chrome",
  "headers": "{\"X-Test\":\"abc\"}",
  "cookie": "",
  "proxy": "username:password@ip:port"
}

Optional Parameters

| Field | Type | Description |
| --- | --- | --- |
| user_agent_preset | string | Preset user-agent. Options: chrome, firefox, edge, opera, safari, ios-safari, android-chrome, custom |
| user_agent_custom | string | Used if user_agent_preset is custom. |
| headers | string | JSON-formatted string of headers. |
| cookie | string | Format: key=value; |
| proxy | string | Format: username:password@ip:port |
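Because `headers` is a JSON-formatted string rather than a nested object, it is easy to get the escaping wrong by hand. The following sketch shows one way to assemble the optional fields in Python. The helper name `build_advanced_options` is hypothetical, and joining several cookies as consecutive `key=value;` pairs is an assumption, since the page only documents the single-pair format.

```python
import json


def build_advanced_options(headers=None, cookies=None, proxy=None,
                           user_agent_preset=None, user_agent_custom=None):
    """Build the optional fields to merge into a Yelp task payload."""
    options = {}
    if user_agent_custom is not None:
        # user_agent_custom is only used when the preset is "custom".
        options["user_agent_preset"] = "custom"
        options["user_agent_custom"] = user_agent_custom
    elif user_agent_preset is not None:
        options["user_agent_preset"] = user_agent_preset

    if headers:
        # Serialize the dict so the field is a JSON string, not an object.
        options["headers"] = json.dumps(headers, separators=(",", ":"))

    if cookies:
        # ASSUMPTION: multiple cookies are sent as back-to-back key=value; pairs.
        options["cookie"] = "".join(f"{name}={value};" for name, value in cookies.items())

    if proxy:
        options["proxy"] = proxy  # expected form: username:password@ip:port
    return options
```

The result can be merged into a base payload with `payload.update(build_advanced_options(...))`.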

Pricing

  • $0.005 per successful task

This is a pay-as-you-go pricing model: you're only charged when a Yelp task successfully returns listings or reviews.

You can view your current credit balance and usage history in the Crawlbyte Dashboard.
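As a rough budgeting aid, the per-task price can be multiplied out with exact decimal arithmetic. This is only a sketch: the function name `estimate_cost` is hypothetical, and this page does not state whether a request containing several URLs counts as one task or as several, so count tasks the way your dashboard reports them.

```python
from decimal import Decimal

PRICE_PER_SUCCESSFUL_TASK = Decimal("0.005")  # USD, from the pricing above


def estimate_cost(successful_tasks):
    """Return the estimated charge in USD for a number of successful tasks."""
    # Failed tasks are not charged, so only successful ones are counted.
    return PRICE_PER_SUCCESSFUL_TASK * successful_tasks
```

For example, `estimate_cost(1000)` evaluates to 5 USD.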

Response

The response contains metadata about the task. For the yelp type, the key fields in the response are status and result.

{
  "id": "af3e12f2-8f45-43b0-8a7b-cabbbb94c1e9",
  "status": "completed",
  "result": "JSON_RESULT_HERE"
}

  • If result is a hash, poll GET /api/tasks/:id to retrieve the full data.

  • If result is a JSON object, the data is already available and no polling is needed.
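The distinction between the two result shapes can be made in code once the response is decoded. The sketch below assumes that a hash arrives as a JSON string and that inline data arrives as a JSON object or array. That mapping is an assumption, since the page does not show a sample of either form, and the function name `result_is_inline` is hypothetical.

```python
def result_is_inline(task):
    """Return True when the task response already carries the scraped data.

    ASSUMPTION: inline data decodes to a dict or list, while a hash
    decodes to a plain string that must be resolved by polling.
    """
    return isinstance(task.get("result"), (dict, list))
```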

Status Types

Status
Meaning

queued

Task was accepted and added to the processing queue.

processing

Task is currently running.

completed

Task finished successfully and data was collected from Yelp.

failed

Task encountered an error (e.g. bad proxy, invalid URL, etc.).

Polling

If the initial status is queued or processing, you should poll for task completion.

GET https://api.crawlbyte.ai/api/tasks/:id

  • You’ll receive the same structure with an updated status.

  • Poll only until you receive completed or failed.

  • Typical completion time is 2–4 seconds; review tasks take longer because every review page is scraped.
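The polling rules above can be sketched as a small loop that stops on a terminal status or gives up after a deadline. This is an illustrative sketch rather than the SDK's behavior: `fetch_task` and `wait_for_task` are hypothetical names, the Bearer-token header is an assumed authentication scheme, and the 2-second interval and 120-second timeout are arbitrary defaults.

```python
import json
import time
import urllib.request

TASK_URL = "https://api.crawlbyte.ai/api/tasks/{task_id}"
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_task(task_id, api_key):
    """GET the current state of one task and return the decoded JSON."""
    request = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, fetch, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Call fetch(task_id) until the status is completed or failed.

    Raises TimeoutError if no terminal status is seen before the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} seconds")
        sleep(interval)
```

A typical call would be `wait_for_task(task_id, lambda tid: fetch_task(tid, api_key))`. Passing `fetch` and `sleep` as arguments keeps the loop testable without network access.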

SDK Usage

You can run this task using any official Crawlbyte SDK.

Each SDK provides a simple way to:

  • Create the task

  • Poll for status

  • Handle the final result

Refer to the SDKs section for installation, examples, and setup instructions.

Notes

  • This task supports public Yelp business pages and review listings.

  • If dataType is set to "reviews", the sortBy field is required. Accepted values: NEWEST_FIRST, OLDEST_FIRST, HIGHEST_RATED, LOWEST_RATED.

  • When scraping reviews, Crawlbyte fetches all available reviews across all pages, which may take longer for listings with high volume.

  • Crawlbyte handles fingerprinting, bot detection, and pagination internally — no need to configure anything manually.

  • You can batch multiple business URLs using multithread: true.
