> For the complete documentation index, see [llms.txt](https://developers.crawlbyte.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://developers.crawlbyte.ai/har-scraper.md).

# HAR Scraper

## Endpoint

```
POST https://api.crawlbyte.ai/api/tasks
```

## Basic Configuration (Required)

```json
{
  "type": "har",
  "input": [
    "{\"city\":\"Dallas\",\"priceMin\":10000,\"priceMax\":50000}"
  ],
  "sortBy": "CHEAPEST_FIRST"
}
```

### Parameters

| Field       | Type    | Description                                                                                                                                                        |
| ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| type        | string  | Always `"har"`                                                                                                                                                     |
| input       | array   | Array of JSON-formatted strings with city and optional price filters                                                                                               |
| sortBy      | string  | <p>Controls result order:<br>• <code>NEWEST\_FIRST</code><br>• <code>OLDEST\_FIRST</code><br>• <code>CHEAPEST\_FIRST</code><br>• <code>EXPENSIVE\_FIRST</code></p> |
| multithread | boolean | Use `true` for faster processing with multiple threads                                                                                                             |

## Input Object Format

```
"{\"city\":\"Houston\",\"priceMin\":100000,\"priceMax\":200000}"
```

* `city` (required): Target city name
* `priceMin` (optional): Minimum price
* `priceMax` (optional): Maximum price

You may use only `priceMin` or `priceMax` if desired.

## Advanced Configuration (Optional)

```json
{
  "type": "har",
  "input": [
    "{\"city\":\"Dallas\",\"priceMin\":10000,\"priceMax\":50000}"
  ],
  "sortBy": "CHEAPEST_FIRST",
  "user_agent_preset": "chrome",
  "user_agent_custom": "",
  "headers": "{\"X-Test\":\"abc\"}",
  "cookie": "session=xyz",
  "proxy": "http://username:password@ip:port"
}
```

### Optional Parameters

| Field               | Type   | Description                                                                                                          |
| ------------------- | ------ | -------------------------------------------------------------------------------------------------------------------- |
| user\_agent\_preset | string | Preset user-agent. Options: `chrome`, `firefox`, `edge`, `opera`, `safari`, `ios-safari`, `android-chrome`, `custom` |
| user\_agent\_custom | string | Used if `user_agent_preset` is `custom.`                                                                             |
| headers             | string | JSON-formatted string of headers.                                                                                    |
| cookie              | string | `key=value;`                                                                                                         |
| proxy               | string | `http://username:password@ip:port`                                                                                   |

## Pricing

* **$0.005 per successful task**\
  This is a pay-as-you-go pricing model — you're only charged when a HAR task successfully returns listings.

You can view your current credit balance and usage history in the [Crawlbyte Dashboard](https://dash.crawlbyte.ai/).

## Response

The response contains metadata about the task. For `har` type, the most important fields are `status` and `result`.

```json
{
  "id": "bd3e89ed-815e-4395-98a3-521ede71cc4d",
  "status": "completed",
  "result": {
    // all listings here
  }
}
```

* `result` is a **JSON object,** that's the final scraped results — no further polling is needed.

### Status Types

| Status     | Meaning                                              |
| ---------- | ---------------------------------------------------- |
| queued     | Task was accepted and added to the processing queue. |
| processing | Task is currently running.                           |
| completed  | Task finished and listings were collected.           |
| failed     | Task encountered an error (invalid URL, etc.).       |

## Polling

If `status` is `queued` or `processing`, continue polling the task until it's completed or failed.

```
GET https://api.crawlbyte.ai/api/tasks/:id
```

* You’ll receive the same structure with an updated `status`.
* Only poll until you receive `completed` or `failed`.
* Recommended interval: **every 3–5 seconds**.

## SDK Usage

You can run this task using any official **Crawlbyte SDK**:

* [Go SDK](https://github.com/crawlbyte/crawlbyte-sdk-go)
* [TypeScript SDK](https://github.com/crawlbyte/crawlbyte-sdk-ts)
* [Python SDK](https://github.com/crawlbyte/crawlbyte-sdk-py)

Each SDK provides a simple way to:

* Create the task
* Poll for status
* Handle the final result

Refer to the [SDKs section](/sdks.md) for installation, examples, and setup instructions.

## Notes

* Input values must be valid — misspelled cities or malformed price fields will result in empty results or failure.
* The `input` must always be a JSON string, even if it contains only the city field.
* `sortBy` is required and must exactly match one of the supported options.
* Crawlbyte automatically handles pagination, anti-bot protections, and request sequencing.
* `multithread: true` can be used in advanced settings for faster bulk processing.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://developers.crawlbyte.ai/har-scraper.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
