# Amtrak Scraper

## Endpoint

```
POST https://api.crawlbyte.ai/api/tasks
```

## Basic Configuration (Required)

```json
{
  "type": "amtrak",
  "input": [
    "{\"origin\":\"NYP\",\"destination\":\"PHL\",\"departure\":\"2025-08-01\",\"return\":\"2025-08-08\",\"passengers\":{\"adult\":1,\"seniors\":1,\"youth\":1,\"child\":1,\"infant\":1}}"
  ],
  "multithread": false
}
```

### Parameters

| Field       | Type    | Description                                                                                                                                 |
| ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
| type        | string  | Always `"amtrak"`                                                                                                                           |
| input       | array   | Array of JSON strings containing route, date, and passenger details (e.g., origin, destination, departure, return, and passenger breakdown) |
| multithread | boolean | Use `true` for faster processing with multiple threads                                                                                      |

## Input Builder Notes

You must structure the input as a JSON string and insert it into the `input` array. Use the following fields:

| Parameter   | Meaning                     | Example                                                  |
| ----------- | --------------------------- | -------------------------------------------------------- |
| origin      | Origin Station Code         | NYP                                                      |
| destination | Destination Station Code    | PHL                                                      |
| departure   | Departure Date (YYYY-MM-DD) | 2025-08-01                                               |
| return      | Return Date *(optional)*    | 2025-08-08                                               |
| passengers  | Passenger Count by Type     | `{"adult":1,"seniors":1,"youth":1,"child":1,"infant":1}` |

* Each entry in the `input` array must be a JSON-encoded **string**, not a nested JSON object.
* `return` is optional – omit it for one-way trips.
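
Escaping the inner JSON by hand is error-prone, so one option is to build the trip as a dictionary and serialize it. The sketch below assumes Python and uses a hypothetical helper named `build_amtrak_input`. The field names come from the table above.

```python
import json


def build_amtrak_input(origin, destination, departure, passengers, return_date=None):
    """Return one JSON-encoded string suitable for the `input` array."""
    trip = {
        "origin": origin,            # origin station code, e.g. "NYP"
        "destination": destination,  # destination station code, e.g. "PHL"
        "departure": departure,      # YYYY-MM-DD
    }
    if return_date is not None:
        trip["return"] = return_date  # left out entirely for one-way trips
    trip["passengers"] = passengers
    # json.dumps produces the string form the `input` array expects.
    return json.dumps(trip, separators=(",", ":"))


passengers = {"adult": 1, "seniors": 1, "youth": 1, "child": 1, "infant": 1}
payload = {
    "type": "amtrak",
    "input": [
        build_amtrak_input("NYP", "PHL", "2025-08-01", passengers, return_date="2025-08-08")
    ],
    "multithread": False,
}
print(json.dumps(payload, indent=2))
```

Serializing the whole payload a second time, as the final `print` does, yields the escaped form shown in the basic configuration example.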

## Advanced Configuration (Optional)

```json
{
  "type": "amtrak",
  "input": [
    "{\"origin\":\"NYP\",\"destination\":\"PHL\",\"departure\":\"2025-08-01\",\"return\":\"2025-08-08\",\"passengers\":{\"adult\":1,\"seniors\":1,\"youth\":1,\"child\":1,\"infant\":1}}"
  ],
  "user_agent_preset": "chrome",
  "user_agent_custom": "",
  "headers": "{\"X-Test\":\"abc\"}",
  "cookie": "session=xyz",
  "proxy": "http://username:password@ip:port"
}
```

### Optional Parameters

| Field               | Type   | Description                                                                                                          |
| ------------------- | ------ | -------------------------------------------------------------------------------------------------------------------- |
| user\_agent\_preset | string | Preset user-agent. Options: `chrome`, `firefox`, `edge`, `opera`, `safari`, `ios-safari`, `android-chrome`, `custom` |
| user\_agent\_custom | string | Used if `user_agent_preset` is `custom`.                                                                             |
| headers             | string | JSON-formatted string of headers.                                                                                    |
| cookie              | string | Cookie string in `key=value;` format, e.g. `session=xyz`.                                                            |
| proxy               | string | Proxy URL in the form `http://username:password@ip:port`.                                                            |
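
Because `headers` is a JSON-formatted string rather than an object, it needs the same serialization step as `input`. The following Python sketch layers the optional fields onto a base payload. The helper name `with_advanced_options` is hypothetical, and rejecting unknown presets client-side is a design choice of this example, not documented API behavior.

```python
import json

# Preset names listed in the Optional Parameters table.
UA_PRESETS = {
    "chrome", "firefox", "edge", "opera", "safari",
    "ios-safari", "android-chrome", "custom",
}


def with_advanced_options(payload, *, user_agent_preset=None, user_agent_custom=None,
                          headers=None, cookie=None, proxy=None):
    """Return a copy of `payload` with the given optional fields set."""
    out = dict(payload)  # shallow copy; the caller's dict is left untouched
    if user_agent_custom is not None:
        # A custom user-agent only applies when the preset is "custom".
        out["user_agent_preset"] = "custom"
        out["user_agent_custom"] = user_agent_custom
    elif user_agent_preset is not None:
        if user_agent_preset not in UA_PRESETS:
            raise ValueError(f"unknown user_agent_preset: {user_agent_preset!r}")
        out["user_agent_preset"] = user_agent_preset
    if headers is not None:
        out["headers"] = json.dumps(headers)  # a string, not a nested object
    if cookie is not None:
        out["cookie"] = cookie
    if proxy is not None:
        out["proxy"] = proxy
    return out
```

For example, `with_advanced_options(base, headers={"X-Test": "abc"}, cookie="session=xyz")` sets `headers` to a JSON string and copies `cookie` through unchanged.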

## Pricing

* **$0.01 per successful task**\
  This is a pay-as-you-go pricing model — you're only charged when an Amtrak task successfully returns train availability and fare data.

You can view your current credit balance and usage history in the [Crawlbyte Dashboard](https://dash.crawlbyte.ai/).

## Response

The response contains metadata about the task. For the `amtrak` type, the most important fields are `status` and `result`.

```json
{
  "id": "bd3e89ed-815e-4395-98a3-521ede71cc4d",
  "status": "completed",
  "result": {
    // Parsed availability and pricing data
  }
}
```

* Once `status` is `completed`, `result` is a **JSON object** containing the final scraped train availability and fare data, and no further polling is needed.

### Status Types

| Status     | Meaning                                                                       |
| ---------- | ----------------------------------------------------------------------------- |
| queued     | Task was accepted and added to the processing queue.                          |
| processing | Task is currently running.                                                    |
| completed  | Task finished and train data was successfully collected.                      |
| failed     | Task encountered an error (e.g., invalid input, no results, or system issue). |
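
One way to act on these statuses in client code is a small dispatcher like the Python sketch below. The `TaskFailed` exception and the `result_or_none` helper are hypothetical names introduced for illustration. The task is assumed to be the decoded response dictionary shown above.

```python
class TaskFailed(Exception):
    """Raised when a task reports the `failed` status."""


def result_or_none(task):
    """Return `result` for a completed task, or None while it is pending.

    Raises TaskFailed for `failed` and ValueError for any status that is
    not listed in the table above.
    """
    status = task.get("status")
    if status == "completed":
        return task.get("result")
    if status == "failed":
        raise TaskFailed(f"task {task.get('id')} failed")
    if status in ("queued", "processing"):
        return None  # not finished yet; the caller should check again later
    raise ValueError(f"unexpected status: {status!r}")
```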

## Polling

If `status` is `queued` or `processing`, continue polling the task until it's completed or failed.

```
GET https://api.crawlbyte.ai/api/tasks/:id
```

* You’ll receive the same structure with an updated `status`.
* Stop polling once you receive `completed` or `failed`.
* Recommended interval: **every 2–4 seconds**.
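
The polling rules above can be sketched in Python as follows. The 3-second default sits inside the recommended 2–4 second range. The 120-second timeout, the helper names, and the `auth_headers` placeholder are assumptions of this example, since authentication is not covered on this page.

```python
import json
import time
import urllib.request

TASK_URL = "https://api.crawlbyte.ai/api/tasks/{id}"
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_task(task_id, auth_headers):
    """GET the task once and return the decoded JSON response."""
    request = urllib.request.Request(TASK_URL.format(id=task_id), headers=auth_headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_task(task_id, fetch, interval=3.0, timeout=120.0, sleep=time.sleep):
    """Call `fetch(task_id)` until the task is `completed` or `failed`.

    `fetch` is any callable returning the task dict; `timeout` is a
    client-side safety limit chosen for this sketch, not an API setting.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        sleep(interval)  # wait before the next poll
```

A typical call would be `poll_task(task["id"], lambda tid: fetch_task(tid, auth_headers))`. Passing `fetch` and `sleep` as parameters lets the loop be exercised without network access or real delays.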

## SDK Usage

You can run this task using any official **Crawlbyte SDK**:

* [Go SDK](https://github.com/crawlbyte/crawlbyte-sdk-go)
* [TypeScript SDK](https://github.com/crawlbyte/crawlbyte-sdk-ts)
* [Python SDK](https://github.com/crawlbyte/crawlbyte-sdk-py)

Each SDK provides a simple way to:

* Create the task
* Poll for status
* Handle the final result

Refer to the [SDKs section](https://developers.crawlbyte.ai/sdks) for installation, examples, and setup instructions.

## Notes

* Only inputs with valid station codes and correctly formatted dates (YYYY-MM-DD) will return results.
* Crawlbyte handles retries, rendering, fingerprinting, and anti-bot logic internally — no need to manage it yourself.
* Set `multithread` to `true` when running large volumes.
* Ensure all required fields like `origin`, `destination`, `departure`, and `passengers` are properly structured.
* The Amtrak response includes **all relevant train data**, including schedule and fare breakdowns.
* Average task duration is **\~8 seconds**, primarily due to Amtrak’s slower API response — this is expected and fully supported.
