# Amtrak Scraper

## Endpoint

```
POST https://api.crawlbyte.ai/api/tasks
```

## Basic Configuration (Required)

```json
{
  "type": "amtrak",
  "input": [
    "{\"origin\":\"NYP\",\"destination\":\"PHL\",\"departure\":\"2025-08-01\",\"return\":\"2025-08-08\",\"passengers\":{\"adult\":1,\"seniors\":1,\"youth\":1,\"child\":1,\"infant\":1}}"
  ],
  "multithread": false
}
```

### Parameters

| Field       | Type    | Description                                                                                                                                 |
| ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
| type        | string  | Always `"amtrak"`                                                                                                                           |
| input       | array   | Array of JSON strings containing route, date, and passenger details (e.g., origin, destination, departure, return, and passenger breakdown) |
| multithread | boolean | Use `true` for faster processing with multiple threads                                                                                      |

## Input Builder Notes

You must structure the input as a JSON string and insert it into the `input` array. Use the following fields:

| Parameter   | Meaning                     | Example                                                  |
| ----------- | --------------------------- | -------------------------------------------------------- |
| origin      | Origin Station Code         | NYP                                                      |
| destination | Destination Station Code    | PHL                                                      |
| departure   | Departure Date (YYYY-MM-DD) | 2025-08-01                                               |
| return      | Return Date *(optional)*    | 2025-08-08                                               |
| passengers  | Passenger Count by Type     | `{"adult":1,"seniors":1,"youth":1,"child":1,"infant":1}` |

* All values must be wrapped in a **JSON string** (not object) inside the array.
* `return` is optional – omit it for one-way trips.

## Advanced Configuration (Optional)

```json
{
  "type": "amtrak",
  "input": [
    "{\"origin\":\"NYP\",\"destination\":\"PHL\",\"departure\":\"2025-08-01\",\"return\":\"2025-08-08\",\"passengers\":{\"adult\":1,\"seniors\":1,\"youth\":1,\"child\":1,\"infant\":1}}"
  ],
  "user_agent_preset": "chrome",
  "user_agent_custom": "",
  "headers": "{\"X-Test\":\"abc\"}",
  "cookie": "session=xyz",
  "proxy": "http://username:password@ip:port"
}
```

### Optional Parameters

| Field               | Type   | Description                                                                                                          |
| ------------------- | ------ | -------------------------------------------------------------------------------------------------------------------- |
| user\_agent\_preset | string | Preset user-agent. Options: `chrome`, `firefox`, `edge`, `opera`, `safari`, `ios-safari`, `android-chrome`, `custom` |
| user\_agent\_custom | string | Used if `user_agent_preset` is `custom.`                                                                             |
| headers             | string | JSON-formatted string of headers.                                                                                    |
| cookie              | string | `key=value;`                                                                                                         |
| proxy               | string | `http://username:password@ip:port`                                                                                   |

## Pricing

* **$0.01 per successful task**\
  This is a pay-as-you-go pricing model — you're only charged when an Amtrak task successfully returns train availability and fare data.

You can view your current credit balance and usage history in the [Crawlbyte Dashboard](https://dash.crawlbyte.ai/).

## Response

The response contains metadata about the task. For `amtrak` type, the most important fields are `status` and `result`.

```json
{
  "id": "bd3e89ed-815e-4395-98a3-521ede71cc4d",
  "status": "completed",
  "result": {
    // Parsed availability and pricing data
  }
}
```

* `result` is a **JSON object** containing the final scraped train availability and fare data — no further polling is needed.

### Status Types

| Status     | Meaning                                                                       |
| ---------- | ----------------------------------------------------------------------------- |
| queued     | Task was accepted and added to the processing queue.                          |
| processing | Task is currently running.                                                    |
| completed  | Task finished and train data was successfully collected.                      |
| failed     | Task encountered an error (e.g., invalid input, no results, or system issue). |

## Polling

If `status` is `queued` or `processing`, continue polling the task until it's completed or failed.

```
GET https://api.crawlbyte.ai/api/tasks/:id
```

* You’ll receive the same structure with an updated `status`.
* Only poll until you receive `completed` or `failed`.
* Recommended interval: **every 2–4 seconds**.

## SDK Usage

You can run this task using any official **Crawlbyte SDK**:

* [Go SDK](https://github.com/crawlbyte/crawlbyte-sdk-go)
* [TypeScript SDK](https://github.com/crawlbyte/crawlbyte-sdk-ts)
* [Python SDK](https://github.com/crawlbyte/crawlbyte-sdk-py)

Each SDK provides a simple way to:

* Create the task
* Poll for status
* Handle the final result

Refer to the [SDKs section](/sdks.md) for installation, examples, and setup instructions.

## Notes

* Only valid input objects with correct station codes and date formats will return results.
* Crawlbyte handles retries, rendering, fingerprinting, and anti-bot logic internally — no need to manage it yourself.
* Use `multithread: true` in advanced settings if running large volumes.
* Ensure all required fields like `origin`, `destination`, `departure`, and `passengers` are properly structured.
* The Amtrak response includes **all relevant train data**, including schedule and fare breakdowns.
* Average task duration is **\~8 seconds**, primarily due to Amtrak’s slower API response — this is expected and fully supported.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://developers.crawlbyte.ai/amtrak-scraper.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
