Amtrak Scraper
Scrapes train availability and pricing from Amtrak’s booking system using structured input parameters like route and travel dates.
Endpoint
POST https://api.crawlbyte.ai/api/tasks
Basic Configuration (Required)
{
"type": "amtrak",
"input": [
"{\"origin\":\"NYP\",\"destination\":\"PHL\",\"departure\":\"2025-08-01\",\"return\":\"2025-08-08\",\"passengers\":{\"adult\":1,\"seniors\":1,\"youth\":1,\"child\":1,\"infant\":1}}"
],
"multithread": false
}
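The endpoint and body above can be combined into an HTTP request. The following is a minimal sketch using only the Python standard library. It builds the request without sending it. The helper name build_create_request, the Content-Type header, and the extra_headers argument are assumptions for illustration: this page does not describe authentication, so any credentials have to be supplied by the caller.

```python
import json
import urllib.request

ENDPOINT = "https://api.crawlbyte.ai/api/tasks"


def build_create_request(task, extra_headers=None):
    """Build (but do not send) the POST request that creates a task."""
    # Assumption: the API expects a JSON body with this content type.
    headers = {"Content-Type": "application/json"}
    # Assumption: credentials, if required, are passed in by the caller.
    headers.update(extra_headers or {})
    body = json.dumps(task).encode("utf-8")
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


trip = {
    "origin": "NYP",
    "destination": "PHL",
    "departure": "2025-08-01",
    "return": "2025-08-08",
    "passengers": {"adult": 1, "seniors": 1, "youth": 1, "child": 1, "infant": 1},
}
# The trip is serialized to a string first, then placed in the input array.
task = {"type": "amtrak", "input": [json.dumps(trip)], "multithread": False}

request = build_create_request(task)
print(request.get_method(), request.full_url)
```

Sending the request would then be a call to urllib.request.urlopen(request), followed by decoding the JSON response.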
Parameters
type
string
Always "amtrak"
input
array
Array of JSON strings containing route, date, and passenger details (e.g., origin, destination, departure, return, and passenger breakdown)
multithread
boolean
Use true for faster processing with multiple threads
Input Builder Notes
You must structure the input as a JSON string and insert it into the input array. Use the following fields:
origin
Origin Station Code
NYP
destination
Destination Station Code
PHL
departure
Departure Date (YYYY-MM-DD)
2025-08-01
return
Return Date (optional)
2025-08-08
passengers
Passenger Count by Type
{"adult":1,"seniors":1,"youth":1,"child":1,"infant":1}
Each entry must be a JSON-encoded string (not a JSON object) inside the input array.
return is optional; omit it for one-way trips.
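The field list above can be turned into an input entry programmatically, which avoids escaping quotes by hand. This is a sketch only: the helper names build_amtrak_input and build_task are hypothetical and not part of any Crawlbyte SDK.

```python
import json


def build_amtrak_input(origin, destination, departure, passengers, return_date=None):
    """Return one input entry as a JSON-encoded string (not a dict)."""
    trip = {"origin": origin, "destination": destination, "departure": departure}
    if return_date is not None:
        # Leave "return" out entirely for one-way trips.
        trip["return"] = return_date
    trip["passengers"] = passengers
    return json.dumps(trip, separators=(",", ":"))


def build_task(entries, multithread=False):
    """Wrap already-encoded input entries in the task body."""
    return {"type": "amtrak", "input": list(entries), "multithread": multithread}


passengers = {"adult": 1, "seniors": 1, "youth": 1, "child": 1, "infant": 1}
round_trip = build_amtrak_input("NYP", "PHL", "2025-08-01", passengers, "2025-08-08")
one_way = build_amtrak_input("NYP", "PHL", "2025-08-01", passengers)

task = build_task([round_trip, one_way])
print(json.dumps(task, indent=2))
```

Serializing the whole task with json.dumps produces the backslash-escaped quotes seen in the configuration examples, because each entry is a string nested inside JSON.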
Advanced Configuration (Optional)
{
"type": "amtrak",
"input": [
"{\"origin\":\"NYP\",\"destination\":\"PHL\",\"departure\":\"2025-08-01\",\"return\":\"2025-08-08\",\"passengers\":{\"adult\":1,\"seniors\":1,\"youth\":1,\"child\":1,\"infant\":1}}"
],
"user_agent_preset": "chrome",
"user_agent_custom": "",
"headers": "{\"X-Test\":\"abc\"}",
"cookie": "session=xyz",
"proxy": "username:password@ip:port"
}
Optional Parameters
user_agent_preset
string
Preset user-agent. Options: chrome, firefox, edge, opera, safari, ios-safari, android-chrome, custom
user_agent_custom
string
Used if user_agent_preset is custom.
headers
string
JSON-formatted string of headers.
cookie
string
Cookie string in key=value; format
proxy
string
username:password@ip:port
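Note that headers is a JSON-formatted string rather than a nested object, so it needs its own serialization step. The sketch below shows one way to attach the optional fields to a task body. The helper add_options is hypothetical, and its client-side checks (rejecting unknown presets, requiring the custom preset for a custom user-agent) are a design choice of this example, not documented API behavior.

```python
import json

PRESETS = {"chrome", "firefox", "edge", "opera", "safari",
           "ios-safari", "android-chrome", "custom"}


def add_options(task, user_agent_preset=None, user_agent_custom=None,
                headers=None, cookie=None, proxy=None):
    """Return a copy of `task` with whichever optional fields were supplied."""
    out = dict(task)
    if user_agent_preset is not None:
        if user_agent_preset not in PRESETS:
            raise ValueError(f"unknown user_agent_preset: {user_agent_preset!r}")
        out["user_agent_preset"] = user_agent_preset
    if user_agent_custom is not None:
        # Assumption: stricter than the API, which may simply ignore the field.
        if user_agent_preset != "custom":
            raise ValueError("user_agent_custom requires user_agent_preset='custom'")
        out["user_agent_custom"] = user_agent_custom
    if headers is not None:
        out["headers"] = json.dumps(headers)  # a JSON string, not an object
    if cookie is not None:
        out["cookie"] = cookie  # e.g. "session=xyz"
    if proxy is not None:
        out["proxy"] = proxy  # username:password@ip:port
    return out


base = {"type": "amtrak", "input": ['{"origin":"NYP","destination":"PHL"}']}
advanced = add_options(base, user_agent_preset="chrome",
                       headers={"X-Test": "abc"}, cookie="session=xyz")
print(json.dumps(advanced, indent=2))
```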
Pricing
$0.01 per successful task. This is a pay-as-you-go pricing model: you're only charged when an Amtrak task successfully returns train availability and fare data.
You can view your current credit balance and usage history in the Crawlbyte Dashboard.
Response
The response contains metadata about the task. For the amtrak type, the most important fields are status and result.
{
"id": "bd3e89ed-815e-4395-98a3-521ede71cc4d",
"status": "completed",
"result": {
// Parsed availability and pricing data
}
}
result is a JSON object containing the final scraped train availability and fare data; no further polling is needed.
Status Types
queued
Task was accepted and added to the processing queue.
processing
Task is currently running.
completed
Task finished and train data was successfully collected.
failed
Task encountered an error (e.g., invalid input, no results, or system issue).
Polling
If status is queued or processing, continue polling the task until it's completed or failed.
GET https://api.crawlbyte.ai/api/tasks/:id
You’ll receive the same structure with an updated status.
Only poll until you receive completed or failed.
Recommended interval: every 2–4 seconds.
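The polling rules above can be sketched as a small loop. The names http_get_task and poll_task are hypothetical, and the max_wait cutoff is an assumption of this example (the page does not specify a timeout). Authentication is not described here either, so any headers must come from the caller. The demonstration at the end uses a stand-in for the HTTP call so it runs offline.

```python
import json
import time
import urllib.request

TASK_URL = "https://api.crawlbyte.ai/api/tasks/{id}"
TERMINAL_STATUSES = {"completed", "failed"}


def http_get_task(task_id, headers=None):
    """GET one task. `headers` is caller-supplied (auth is not documented here)."""
    req = urllib.request.Request(TASK_URL.format(id=task_id), headers=headers or {})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_task(task_id, get_task=http_get_task, interval=3.0, max_wait=120.0,
              sleep=time.sleep, clock=time.monotonic):
    """Poll until status is completed or failed; raise TimeoutError after max_wait."""
    deadline = clock() + max_wait
    while True:
        task = get_task(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {max_wait}s")
        sleep(interval)  # 3 s sits inside the recommended 2-4 s range


# Offline demonstration: canned replies instead of real HTTP calls.
_replies = iter([{"status": "queued"}, {"status": "processing"},
                 {"status": "completed", "result": {}}])
waits = []
final = poll_task("example-id", get_task=lambda _id: next(_replies), sleep=waits.append)
print(final["status"], waits)
```

A failed task is returned rather than raised, so the caller can inspect the full response before deciding how to handle the error.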
SDK Usage
You can run this task using any official Crawlbyte SDK.
Each SDK provides a simple way to:
Create the task
Poll for status
Handle the final result
Refer to the SDKs section for installation, examples, and setup instructions.
Notes
Only valid input objects with correct station codes and date formats will return results.
Crawlbyte handles retries, rendering, fingerprinting, and anti-bot logic internally — no need to manage it yourself.
Use multithread: true in advanced settings if running large volumes.
Ensure all required fields like origin, destination, departure, and passengers are properly structured.
The Amtrak response includes all relevant train data, including schedule and fare breakdowns.
Average task duration is ~8 seconds, primarily due to Amtrak’s slower API response — this is expected and fully supported.
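Since only well-formed input returns results, a local pre-flight check can catch mistakes before a task is submitted. The sketch below is one possible approach, not Crawlbyte's own validation. The helper check_input is hypothetical, and the rule that station codes are three uppercase letters is an assumption based on the NYP and PHL examples. It cannot tell whether a code refers to a real station.

```python
import json
import re
from datetime import date

REQUIRED_FIELDS = ("origin", "destination", "departure", "passengers")
STATION_CODE = re.compile(r"[A-Z]{3}")  # assumption: three uppercase letters
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def _is_iso_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not (isinstance(value, str) and ISO_DATE.fullmatch(value)):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def check_input(raw):
    """Return a list of problems found in one input entry (empty if none)."""
    try:
        trip = json.loads(raw)
    except (TypeError, ValueError):
        return ["entry is not a JSON-encoded string"]
    if not isinstance(trip, dict):
        return ["entry does not decode to a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in trip]
    for name in ("origin", "destination"):
        value = trip.get(name)
        if name in trip and not (isinstance(value, str) and STATION_CODE.fullmatch(value)):
            problems.append(f"{name} does not look like a station code")
    for name in ("departure", "return"):  # "return" is checked only if present
        if name in trip and not _is_iso_date(trip[name]):
            problems.append(f"{name} is not a YYYY-MM-DD date")
    if "passengers" in trip and not isinstance(trip["passengers"], dict):
        problems.append("passengers must be an object of counts by type")
    return problems


good = json.dumps({"origin": "NYP", "destination": "PHL", "departure": "2025-08-01",
                   "passengers": {"adult": 1}})
bad = json.dumps({"origin": "nyp", "destination": "PHL", "departure": "2025/08/01"})
print(check_input(good))
print(check_input(bad))
```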