# Olostep API — Orthogonal API

> Pay-per-use API on Orthogonal. Each call is billed to your Orthogonal balance.
> Base API: `https://api.orth.sh/v1/run` · [llms.txt](https://orthogonal.com/llms.txt) · [browse all APIs](https://orthogonal.com/discover)

Olostep lets AI systems search the web, extract structured data in real time, and build custom research agents.

**Verified:** yes

## Access

**Run API:** `POST https://api.orth.sh/v1/run`
**Auth:** `Authorization: Bearer $ORTHOGONAL_API_KEY`
Get an API key at https://orthogonal.com/dashboard/settings/api-keys

Every call goes through the unified Run API: send the API slug as `api`, the endpoint `path`, the HTTP `method` (the examples below set it for `GET` endpoints), and the `query`/`body` parameters. The response has the shape `{ "success": true, "price": "<usd>", "data": { ... } }`.
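
The request pattern above can be wrapped in two small shell functions. This is a minimal sketch rather than official tooling: the names `orth_get_payload` and `orth_run` are hypothetical, and the payload builder assumes its arguments contain no characters that need JSON escaping.

```bash
# Build a Run API payload for a GET endpoint (hypothetical helper name).
# $1 = API slug, $2 = endpoint path with path parameters already substituted.
orth_get_payload() {
  printf '{"api":"%s","path":"%s","method":"GET"}' "$1" "$2"
}

# Send a JSON payload ($1) to the Run API (hypothetical helper name).
# The Authorization header is double-quoted so the shell expands the key.
orth_run() {
  curl -sS -X POST 'https://api.orth.sh/v1/run' \
    -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
    -H 'Content-Type: application/json' \
    -d "$1"
}

# Example (needs a valid key and network access):
# batch_id='your-batch-id'
# orth_run "$(orth_get_payload olostep "/v1/batches/${batch_id}/items")"
```

Keeping payload construction separate from the network call makes it possible to inspect the JSON before anything is billed.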

## Endpoints

### Batch Items

Retrieves the list of items processed for a batch. Use each item's `retrieve_id` to fetch its content through the Retrieve Content endpoint (`GET /v1/retrieve`).

`GET /v1/batches/{batch_id}/items`

**Cost:** Free

**Docs:** https://docs.olostep.com/api-reference/batches/items

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `batch_id` | string | Yes | Path parameter — substitute directly into the endpoint `path`. |

```bash
# Replace {batch_id} in "path" with real values before sending
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/batches/{batch_id}/items","method":"GET"}'
```

### Crawl Info

Fetches information about a specific crawl.

`GET /v1/crawls/{crawl_id}`

**Cost:** Free

**Docs:** https://docs.olostep.com/api-reference/crawls/info

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `crawl_id` | string | Yes | Path parameter — substitute directly into the endpoint `path`. |

```bash
# Replace {crawl_id} in "path" with real values before sending
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/crawls/{crawl_id}","method":"GET"}'
```

### Crawl Pages

Fetches the list of pages for a specific crawl.

`GET /v1/crawls/{crawl_id}/pages`

**Cost:** Free

**Docs:** https://docs.olostep.com/api-reference/crawls/pages

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `crawl_id` | string | Yes | Path parameter — substitute directly into the endpoint `path`. |

```bash
# Replace {crawl_id} in "path" with real values before sending
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/crawls/{crawl_id}/pages","method":"GET"}'
```

### Get Answer

This endpoint retrieves a previously completed answer by its ID.

`GET /v1/answers/{answer_id}`

**Cost:** Free

**Docs:** https://docs.olostep.com/api-reference/answers/get

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `answer_id` | string | Yes | Path parameter — substitute directly into the endpoint `path`. |

```bash
# Replace {answer_id} in "path" with real values before sending
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/answers/{answer_id}","method":"GET"}'
```

### Get Scrape

Retrieves the response for a previously created scrape.

`GET /v1/scrapes/{scrape_id}`

**Cost:** Free

**Docs:** https://docs.olostep.com/api-reference/scrapes/get

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `scrape_id` | string | Yes | Path parameter — substitute directly into the endpoint `path`. |

```bash
# Replace {scrape_id} in "path" with real values before sending
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/scrapes/{scrape_id}","method":"GET"}'
```

### Create Scrape

Initiates a scrape of a web page.

`POST /v1/scrapes`

**Estimated cost:** $0.01

**Docs:** https://docs.olostep.com/api-reference/scrapes/create

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `url_to_scrape` | string | Yes | The URL to start scraping from. |
| `wait_before_scraping` | integer | No | Time to wait in milliseconds before starting the scraping. |
| `formats` | string[] | No | Formats in which you want the content. |
| `remove_css_selectors` | string | No | Removes elements matching certain CSS selectors from the content. Available options: `default`, `none`, `array` (a JSON-stringified array of the specific selectors to remove). With `default`, the removed selectors are `['nav','footer','script','style','noscript','svg',[role=alert],[role=banner],[role=dialog],[role=alertdialog],[role=region][aria-label*=skip i],[aria-modal=true]]`. |
| `actions` | object[] | No | Actions to perform on the page before getting the content. |
| `country` | string | No | Residential country to load the request from. Supported values: `US` (United States), `CA` (Canada), `IT` (Italy), `IN` (India), `GB` (England), `JP` (Japan), `MX` (Mexico), `AU` (Australia), `ID` (Indonesia), `UA` (UAE), `RU` (Russia), and `RANDOM`. Some operations, such as scraping Google Search and Google News, support all countries. |
| `transformer` | string | No | Specify the HTML transformer to use, if any. Postlight's Mercury Parser library is used to remove ads and other unwanted content from the scraped content. Available options: `postlight`, `none` |
| `remove_images` | boolean | No | Option to remove images from the scraped content. Defaults to false. |
| `remove_class_names` | string[] | No | List of class names to remove from the content. |
| `parser` | object | No | When defining json as a format, you can use this parameter to specify the parser to use. Parsers are useful to extract structured content from web pages. Olostep has a few parsers built in for most common web pages, and you can also create your own parsers. |
| `llm_extract` | object | No |  |
| `links_on_page` | object | No | With this option, you can get all the links present on the page you scrape. |
| `screen_size` | object | No | Configuration for screen size. Preset dimensions are available through screen_type: desktop (1920x1080), mobile (414x896), or default (768x1024). |
| `metadata` | object | No | User-defined metadata. Not supported yet. |

```bash
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/scrapes","body":{"url_to_scrape":"<string>","wait_before_scraping":"<integer>","formats":["<string>"],"remove_css_selectors":"<string>","actions":["<object>"],"country":"<string>","transformer":"<string>","remove_images":"<boolean>","remove_class_names":["<string>"],"parser":"<object>","llm_extract":"<object>","links_on_page":"<object>","screen_size":"<object>","metadata":"<object>"}}'
```
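
The template above lists every field as a quoted placeholder. In a real request, integers and booleans are written as bare JSON values. The sketch below builds a smaller payload with the required `url_to_scrape` plus two optional fields from the table. The function name `scrape_payload` is hypothetical, and the URL is assumed to contain no characters that need JSON escaping.

```bash
# Build a Create Scrape payload (hypothetical helper name).
# $1 = URL to scrape, $2 = wait time in milliseconds (integer).
scrape_payload() {
  printf '{"api":"olostep","path":"/v1/scrapes","body":{"url_to_scrape":"%s","wait_before_scraping":%d,"remove_images":true}}' "$1" "$2"
}

# Example (needs a valid key and network access):
# curl -sS -X POST 'https://api.orth.sh/v1/run' \
#   -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
#   -H 'Content-Type: application/json' \
#   -d "$(scrape_payload 'https://example.com' 2000)"
```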

### Create Answer

The AI searches and browses web pages to find the answer to the provided task. Execution takes 3-30 seconds, depending on complexity. For longer tasks, use the agent endpoint instead.

`POST /v1/answers`

**Estimated cost:** $0.05

**Docs:** https://docs.olostep.com/api-reference/answers/create

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `task` | string | Yes | The task to be performed. |
| `json_format` | object | No | The desired output as a JSON object whose empty values act as a schema, or a plain string describing the data you want. |

```bash
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/answers","body":{"task":"<string>","json_format":"<object>"}}'
```
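
To illustrate the `json_format` description, the sketch below passes an object with empty values as the output schema. The function name `answer_payload` and the field names `company` and `founded_year` are made up for this example, and the task text is assumed to contain no characters that need JSON escaping.

```bash
# Build a Create Answer payload with an empty-valued JSON schema
# (hypothetical helper name; "company" and "founded_year" are example fields).
# $1 = task text.
answer_payload() {
  printf '{"api":"olostep","path":"/v1/answers","body":{"task":"%s","json_format":{"company":"","founded_year":""}}}' "$1"
}

# Example (needs a valid key and network access):
# curl -sS -X POST 'https://api.orth.sh/v1/run' \
#   -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
#   -H 'Content-Type: application/json' \
#   -d "$(answer_payload 'Find the year Olostep was founded')"
```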

### Batch Info

Retrieves the status and progress information about a batch. To retrieve the content for a batch, list its items with `GET /v1/batches/{batch_id}/items` and pass each `retrieve_id` to `GET /v1/retrieve`.

`GET /v1/batches/{batch_id}`

**Cost:** Free

**Docs:** https://docs.olostep.com/api-reference/batches/info

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `batch_id` | string | Yes | Path parameter — substitute directly into the endpoint `path`. |

```bash
# Replace {batch_id} in "path" with real values before sending
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/batches/{batch_id}","method":"GET"}'
```

### Retrieve Content

Retrieves the page content of URLs processed by batches and crawls.

`GET /v1/retrieve`

**Cost:** Free

**Docs:** https://docs.olostep.com/api-reference/retrieve

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `retrieve_id` | string | Yes | The ID of the page content to retrieve. Available in the response of `/v1/crawls/{crawl_id}/pages`, `/v1/scrapes/{scrape_id}` or `/v1/batches/{batch_id}/items` endpoints |
| `formats` | string[] | No | Optional array to retrieve only specific formats. If not provided, all formats are returned. |

```bash
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/retrieve","method":"GET","query":{"retrieve_id":"<string>","formats":["<string>"]}}'
```

### Start Crawl

Starts a new crawl. You receive an `id` to track its progress. The operation may take 1-10 minutes, depending on the site and on the `max_depth` and `max_pages` parameters.

`POST /v1/crawls`

**Estimated cost:** Dynamic — use `"dryRun": true` in the Run API request to check the exact cost before calling.

**Docs:** https://docs.olostep.com/api-reference/crawls/create

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `start_url` | string | Yes | The starting point of the crawl. |
| `max_pages` | number | Yes | Maximum number of pages to crawl. Recommended for most use cases like crawling an entire website. |
| `include_urls` | string[] | No | URL path patterns to include in the crawl using glob syntax. Defaults to `/**` which includes all URLs. Use patterns like `/blog/**` to crawl specific sections (e.g., only blog pages), `/products/*.html` for product pages, or multiple patterns for different sections. Supports standard glob features like * (any characters) and ** (recursive matching). |
| `exclude_urls` | string[] | No | URL path patterns to exclude using glob syntax. For example: `/careers/**`. Excluded URLs will supersede included URLs. |
| `max_depth` | number | No | Maximum depth of the crawl. Useful to extract only up to n-degree of links. |
| `include_external` | boolean | No | Crawl first-degree external links. |
| `include_subdomain` | boolean | No | Include subdomains of the website. `false` by default. |
| `search_query` | string | No | An optional search query to find specific links and also sort the results by relevance. |
| `top_n` | number | No | An optional number to only crawl the top N most relevant links on every page as per search query. |
| `webhook_url` | string | No | An optional endpoint that receives a POST request when this crawl is completed. The request body is the same as the response of the `GET /v1/crawls/{crawl_id}` endpoint. |
| `timeout` | number | No | Ends the crawl after n seconds, keeping the pages completed up to that point. The crawl may run about 10 seconds past the provided timeout. |

```bash
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/crawls","body":{"start_url":"<string>","max_pages":"<number>","include_urls":["<string>"],"exclude_urls":["<string>"],"max_depth":"<number>","include_external":"<boolean>","include_subdomain":"<boolean>","search_query":"<string>","top_n":"<number>","webhook_url":"<string>","timeout":"<number>"}}'
```
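
Because the crawl price is dynamic, it can help to check the cost first with `"dryRun": true`. The sketch below assumes `dryRun` is a top-level field of the Run API request, next to `api` and `path`; the source only says it goes "in the Run API request", so confirm the placement before relying on it. The function name `crawl_payload` is hypothetical.

```bash
# Build a Start Crawl payload, optionally as a dry run (hypothetical helper name).
# $1 = start URL, $2 = max pages (integer), $3 = "true" for a dry run (default "false").
# Assumption: "dryRun" sits at the top level of the Run API request.
crawl_payload() {
  printf '{"api":"olostep","path":"/v1/crawls","dryRun":%s,"body":{"start_url":"%s","max_pages":%d}}' "${3:-false}" "$1" "$2"
}

# Example (needs a valid key and network access): price check first, then the real call.
# curl -sS -X POST 'https://api.orth.sh/v1/run' \
#   -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
#   -H 'Content-Type: application/json' \
#   -d "$(crawl_payload 'https://example.com' 25 true)"
```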

### Maps

Returns all the URLs found on a website. Complex websites can take up to 120 seconds. For large websites, results are paginated using cursor-based pagination.

`POST /v1/maps`

**Estimated cost:** $0.01

**Docs:** https://docs.olostep.com/api-reference/maps/create

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `url` | string | Yes | The URL of the website for which you want the links |
| `search_query` | string | No | An optional search query to sort the links by search relevance. |
| `top_n` | number | No | An optional number to limit to only top n links for a search query. |
| `include_subdomain` | boolean | No | Include subdomains of the given URL. `true` by default. |
| `include_urls` | string[] | No | URL path patterns to include using glob syntax. For example: `/blog/**` to only include blog URLs. Only URLs matching these patterns will be returned. |
| `exclude_urls` | string[] | No | URL path patterns to exclude using glob syntax. For example: `/careers/**`. Excluded URLs will supersede included URLs. |
| `cursor` | string | No | Pagination cursor from a previous response. When provided, the request returns the next set of URLs, continuing from where the previous response stopped because of the response size limit. |

```bash
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/maps","body":{"url":"<string>","search_query":"<string>","top_n":"<number>","include_subdomain":"<boolean>","include_urls":["<string>"],"exclude_urls":["<string>"],"cursor":"<string>"}}'
```
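
Cursor-based pagination for large sites could be handled with a loop like the one sketched below. Two things here are assumptions rather than documented behavior: that the next cursor comes back at `data.cursor` in the Run API response, and that `jq` is installed for parsing. The function names `maps_payload` and `maps_all` are hypothetical.

```bash
# Build a Maps payload, adding "cursor" only when one is given
# (hypothetical helper name). $1 = website URL, $2 = optional cursor.
maps_payload() {
  if [ -n "${2:-}" ]; then
    printf '{"api":"olostep","path":"/v1/maps","body":{"url":"%s","cursor":"%s"}}' "$1" "$2"
  else
    printf '{"api":"olostep","path":"/v1/maps","body":{"url":"%s"}}' "$1"
  fi
}

# Fetch every page of results, printing one raw response per line.
# Assumption: the next cursor is returned at .data.cursor; requires jq.
maps_all() {
  local url="$1" cursor="" response
  while :; do
    response=$(curl -sS -X POST 'https://api.orth.sh/v1/run' \
      -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
      -H 'Content-Type: application/json' \
      -d "$(maps_payload "$url" "$cursor")") || return 1
    printf '%s\n' "$response"
    # An absent or null cursor yields an empty string, which ends the loop.
    cursor=$(printf '%s' "$response" | jq -r '.data.cursor // empty')
    [ -n "$cursor" ] || break
  done
}

# Example (needs a valid key, network access, and jq):
# maps_all 'https://example.com'
```

Each iteration is a separate billed call, so a page limit may be worth adding for very large sites.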

### Start Batch

Starts a new batch. You receive an `id` that you can use to track the batch's progress with the [Batch Info endpoint](https://docs.olostep.com/api-reference/batches/info). Note: Processing time is constant regardless of batch size.

`POST /v1/batches`

**Estimated cost:** Dynamic — use `"dryRun": true` in the Run API request to check the exact cost before calling.

**Docs:** https://docs.olostep.com/api-reference/batches/create

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `items` | object[] | Yes | Array of items to be processed in the batch. |
| `country` | string | No | Country for the batch execution, given as an ISO 3166-1 alpha-2 code such as `US` (USA) or `IN` (India). |
| `parser` | object | No | You can use this parameter to specify the parser to use. Parsers are useful to extract structured content from web pages. Olostep has a few parsers built in for most common web pages, and you can also create your own parsers. |
| `links_on_page` | object | No | Get all the links present on each page in the batch. |

```bash
curl -X POST 'https://api.orth.sh/v1/run' \
  -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"api":"olostep","path":"/v1/batches","body":{"items":["<object>"],"country":"<string>","parser":"<object>","links_on_page":"<object>"}}'
```
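
The table above describes `items` only as an array of objects. The sketch below assumes each item carries a `custom_id` and a `url`; that shape is not stated in this document, so check it against the linked Olostep docs before use. The function name `batch_payload` and the `item-N` identifiers are hypothetical, and the URLs are assumed to contain no characters that need JSON escaping.

```bash
# Build a Start Batch payload from a list of URLs (hypothetical helper name).
# Assumption: each item is an object with "custom_id" and "url" fields.
batch_payload() {
  local items="" i=0 url
  for url in "$@"; do
    i=$((i + 1))
    # ${items:+,} inserts a comma only when items is already non-empty.
    items="${items}${items:+,}$(printf '{"custom_id":"item-%d","url":"%s"}' "$i" "$url")"
  done
  printf '{"api":"olostep","path":"/v1/batches","body":{"items":[%s]}}' "$items"
}

# Example (needs a valid key and network access):
# curl -sS -X POST 'https://api.orth.sh/v1/run' \
#   -H "Authorization: Bearer $ORTHOGONAL_API_KEY" \
#   -H 'Content-Type: application/json' \
#   -d "$(batch_payload 'https://example.com/a' 'https://example.com/b')"
```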

---

Full details and an interactive quickstart: https://orthogonal.com/discover/olostep
