Olostep Scrape endpoint
One HTTP call — no headless browser fleet to run yourself.
Step 1
POST the page you want plus formats: markdown, html, text, json, screenshot, and more.
Step 2
Handles dynamic sites, actions, PDFs, and optional parsers or LLM extraction for structured JSON.
Step 3
Receive markdown_content, html_content, json_content, hosted URLs, links_on_page, and metadata.
POST /v1/scrapes
{
  "url_to_scrape": "https://en.wikipedia.org/wiki/Alexander_the_Great",
  "formats": ["markdown", "html"]
}

The /v1/scrapes API is designed for apps that need reliable page extraction at scale.
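The request above can be sent with a few lines of Python. A minimal sketch using only the standard library; the base URL and Bearer-token header are assumptions for illustration, so check the Olostep docs for the exact authentication scheme:

```python
import json
import urllib.request


def build_scrape_request(url: str, formats: list[str], api_key: str):
    """Build a POST request for the /v1/scrapes endpoint.

    The base URL and Authorization header below are assumptions.
    """
    body = json.dumps({"url_to_scrape": url, "formats": formats}).encode()
    return urllib.request.Request(
        "https://api.olostep.com/v1/scrapes",  # assumed base URL
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


# Sending it (requires a real API key):
# with urllib.request.urlopen(build_scrape_request(
#         "https://en.wikipedia.org/wiki/Alexander_the_Great",
#         ["markdown", "html"], "YOUR_API_KEY")) as resp:
#     scrape = json.load(resp)
```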
Request markdown, html, text, json, raw_pdf, or screenshot in one call.
Use pre-built parsers for popular sites or define a schema for LLM-powered extraction.
Get hosted URLs for large payloads alongside inline content fields.
One POST with url_to_scrape and formats — integrate in minutes.
Power enrichment, monitoring, and AI workflows from live web pages.
Chunk clean markdown or JSON into embeddings for search and Q&A.
Scrape product pages on a schedule and diff structured fields.
Pull structured contact or company data from landing pages.
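The embeddings use case above starts by splitting the returned markdown_content into chunks. A minimal sketch of fixed-size chunking with overlap; a real pipeline would typically split on headings and pass each chunk to an embedding model:

```python
def chunk_markdown(markdown: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split scraped markdown into overlapping fixed-size chunks for embedding."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(markdown):
        chunks.append(markdown[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

The overlap preserves context across chunk boundaries so a sentence cut in half still appears whole in one chunk.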
One API call. Real content.
Required parameters: url_to_scrape and formats. Add parsers or llm_extract when you need structured JSON.
{
  "url_to_scrape": "https://en.wikipedia.org/wiki/Alexander_the_Great",
  "formats": ["markdown", "html"]
}

{
  "id": "scrape_…",
  "object": "scrape",
  "url_to_scrape": "https://en.wikipedia.org/wiki/…",
  "result": {
    "markdown_content": "## Alexander the Great…",
    "html_content": "<html …>",
    "json_content": null,
    "page_metadata": { "status_code": 200, "title": "…" }
  }
}

Everything you need to know about the Scrape endpoint.
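The response fields shown above can be consumed like this. A minimal sketch where the sample dict stands in for a parsed JSON body, with placeholder values; the field names come from the response example:

```python
def extract_result(scrape: dict):
    """Pull the markdown body and HTTP status from a parsed /v1/scrapes response."""
    result = scrape.get("result", {})
    markdown = result.get("markdown_content")
    status = result.get("page_metadata", {}).get("status_code")
    return markdown, status


# Placeholder response mirroring the sample above.
sample = {
    "id": "scrape_123",
    "object": "scrape",
    "result": {
        "markdown_content": "## Alexander the Great",
        "html_content": "<html>",
        "json_content": None,
        "page_metadata": {"status_code": 200, "title": "Alexander the Great"},
    },
}
```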