Create a new scraper
This endpoint creates a new scraper. You pass a title, a list of test URLs and the JSON schema of the data you expect to get back from the scraper.
The endpoint returns the scraper object, but the scraper's status will be compiling until it has been compiled and is ready to run. Compilation usually takes a few minutes, and you can check the status of the scraper by calling the /scrapers/{scraper_id} endpoint. During compilation we use LLMs to figure out the best way to scrape the data you’re looking for.
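Checking the status until compilation finishes can be sketched as a simple polling loop. This is a minimal sketch, not an official client: get_scraper is a hypothetical callable that you would implement as a GET request to /scrapers/{scraper_id} returning the parsed scraper object, and the 30-second interval and 15-minute timeout are assumed defaults, not values from the API.

```python
import time
from typing import Callable, Dict


def wait_for_compilation(
    get_scraper: Callable[[str], Dict],
    scraper_id: str,
    poll_interval: float = 30.0,   # assumed default, not from the API docs
    timeout: float = 900.0,        # assumed default, not from the API docs
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the scraper's status is no longer "compiling".

    `get_scraper` is a hypothetical helper wrapping a call to
    /scrapers/{scraper_id}; it must return a dict with a "status" key.
    """
    deadline = time.monotonic() + timeout
    while True:
        scraper = get_scraper(scraper_id)
        if scraper["status"] != "compiling":
            # Any other status (ready, failed, healing) is handed back
            # to the caller to interpret.
            return scraper
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"scraper {scraper_id} still compiling after {timeout} seconds"
            )
        sleep(poll_interval)
```

The caller should still inspect the returned status, because the loop stops on failed as well as ready.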
Request structure
The request body should be a JSON object with the following fields:
title (string): The title of the scraper.
test_urls (list of strings): A list of URLs that the scraper should be able to scrape.
schema (object): The JSON schema of the data you expect to get back from the scraper.
For example:
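A minimal Python sketch of building such a request body. The build_scraper_request helper, the example.com URLs, and the product schema with its name and price fields are all illustrative assumptions; only the three top-level field names (title, test_urls, schema) come from the description above.

```python
import json


def build_scraper_request(title, test_urls, schema):
    """Assemble the request body, checking the three documented fields.

    Hypothetical helper for illustration; not part of any official client.
    """
    if not isinstance(title, str) or not title:
        raise ValueError("title must be a non-empty string")
    if not isinstance(test_urls, (list, tuple)) or not all(
        isinstance(url, str) for url in test_urls
    ):
        raise ValueError("test_urls must be a list of strings")
    if not isinstance(schema, dict):
        raise ValueError("schema must be a JSON schema object")
    return {"title": title, "test_urls": list(test_urls), "schema": schema}


# Illustrative schema: the descriptions are deliberately explicit.
product_schema = {
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "description": "The product name as shown in the page heading",
        },
        "price": {
            "type": "number",
            "description": "The current price, without the currency symbol",
        },
    },
    "required": ["name", "price"],
}

body = build_scraper_request(
    "Product pages",
    ["https://example.com/products/1", "https://example.com/products/2"],
    product_schema,
)
payload = json.dumps(body, indent=2)  # JSON text to send as the request body
print(payload)
```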
Be descriptive in your schema: the LLMs see it during compilation, and a detailed schema helps them figure out the best way to scrape the data you’re looking for.
Response
The response body will be a JSON object with the following fields:
id (string): The ID of the scraper.
title (string): The title of the scraper.
status (string): The status of the scraper, which is one of compiling, ready, failed, or healing.
For example:
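A minimal Python sketch of reading such a response. The Scraper dataclass, the parse_scraper helper, and the sample ID scr_123 are hypothetical and used only for illustration; the three field names and the four status values are the ones listed above.

```python
import json
from dataclasses import dataclass

# The four status values listed in the response description.
VALID_STATUSES = {"compiling", "ready", "failed", "healing"}


@dataclass(frozen=True)
class Scraper:
    id: str
    title: str
    status: str


def parse_scraper(raw: str) -> Scraper:
    """Parse a response body and reject statuses outside the documented set."""
    data = json.loads(raw)
    status = data["status"]
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected scraper status: {status!r}")
    return Scraper(id=data["id"], title=data["title"], status=status)


# Illustrative response body; the ID format is an assumption.
example_response = '{"id": "scr_123", "title": "Product pages", "status": "compiling"}'
scraper = parse_scraper(example_response)
print(scraper.id, scraper.status)
```

A freshly created scraper is expected to report compiling, as in the sample body.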