Create schedule
This endpoint lets you create a schedule with a scraper, a title, a cron expression, a list of URLs for the scraper to scrape, and a list of webhooks that the data should be sent to.
This endpoint returns the schedule object, which is just an id plus the data you passed in. You can see the results of all previous schedules in the Sonata UI (an API endpoint for this is coming soon).
You can download CSVs or JSON files of the scraped data from the Sonata UI, as well as have it sent to a webhook.
Webhooks
When the schedule completes, the scraped data will be sent to the webhook you specified in the request body. The data will be sent as a JSON object with the following fields:
id (string): The ID of the schedule.
data (list of objects): The scraped data.
The data will be sent as a POST request. If the webhook returns a 200 status code, the schedule will be marked as complete. If the webhook returns a 500 status code, the schedule will be marked as failed.
The webhook request will have the Authorization header set to Bearer ${sonata-webhook-key}, where sonata-webhook-key is your webhook key, which you can find in the settings section of the Sonata UI.
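For illustration, a webhook delivery following the shape above might look like the following sketch. The schedule id and the contents of each data item are placeholder assumptions; only the id and data fields themselves are documented here, and the shape of each scraped record depends on your scraper.

```python
import json

# Hypothetical payload POSTed to your webhook when a schedule completes.
# "sch_123" and the record fields are illustrative placeholders.
webhook_payload = {
    "id": "sch_123",
    "data": [
        {"url": "https://example.com/products/1", "price": "19.99"},
        {"url": "https://example.com/products/2", "price": "24.99"},
    ],
}

print(json.dumps(webhook_payload, indent=2))
```

Your webhook handler should respond with a 200 status code once it has accepted the payload, so the schedule is marked as complete.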
Request structure
The request body should be a JSON object with the following fields:
title (string): The title of the schedule.
scraper_id (string): The ID of the scraper you want to use.
cron_expression (string): The cron expression for the schedule.
urls (list of strings): A list of URLs that the scraper should scrape.
webhooks (list of strings): The webhook URLs that the scraped data should be sent to.
For example:
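A request body built from the fields above might look like this sketch. All values (title, scraper id, cron expression, URLs) are illustrative placeholders, and a webhooks list is included on the assumption that it belongs in the request body, as the introduction and the Webhooks section describe.

```python
import json

# Hypothetical create-schedule request body; every value is a placeholder.
request_body = {
    "title": "Daily price check",
    "scraper_id": "scr_123",
    "cron_expression": "0 6 * * *",  # every day at 06:00
    "urls": [
        "https://example.com/products/1",
        "https://example.com/products/2",
    ],
    "webhooks": [
        "https://example.com/hooks/sonata",
    ],
}

print(json.dumps(request_body, indent=2))
```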
Response
The response body will be a JSON object with the following fields:
id (string): The ID of the schedule.
title (string): The title of the schedule.
scraper_id (string): The ID of the scraper.
cron_expression (string): The cron expression for the schedule.
urls (list of strings): A list of URLs that the scraper should scrape.
For example:
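A response following the fields above might look like this sketch; it echoes the request data plus a server-assigned id. The id format and all other values are placeholder assumptions.

```python
import json

# Hypothetical create-schedule response body; values are placeholders.
response_body = {
    "id": "sch_123",
    "title": "Daily price check",
    "scraper_id": "scr_123",
    "cron_expression": "0 6 * * *",
    "urls": [
        "https://example.com/products/1",
        "https://example.com/products/2",
    ],
}

print(json.dumps(response_body, indent=2))
```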