Thinking of buying the Scraping Robot API for web scraping? It helps to know how to use it first. This article walks through the API in detail.
Scraping Robot API
Basic Usage
This page describes Scraping Robot's basic functionality.
The Scraping Robot API exposes a single API endpoint. Send an HTTP request to https://api.scrapingrobot.com with your API key passed as a query parameter, and the API returns the requested data.
All task parameters are passed in the POST body as a JSON object.
https://api.scrapingrobot.com?token=<YOUR_SR_TOKEN>
Sample Code
{ "url": "https://www.scrapingrobot.com/", "module": "HtmlChromeScraper" }
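As a rough illustration of the request described above, the following Python sketch assembles (but does not send) the POST request using only the standard library. The helper name `build_task_request` is an assumption made for this example and is not part of the Scraping Robot API; the endpoint, the `token` query parameter, and the JSON body fields are taken from the text above.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.scrapingrobot.com"


def build_task_request(token, target_url, module="HtmlChromeScraper"):
    """Assemble a POST request for the API (hypothetical helper; nothing is sent)."""
    # The API key travels as the `token` query parameter.
    query = urllib.parse.urlencode({"token": token})
    # Task parameters go into the POST body as a JSON object.
    body = json.dumps({"url": target_url, "module": module}).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_task_request("<YOUR_SR_TOKEN>", "https://www.scrapingrobot.com/")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req).read()
```

Building the request separately from sending it makes it easy to inspect the URL and body before spending credits on a real call.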
Authentication
Every request to the Scraping Robot API should contain an access token. If the token is missing or invalid, Scraping Robot responds with status 401 (Unauthorized).
The token can be passed in two ways:
- As a query parameter:
  - ?token=<YOUR_SR_TOKEN>
- In the Authorization header (bearer authentication):
  - Authorization: Bearer <YOUR_SR_TOKEN>
Note: To receive your Scraping Robot API access token, you must register an account at https://dashboard.scrapingrobot.com/sign-up.
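The two ways of passing the token can be sketched in Python as follows. This is a minimal illustration using the standard library; the function names are assumptions made for this example, and neither request is actually sent.

```python
import urllib.parse
import urllib.request

API_URL = "https://api.scrapingrobot.com"


def request_with_query_token(token):
    """Pass the token as the `token` query parameter."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(f"{API_URL}?{query}")


def request_with_bearer_token(token):
    """Pass the token in the Authorization header (bearer authentication)."""
    return urllib.request.Request(
        API_URL, headers={"Authorization": f"Bearer {token}"}
    )


if __name__ == "__main__":
    print(request_with_query_token("<YOUR_SR_TOKEN>").full_url)
    print(request_with_bearer_token("<YOUR_SR_TOKEN>").get_header("Authorization"))
```

The header form keeps the token out of URLs, which may matter if request URLs end up in logs.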
Error Handling
Authorization Parameter Missing
If you forget to include the API key from your Scraping Robot dashboard as the “token” query parameter in your API call, the API returns the following error message.
{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "Token query parameter not found" }
To rectify this error, include the API key from your Scraping Robot dashboard as the “token” query parameter in your API call so that the request can be authorized.
Invalid Authorization Parameter
If you pass an invalid API key as the “token” query parameter in your API call, the API returns the following error message.
{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "Invalid client token" }
To rectify this error, pass the valid API key from your Scraping Robot dashboard as the “token” query parameter in your API call so that the request can be authorized.
Not Enough Credits
If you do not have enough credits in your account to perform the scraping task you're requesting, the API returns the following error message.
{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "You do not have enough credits" }
To rectify this error, add scraping credits to your account through your Scraping Robot dashboard, or contact the support team to have credits added to your account.
Request Body is not Valid JSON
If you send a request whose body is not valid JSON, the API returns the following error message.
Invalid JSON means any request body that cannot be parsed as JSON. This differs from the “Invalid Request Body” error explained next, where the body parses as JSON but its parameters are wrong.
{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "Request-body is not a valid JSON" }
To rectify this error, fix the JSON in the request body and send the request again. Online JSON validators can help you get the JSON correctly formatted.
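Instead of an online validator, the body can also be checked locally before it is sent. The sketch below uses Python's standard `json` module; the helper name `check_json_body` is an assumption made for this example. It only checks JSON syntax, not whether the parameters are ones the API accepts.

```python
import json


def check_json_body(raw_body):
    """Return (True, parsed) if raw_body is valid JSON, else (False, message)."""
    try:
        return True, json.loads(raw_body)
    except json.JSONDecodeError as exc:
        # Report where parsing stopped so the body is easier to fix.
        return False, f"{exc.msg} at line {exc.lineno}, column {exc.colno}"


if __name__ == "__main__":
    good = '{ "url": "https://www.scrapingrobot.com/", "module": "HtmlChromeScraper" }'
    bad = '{ "url": "https://www.scrapingrobot.com/", "module": }'
    print(check_json_body(good))
    print(check_json_body(bad))
```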
Invalid Request Body
If you send a valid JSON body but its parameters are incorrect or out of order, the API returns the following error message.
{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "Invalid json format: ..." }
To rectify this error, review the API documentation for the request you're sending to ensure that all parameters are correct. If you cannot resolve the issue on your own, contact the support team for assistance.
Service Overloaded
If the service is overloaded at the time of your request, the API returns the following error message.
{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "The service is overloaded. Try again later" }
To rectify this error, try the request again later. If the problem persists, contact the support team for updates on the situation.
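Because the error message itself says to try again later, a client may want to retry automatically. The following Python sketch retries with an increasing delay. The documentation above does not specify a retry policy, so the attempt count and delays are assumptions, as is the helper name `call_with_retry`; `send` stands for any function of yours that performs the request and returns the decoded JSON response.

```python
import time

OVERLOADED_ERROR = "The service is overloaded. Try again later"


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until the response is not the overload error.

    `send` is any zero-argument callable returning the decoded JSON response
    as a dict. The attempt count and delays are illustrative assumptions.
    """
    response = send()
    for attempt in range(1, max_attempts):
        if response.get("error") != OVERLOADED_ERROR:
            break
        # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
        response = send()
    return response
```

If the last attempt still returns the overload error, that response is returned unchanged so the caller can decide whether to contact support.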
Page Load Failure
If Scraping Robot is unable to load the page you're trying to scrape, the API returns the following error message.
{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "Please retry this task again later" }
To rectify this error, retry the task later. If it keeps failing, contact the support team for assistance.
Internal Error
If there is an issue with the Scraping Robot system, the API returns the following error message.
{ "error": "Internal server error", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "status": "FAIL" }
To rectify this error, contact the support team for assistance.
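The error responses in this section share one shape: a "status" of "FAIL" plus an "error" string. A client could map them to next steps along these lines. This is a sketch only: the error strings are copied from the sample responses in this section, while the function name `suggest_action` and the wording of the suggested actions are assumptions made for this example.

```python
import json

# Error strings copied from the sample responses; the suggested actions are
# short labels chosen for this sketch.
ERROR_ACTIONS = {
    "Token query parameter not found": "add the token to the request",
    "Invalid client token": "check the token in the dashboard",
    "You do not have enough credits": "add credits",
    "Request-body is not a valid JSON": "fix the JSON syntax",
    "The service is overloaded. Try again later": "retry later",
    "Please retry this task again later": "retry later",
    "Internal server error": "contact support",
}


def suggest_action(response_text):
    """Map an API error response to a suggested next step, or None if not a failure."""
    payload = json.loads(response_text)
    if payload.get("status") != "FAIL":
        return None
    error = payload.get("error", "")
    # This message carries variable detail after the prefix.
    if error.startswith("Invalid json format"):
        return "review the request parameters"
    return ERROR_ACTIONS.get(error, "contact support")


if __name__ == "__main__":
    sample = '{ "status": "FAIL", "date": "Thu, 01 Mar 2020 10:00:00 GMT", "error": "You do not have enough credits" }'
    print(suggest_action(sample))
```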
Credits
Get Credit Balance
GET https://api.scrapingrobot.com/balance?token=<YOUR_SR_TOKEN>
Request (cURL)
curl --request GET \
  --url https://api.scrapingrobot.com/balance \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <YOUR_SR_TOKEN>'
Request (Node.js, api SDK)
const sdk = require('api')('@staging-docs-scrapingrobot/v1.0#1b382ykufox5uf');
// Supply the access token before making the call
sdk.auth('<YOUR_SR_TOKEN>');
sdk.get('/balance')
  .then(res => console.log(res))
  .catch(err => console.error(err));
Request (PHP)
<?php
require_once('vendor/autoload.php');
$client = new \GuzzleHttp\Client();
$response = $client->request('GET', 'https://api.scrapingrobot.com/balance', [
  'headers' => [
    'Accept' => 'application/json',
    'Authorization' => 'Bearer <YOUR_SR_TOKEN>',
  ],
]);
echo $response->getBody();
Request (Python)
import requests
url = "https://api.scrapingrobot.com/balance"
headers = {
    "Accept": "application/json",
    "Authorization": "Bearer <YOUR_SR_TOKEN>",
}
response = requests.request("GET", url, headers=headers)
print(response.text)
Request (JavaScript)
const options = {
  method: 'GET',
  headers: {Accept: 'application/json', Authorization: 'Bearer <YOUR_SR_TOKEN>'}
};
fetch('https://api.scrapingrobot.com/balance', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));
Request (C)
CURL *hnd = curl_easy_init();
curl_easy_setopt(hnd, CURLOPT_CUSTOMREQUEST, "GET");
curl_easy_setopt(hnd, CURLOPT_URL, "https://api.scrapingrobot.com/balance");
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Accept: application/json");
headers = curl_slist_append(headers, "Authorization: Bearer <YOUR_SR_TOKEN>");
curl_easy_setopt(hnd, CURLOPT_HTTPHEADER, headers);
CURLcode ret = curl_easy_perform(hnd);
Get the current credits balance with the token passed as a query parameter:
Request (cURL)
curl --request GET \
  --url 'https://api.scrapingrobot.com/balance?token=<YOUR_SR_TOKEN>' \
  --header 'Accept: application/json'
Request (Node.js)
const fetch = require('node-fetch');
const url = 'https://api.scrapingrobot.com/balance?token=<YOUR_SR_TOKEN>';
const options = {method: 'GET', headers: {Accept: 'application/json'}};
fetch(url, options)
  .then(res => res.json())
  .then(json => console.log(json))
  .catch(err => console.error('error:' + err));
Request (PHP)
<?php
require_once('vendor/autoload.php');
$client = new \GuzzleHttp\Client();
$response = $client->request('GET', 'https://api.scrapingrobot.com/balance?token=<YOUR_SR_TOKEN>', [
  'headers' => [
    'Accept' => 'application/json',
  ],
]);
echo $response->getBody();
Request (Python)
import requests
url = "https://api.scrapingrobot.com/balance?token=<YOUR_SR_TOKEN>"
headers = {"Accept": "application/json"}
response = requests.request("GET", url, headers=headers)
print(response.text)
Request (JavaScript)
const options = {method: 'GET', headers: {Accept: 'application/json'}};
fetch('https://api.scrapingrobot.com/balance?token=<YOUR_SR_TOKEN>', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));
Request (C)
CURL *hnd = curl_easy_init();
curl_easy_setopt(hnd, CURLOPT_CUSTOMREQUEST, "GET");
curl_easy_setopt(hnd, CURLOPT_URL, "https://api.scrapingrobot.com/balance?token=<YOUR_SR_TOKEN>");
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Accept: application/json");
curl_easy_setopt(hnd, CURLOPT_HTTPHEADER, headers);
CURLcode ret = curl_easy_perform(hnd);
Get Credit Usage Statistics
This endpoint shows your credit usage over the specified time period.
Request (cURL)
curl --request GET \
  --url 'https://api.scrapingrobot.com/stats?token=<YOUR_SR_TOKEN>&type=daily' \
  --header 'Accept: application/json'
Request (Node.js)
const fetch = require('node-fetch');
const url = 'https://api.scrapingrobot.com/stats?token=<YOUR_SR_TOKEN>&type=daily';
const options = {method: 'GET', headers: {Accept: 'application/json'}};
fetch(url, options)
  .then(res => res.json())
  .then(json => console.log(json))
  .catch(err => console.error('error:' + err));
Request (PHP)
<?php
require_once('vendor/autoload.php');
$client = new \GuzzleHttp\Client();
$response = $client->request('GET', 'https://api.scrapingrobot.com/stats?token=<YOUR_SR_TOKEN>&type=daily', [
  'headers' => [
    'Accept' => 'application/json',
  ],
]);
echo $response->getBody();
Request (Python)
import requests
url = "https://api.scrapingrobot.com/stats?token=<YOUR_SR_TOKEN>&type=daily"
headers = {"Accept": "application/json"}
response = requests.request("GET", url, headers=headers)
print(response.text)
Request (JavaScript)
const options = {method: 'GET', headers: {Accept: 'application/json'}};
fetch('https://api.scrapingrobot.com/stats?token=<YOUR_SR_TOKEN>&type=daily', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));
Request (C)
CURL *hnd = curl_easy_init();
curl_easy_setopt(hnd, CURLOPT_CUSTOMREQUEST, "GET");
curl_easy_setopt(hnd, CURLOPT_URL, "https://api.scrapingrobot.com/stats?token=<YOUR_SR_TOKEN>&type=daily");
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Accept: application/json");
curl_easy_setopt(hnd, CURLOPT_HTTPHEADER, headers);
CURLcode ret = curl_easy_perform(hnd);
Html API
Create a task using POST-request
POST https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>
Request (cURL)
curl --request POST \
  --url 'https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true&contentType=text%2Fplain' \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --data '{ "url": "https://www.scrapingrobot.com/", "module": "HtmlChromeScraper" }'
Request (Node.js, api SDK)
const sdk = require('api')('@staging-docs-scrapingrobot/v1.0#1b382ykufox5uf');
sdk.server('https://api.scrapingrobot.com/');
sdk.auth('<YOUR_SR_TOKEN>');
// First object: JSON body with the task parameters; second: query parameters
sdk.post('', {
  url: 'https://www.scrapingrobot.com/',
  module: 'HtmlChromeScraper'
}, {
  responseType: 'json',
  waitUntil: 'load',
  noScripts: 'false',
  noImages: 'true',
  noFonts: 'true',
  noCss: 'true',
  contentType: 'text/plain'
})
  .then(res => console.log(res))
  .catch(err => console.error(err));
Request (PHP)
<?php
require_once('vendor/autoload.php');
$client = new \GuzzleHttp\Client();
$response = $client->request('POST', 'https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true&contentType=text%2Fplain', [
  'headers' => [
    'Accept' => 'application/json',
    'Content-Type' => 'application/json',
  ],
  // Task parameters are sent as the JSON request body
  'json' => [
    'url' => 'https://www.scrapingrobot.com/',
    'module' => 'HtmlChromeScraper',
  ],
]);
echo $response->getBody();
Request (Python)
import requests
url = "https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true&contentType=text%2Fplain"
headers = {
    "Accept": "application/json",
    "Content-Type": "application/json"
}
# Task parameters are sent as the JSON request body
payload = {"url": "https://www.scrapingrobot.com/", "module": "HtmlChromeScraper"}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
Request (JavaScript)
const options = {
  method: 'POST',
  headers: {Accept: 'application/json', 'Content-Type': 'application/json'},
  // Task parameters are sent as the JSON request body
  body: JSON.stringify({url: 'https://www.scrapingrobot.com/', module: 'HtmlChromeScraper'})
};
fetch('https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true&contentType=text%2Fplain', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));
Request (C)
CURL *hnd = curl_easy_init();
curl_easy_setopt(hnd, CURLOPT_CUSTOMREQUEST, "POST");
curl_easy_setopt(hnd, CURLOPT_URL, "https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true&contentType=text%2Fplain");
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Accept: application/json");
headers = curl_slist_append(headers, "Content-Type: application/json");
curl_easy_setopt(hnd, CURLOPT_HTTPHEADER, headers);
/* Task parameters are sent as the JSON request body */
curl_easy_setopt(hnd, CURLOPT_POSTFIELDS, "{ \"url\": \"https://www.scrapingrobot.com/\", \"module\": \"HtmlChromeScraper\" }");
CURLcode ret = curl_easy_perform(hnd);
Create a task using GET-request
Create scraping task
Request (cURL)
curl --request GET \
  --url 'https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true' \
  --header 'Accept: application/json'
Request (Node.js, api SDK)
const sdk = require('api')('@staging-docs-scrapingrobot/v1.0#1b382ykufox5uf');
sdk.server('https://api.scrapingrobot.com/');
sdk.auth('<YOUR_SR_TOKEN>');
sdk.get('', {
  responseType: 'json',
  waitUntil: 'load',
  noScripts: 'false',
  noImages: 'true',
  noFonts: 'true',
  noCss: 'true'
})
  .then(res => console.log(res))
  .catch(err => console.error(err));
Request (PHP)
<?php
require_once('vendor/autoload.php');
$client = new \GuzzleHttp\Client();
$response = $client->request('GET', 'https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true', [
  'headers' => [
    'Accept' => 'application/json',
  ],
]);
echo $response->getBody();
Request (Python)
import requests
url = "https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true"
headers = {"Accept": "application/json"}
response = requests.request("GET", url, headers=headers)
print(response.text)
Request (JavaScript)
const options = {method: 'GET', headers: {Accept: 'application/json'}};
fetch('https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));
Request (C)
CURL *hnd = curl_easy_init();
curl_easy_setopt(hnd, CURLOPT_CUSTOMREQUEST, "GET");
curl_easy_setopt(hnd, CURLOPT_URL, "https://api.scrapingrobot.com/?token=<YOUR_SR_TOKEN>&responseType=json&waitUntil=load&noScripts=false&noImages=true&noFonts=true&noCss=true");
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Accept: application/json");
curl_easy_setopt(hnd, CURLOPT_HTTPHEADER, headers);
CURLcode ret = curl_easy_perform(hnd);
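The long query strings in the requests above are easier to maintain when they are built from named options instead of being typed by hand. The Python sketch below shows one way to do that with the standard library. The helper name `build_task_url` is an assumption made for this example; the option names are the ones that appear in the request URLs in this article, and the sketch adds no parameters beyond those and the token.

```python
from urllib.parse import urlencode

API_URL = "https://api.scrapingrobot.com/"


def build_task_url(token, **options):
    """Return the API URL with the token and task options as a query string."""
    params = {"token": token}
    # Booleans are written as lowercase "true"/"false", as in the examples.
    for name, value in options.items():
        params[name] = str(value).lower() if isinstance(value, bool) else value
    # urlencode percent-encodes values, e.g. "text/plain" becomes "text%2Fplain".
    return API_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_task_url(
        "<YOUR_SR_TOKEN>",
        responseType="json",
        waitUntil="load",
        noScripts=False,
        noImages=True,
        noFonts=True,
        noCss=True,
    ))
```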
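Disclaimer: This content comes mainly from the vendor. If the vendor does not want it displayed on this website, please contact us to have it removed.
Last updated on May 15, 2022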