Oxylabs Real-Time Crawler

How to Use the HTML Crawler API [Part 5]: OxyLabs Real-Time Crawler for Other Websites

Do you know how to use the OxyLabs Real-Time Crawler for web pages? This is the most comprehensive official introduction from OxyLabs.

Quick Start

HTML Crawler API is built to help you in your heavy-duty data retrieval operations. You can use HTML Crawler API to access various public pages. It enables effortless web data extraction without any delays or errors.

HTML Crawler API uses basic HTTP authentication, which requires sending a username and password.

This is by far the fastest way to start using HTML Crawler API. You will make a request to retrieve https://ip.oxylabs.io using the Realtime integration method. Do not forget to replace USERNAME and PASSWORD with your proxy user credentials.

curl --user "USERNAME:PASSWORD" 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "universal", "url": "https://ip.oxylabs.io"}'

If you have any questions that this document does not cover, please contact your account manager or our support team at [email protected].


Integration Methods

HTML Crawler API supports three integration methods, each with its own benefits:

  • Push-Pull. With this method, there is no need to keep an active connection with our endpoint to retrieve data. After you make a request, our system pings your server automatically once the job is done (see Callbacks). This method saves computing resources and is easy to scale.
  • Realtime. This method requires you to keep an active connection with our endpoint to successfully fetch the result once the job is done. It can be implemented in one step, whereas Push-Pull takes two.
  • SuperAPI. This method is very similar to Realtime, but instead of posting data to our endpoint, you can use HTML Crawler API as a proxy. To retrieve data, set up a proxy endpoint and send a GET request to the desired URL. Additional parameters must be added using headers.

Our recommended data extraction method is Push-Pull.


Push-Pull

This is the simplest, most reliable, and most recommended data delivery method. In the Push-Pull scenario, you send us a query, and we return the job id. Once the job is done, you can use that id to retrieve the content from the /results endpoint. You can check the job completion status yourself, or you can set up a simple listener that is able to accept POST queries.

This way, we will send you a callback message once the job is ready to be retrieved. In this particular example, the results will be automatically uploaded to your S3 bucket named YOUR_BUCKET_NAME.
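For illustration, here is a minimal sketch of the whole Push-Pull loop in Python, assuming the requests library and simple polling in place of a callback listener; the endpoints and field names mirror the examples below:

import time
import requests

AUTH = ("USERNAME", "PASSWORD")

# 1. Submit the job and note the job id we get back.
job = requests.post(
    "https://data.oxylabs.io/v1/queries",
    auth=AUTH,
    json={"source": "universal",
          "url": "https://stackoverflow.com/questions/tagged/python"},
).json()

# 2. Poll the job status until it is no longer pending.
while True:
    info = requests.get(
        f"https://data.oxylabs.io/v1/queries/{job['id']}", auth=AUTH
    ).json()
    if info["status"] != "pending":
        break
    time.sleep(5)

# 3. Once the job is done, fetch the content from the /results endpoint.
if info["status"] == "done":
    results = requests.get(
        f"https://data.oxylabs.io/v1/queries/{job['id']}/results", auth=AUTH
    ).json()
    print(results["results"][0]["content"])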


Single Query

The following endpoint will handle a single query for one keyword or URL. The API will return a confirmation message containing the job information, including the job id. You can use that id to check the job completion status. Alternatively, you can ask us to ping your callback endpoint once the scraping job is done by adding a callback_url to the query.

POST https://data.oxylabs.io/v1/queries

You need to post query parameters as data in the JSON body.

curl --user user:pass1 \
'https://data.oxylabs.io/v1/queries' \
-H "Content-Type: application/json" \
-d '{"source": "universal", "url": "https://stackoverflow.com/questions/tagged/python", "callback_url": "https://your.callback.url", "storage_type": "s3", "storage_url": "YOUR_BUCKET_NAME"}'

The API will respond with query information in JSON format in the response body, similar to this:

{
  "callback_url": "https://your.callback.url",
  "client_id": 5,
  "created_at": "2019-10-01 00:00:01",
  "domain": "com",
  "geo_location": null,
  "id": "12345678900987654321",
  "limit": 10,
  "locale": null,
  "pages": 1,
  "parse": false,
  "render": null,
  "url": "https://stackoverflow.com/questions/tagged/python",
  "source": "universal",
  "start_page": 1,
  "status": "pending",
  "storage_type": "s3",
  "storage_url": "YOUR_BUCKET_NAME/12345678900987654321.json",
  "subdomain": "www",
  "updated_at": "2019-10-01 00:00:01",
  "user_agent_type": "desktop",
  "_links": [
    {
      "rel": "self",
      "href": "http://data.oxylabs.io/v1/queries/12345678900987654321",
      "method": "GET"
    },
    {
      "rel": "results",
      "href": "http://data.oxylabs.io/v1/queries/12345678900987654321/results",
      "method": "GET"
    }
  ]
}

Checking Job Status

If your query had a callback_url, we will send you a message containing a link to the content once the scraping task is done. However, if there was no callback_url in the query, you will need to check the job status yourself. For that, use the URL found in href under rel:self in the response message you received after submitting the query to our API. It should look similar to this: http://data.oxylabs.io/v1/queries/12345678900987654321.

GET https://data.oxylabs.io/v1/queries/{id}

Querying this link will return the job information, including its status. There are three possible status values:

pending - The job is still in the queue and has not been completed yet.
done - The job is done; you can retrieve the content using the URL in href under rel:results: http://data.oxylabs.io/v1/queries/12345678900987654321/results
faulted - There was an issue with the job, and we could not complete it, most likely due to a server error on the target site's side.
curl --user user:pass1 'http://data.oxylabs.io/v1/queries/12345678900987654321'

The API will respond with query information in JSON format in the response body. Notice that the job status has changed to done. You can now retrieve the content by querying http://data.oxylabs.io/v1/queries/12345678900987654321/results.

You can also see that the job's updated_at timestamp is 2019-10-01 00:00:15 - the query took 14 seconds to complete.

{
  "client_id": 5,
  "created_at": "2019-10-01 00:00:01",
  "domain": "com",
  "geo_location": null,
  "id": "12345678900987654321",
  "limit": 10,
  "locale": null,
  "pages": 1,
  "parse": false,
  "render": null,
  "url": "sofa",
  "source": "universal",
  "start_page": 1,
  "status": "done",
  "subdomain": "www",
  "updated_at": "2019-10-01 00:00:15",
  "user_agent_type": "desktop",
  "_links": [
    {
      "rel": "self",
      "href": "http://data.oxylabs.io/v1/queries/12345678900987654321",
      "method": "GET"
    },
    {
      "rel": "results",
      "href": "http://data.oxylabs.io/v1/queries/12345678900987654321/results",
      "method": "GET"
    }
  ]
}

Retrieving Job Content

Once you know the job is ready to be retrieved by checking its status, you can GET it using the URL in href under rel:results in our initial response. It should look similar to this: http://data.oxylabs.io/v1/queries/12345678900987654321/results.

GET https://data.oxylabs.io/v1/queries/{id}/results

You can retrieve results automatically, without periodically checking the job status, by setting up a callback service. You will need to specify the IP or domain of the server where the callback service is running. When our system completes a job, it will send a message to the provided IP or domain, and the callback service will then download the results, as shown in the callback implementation example.

curl --user user:pass1 'http://data.oxylabs.io/v1/queries/12345678900987654321/results'

The API will return the job content:

{
  "results": [
    {
      "content": "<!doctype html>
        CONTENT      
      ",
      "created_at": "2019-10-01 00:00:01",
      "updated_at": "2019-10-01 00:00:15",
      "page": 1,
      "url": "https://stackoverflow.com/questions/tagged/python",
      "job_id": "12345678900987654321",
      "status_code": 200
    }
  ]
}

Callbacks

A callback is a POST request we send to your machine, informing you that the data extraction job is done and providing a URL to download the scraped content. This means you no longer need to check the job status manually. Once the data is here, we will notify you, and all you need to do now is retrieve it.

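The original article ships code samples in Python and PHP. Here is a minimal sketch of a callback listener in Python, using the standard library's http.server plus the requests package; the port and the retrieval logic are illustrative assumptions, not a prescribed implementation:

import json
import requests
from http.server import BaseHTTPRequestHandler, HTTPServer

AUTH = ("USERNAME", "PASSWORD")

class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the callback message we receive.
        length = int(self.headers["Content-Length"])
        job = json.loads(self.rfile.read(length))
        self.send_response(200)
        self.end_headers()
        # Follow the rel:results link to download the scraped content.
        links = {link["rel"]: link["href"] for link in job["_links"]}
        if job["status"] == "done":
            results = requests.get(links["results"], auth=AUTH).json()
            print(results["results"][0]["content"][:200])

HTTPServer(("0.0.0.0", 8080), CallbackHandler).serve_forever()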

Callback Output Example

{  
   "created_at":"2019-10-01 00:00:01",
   "updated_at":"2019-10-01 00:00:15",
   "locale":null,
   "client_id":163,
   "user_agent_type":"desktop",
   "source":"universal",
   "pages":1,
   "subdomain":"www",
   "status":"done",
   "start_page":1,
   "parse":0,
   "render":null,
   "priority":0,
   "ttl":0,
   "origin":"api",
   "persist":true,
   "id":"12345678900987654321",
   "callback_url":"http://your.callback.url/",
   "url":"https://stackoverflow.com/questions/tagged/python",
   "domain":"de",
   "limit":10,
   "geo_location":null,
   {...}
   "_links":[
      {  
         "href":"https://data.oxylabs.io/v1/queries/12345678900987654321",
         "method":"GET",
         "rel":"self"
      },
      {  
         "href":"https://data.oxylabs.io/v1/queries/12345678900987654321/results",
         "method":"GET",
         "rel":"results"
      }
   ],
}

Batch Query

HTML Crawler API also supports submitting multiple keywords, up to 1,000 per batch. The following endpoint will submit multiple keywords to the extraction queue.

POST https://data.oxylabs.io/v1/queries/batch

You need to post query parameters as data in the JSON body.

The system will handle every keyword as a separate request. If you provided a callback URL, you will get a separate call for each keyword. Otherwise, our initial response will contain the job ids for all keywords. For example, if you sent 50 keywords, we will return 50 unique job ids.

Important! url is the only parameter that can have multiple values. All other parameters are the same for the whole batch query.

curl --user user:pass1 'https://data.oxylabs.io/v1/queries/batch' -H 'Content-Type: application/json' \
 -d '@keywords.json'

keywords.json contents:

{  
   "url":[  
      "https://stackoverflow.com/questions/tagged/python",
      "https://stackoverflow.com/questions/tagged/golang",
      "https://stackoverflow.com/questions/tagged/php"
   ],
   "source": "universal",
   "callback_url": "https://your.callback.url"
}

The API will respond with query information in JSON format in the response body, similar to this:

{
  "queries": [
    {
      "callback_url": "https://your.callback.url",
      {...}
      "created_at": "2019-10-01 00:00:01",
      "domain": "com",
      "id": "12345678900987654321",
      {...}
      "url": "https://stackoverflow.com/questions/tagged/python",
      "source": "universal",
      {...}
          "rel": "results",
          "href": "http://data.oxylabs.io/v1/queries/12345678900987654321/results",
          "method": "GET"
        }
      ]
    },
    {
      "callback_url": "https://your.callback.url",
      {...}
      "created_at": "2019-10-01 00:00:01",
      "domain": "com",
      "id": "12345678901234567890",
      {...}
      "url": "https://stackoverflow.com/questions/tagged/golang",
      "source": "universal",
      {...}
          "rel": "results",
          "href": "http://data.oxylabs.io/v1/queries/12345678901234567890/results",
          "method": "GET"
        }
      ]
    },
    {
      "callback_url": "https://your.callback.url",
      {...}
      "created_at": "2019-10-01 00:00:01",
      "domain": "com",
      "id": "01234567899876543210",
      {...}
      "url": "https://stackoverflow.com/questions/tagged/php",
      "source": "universal",
      {...}
          "rel": "results",
          "href": "http://data.oxylabs.io/v1/queries/01234567899876543210/results",
          "method": "GET"
        }
      ]
    }
  ]
}
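As a sketch, the same batch submission could be made from Python with the requests library; each URL becomes a separate job, so we collect all job ids from the response:

import requests

payload = {
    "url": [
        "https://stackoverflow.com/questions/tagged/python",
        "https://stackoverflow.com/questions/tagged/golang",
        "https://stackoverflow.com/questions/tagged/php",
    ],
    "source": "universal",
    "callback_url": "https://your.callback.url",
}

# Submit the batch; the response contains one query object per URL.
response = requests.post(
    "https://data.oxylabs.io/v1/queries/batch",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)
job_ids = [query["id"] for query in response.json()["queries"]]
print(job_ids)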

Getting a List of Notifier IP Addresses

You may want to whitelist the IPs that send you callback messages, or get a list of these IPs for other purposes. You can do that by sending a GET request to this endpoint: https://data.oxylabs.io/v1/info/callbacker_ips.

curl --user user:pass1 'https://data.oxylabs.io/v1/info/callbacker_ips'

The API will return the list of IPs that make callback requests to your system:

{
    "ips":[
        "x.x.x.x"、
        "y.y.y.y"
    ]
}

Upload to Storage

By default, Real-Time Crawler job results are stored in our databases. This means you have to query our results endpoint and retrieve the content yourself. The custom storage feature allows you to store results in your own cloud storage instead. The advantage of this feature is that you do not have to make extra requests to fetch results - everything goes directly to your storage bucket.

We support Amazon S3 and Google Cloud Storage. If you would like to use a different type of storage, contact your account manager to discuss the feature delivery timeline.

Amazon S3

To upload job results to your Amazon S3 bucket, set up access permissions for our service. To do that, go to https://s3.console.aws.amazon.com/ > S3 > Storage > Bucket Name (create one if you do not have one) > Permissions > Bucket Policy.

[Image: Oxylabs HTML Crawler API - Upload to Storage (S3 bucket policy)]

You can find the bucket policy JSON below. Do not forget to enter your bucket name in place of YOUR_BUCKET_NAME. This policy allows us to write to your bucket, grants you access to the uploaded files, and lets us know the bucket's location.

Google Cloud Storage

To upload job results to your Google Cloud Storage bucket, set up special permissions for our service. To do that, create a custom role with the storage.objects.create permission and assign it to the Oxylabs service account email [email protected].

[Images: Oxylabs HTML Crawler API - Upload to Storage (Google Cloud Storage permission setup)]

Usage

To use this feature, specify two additional parameters in your request. Learn more here.

The upload path looks like this: YOUR_BUCKET_NAME/job_ID.json. You will find the job ID in the response body we send after you submit a request. In this example, the job ID is 12345678900987654321.

{
    "Version":"2012-10-17",
    "Id":"Policy1577442634787",
    "Statement":[
        {
            "Sid":"Stmt1577442633719",
            "Effect":"Allow",
            "Principal":{
                "AWS":"arn:aws:iam::324311890426:user/oxylabs.s3.uploader"
            },
            "Action":"s3:GetBucketLocation",
            "Resource":"arn:aws:s3:::YOUR_BUCKET_NAME"
        },
        {
            "Sid":"Stmt1577442633719",
            "Effect":"Allow",
            "Principal":{
                "AWS":"arn:aws:iam::324311890426:user/oxylabs.s3.uploader"
            },
            "Action":[
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource":"arn:aws:s3:::YOUR_BUCKET_NAME/*"
        }
    ]
}

Realtime

Data submission is the same as in the Push-Pull method, but with Realtime, we will return the content over an open connection. You send us a query, the connection remains open, we retrieve the content and bring it to you. This is the endpoint that handles it:

POST https://realtime.oxylabs.io/v1/queries

The timeout limit for open connections is 100 seconds. Therefore, in rare cases of heavy load, we may not be able to ensure the data gets to you.

You need to post query parameters as data in the JSON body. Please see an example for more details.

curl --user user:pass1 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" \
 -d '{"source": "universal", "url": "https://stackoverflow.com/questions/tagged/python"}'

An example of a response body that would be returned over the open connection:

{
  "results": [
    {
      "content": "
      CONTENT
      "
      "created_at": "2019-10-01 00:00:01",
      "updated_at": "2019-10-01 00:00:15",
      "id": null,
      "page": 1,
      "url": "https://stackoverflow.com/questions/tagged/python",
      "job_id": "12345678900987654321",
      "status_code": 200
    }
  ]
}

SuperAPI

If you have ever used regular proxies for data scraping, integrating the SuperAPI delivery method will be a breeze. You simply need to use our entry node as a proxy, authorize with HTML Crawler API credentials, and ignore certificates. In cURL, that is -k or --insecure. Your data will be sent back to you over the open connection.

GET realtime.oxylabs.io:60000

SuperAPI only supports a handful of parameters, since it only works with the Direct data source, where a full URL is provided. These parameters should be sent as headers. This is the list of accepted parameters:

X-OxySERPs-User-Agent-Type - While there is no way to indicate a specific User-Agent, you can let us know which browser and platform to use. A list of supported User-Agent types can be found here.

If you need help setting up SuperAPI, get in touch with us at [email protected].

curl -k \
-x realtime.oxylabs.io:60000 \
-U user:pass1 \
-H "X-OxySERPs-User-Agent-Type: desktop_chrome" \
"https://stackoverflow.com/questions/tagged/python"

Content Type

HTML Crawler API returns raw HTML.


Download Images

It is possible to download images via HTML Crawler API. If you are doing that through SuperAPI, you can simply save the output to a file with an image extension. For example:

curl -k -x realtime.oxylabs.io:60000 -U user:pass1 "https://example.com/image.jpg" >> image.jpg

If you are using the Push-Pull or Realtime methods, you will need to add the content_encoding parameter with a value of base64. Once you receive the results, decode the encoded data in content into bytes and save them as an image file. Please find an example in Python below.
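A minimal sketch of that flow in Python, assuming a Realtime request with the requests library:

import base64
import requests

# Request the image with base64 content encoding.
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json={
        "source": "universal",
        "url": "https://example.com/image.jpg",
        "content_encoding": "base64",
    },
)

# Decode the encoded content into bytes and save it as an image file.
content = response.json()["results"][0]["content"]
with open("image.jpg", "wb") as f:
    f.write(base64.b64decode(content))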


Data Sources

HTML Crawler API accepts URLs, along with additional parameters, such as User-Agent type, proxy location, and others. See this method, which we call Direct, described below.

HTML Crawler API is able to render JavaScript when scraping. This enables you to get more data from the web page and get screenshots.

If you are unsure about any part of the documentation, drop us a line at [email protected] or contact your account manager.


Direct

[Image: Oxylabs HTML Crawler API - Direct]

The universal source is designed to retrieve the contents of any URL on the internet. POST-ing the parameters in JSON format to the following endpoint will submit the specified URL to the extraction queue.

Query Parameters

Parameter | Description | Default value
source* | Data source | universal
url* | Direct URL (link) to the target page | -
user_agent_type | Device type and browser. The full list can be found here. | desktop
geo_location | Geo-location of the proxy used to retrieve the data. The full list of supported locations can be found here. | -
locale | Locale, as expected in the Accept-Language header. | -
render | Enables JavaScript rendering. Use it when the target requires JavaScript to load content. Only works via the Push-Pull (a.k.a. Callback) method. There are two available values for this parameter: html (get raw output) and png (get a Base64-encoded screenshot). | -
content_encoding | Add this parameter if you are downloading images. Learn more here. | base64
context: content | Base64-encoded POST request body. It is only useful if http_method is set to post. | -
context: cookies | Pass your own cookies. | -
context: follow_redirects | Indicate whether you would like the scraper to follow redirects (3xx responses with a destination URL) to get the contents of the URL at the end of the redirect chain. | -
context: headers | Pass your own headers. | -
context: http_method | Set it to post if you would like to make a POST request to your target URL via the Universal scraper. | GET
context: session_id | If you want to use the same proxy with multiple requests, you can do so by using this parameter. Set your session to any string you like, and we will assign a proxy to this ID and keep it for up to 10 minutes. After that, if you make another request with the same session ID, a new proxy will be assigned to that particular session ID. | -
context: successful_status_codes | Define a custom HTTP response code (or a few of them) upon which we should consider the scrape successful and return the content to you. May be useful if you want us to return the 503 error page or in other non-standard cases. | -
callback_url | URL to your callback endpoint. | -
storage_type | Storage service provider. We support Amazon S3 and Google Cloud Storage; the respective storage_type values are s3 and gcs. The full implementation can be found on the Upload to Storage page. This feature is only available via the Push-Pull (Callback) method. | -
storage_url | Your storage bucket name. Only works via the Push-Pull (Callback) method. | -

* - required parameter

In this example, the API will retrieve a page via the universal source using the Push-Pull method. All available parameters are included (though not always necessary or compatible within the same request) to give you an idea of how to format your requests:

curl --user user:pass1 \
'https://data.oxylabs.io/v1/queries' \
-H "Content-Type: application/json" \
 -d '{"source":"universal","url":"https://stackoverflow.com/questions/tagged/python","user_agent_type":"mobile","context":[{"key":"headers","value":{"Accept-Language":"en-US","Content-Type":"application/octet-stream","Custom-Header":"custom header content"}},{"key":"cookies","value":[{"key":"NID","value":"1234567890"},{"key":"1P JAR","value":"0987654321"}]},{"key":"follow_redirects","value":true},{"key":"http_method","value":"post"},{"key":"content","value":"YmFzZTY0RW5jb2RlZFBPU1RCb2R5"},{"key":"successful_status_codes","value":[808,909]}]}

Here is the same example in Realtime mode:

curl --user user:pass1 \
'https://realtime.oxylabs.io/v1/queries' \
-H "Content-Type: application/json" \
-d '{"source": "universal", "url": "https://stackoverflow.com/questions/tagged/python", "user_agent_type": "mobile", "context": [{"key": "headers", "value": {"Accept-Language": "en-US", "Content-Type": "application/octet-stream", "Custom-Header": "custom header content"}}, {"key": "cookies", "value": [{"key": "NID", "value": "1234567890"}, {"key": "1P JAR", "value": "0987654321"}]}, {"key": "follow_redirects", "value": true}, {"key": "http_method", "value": "post"}, {"key": "content", "value": "base64EncodedPOSTBody"}, {"key": "successful_status_codes", "value": [303, 808, 909]}]}'

And via SuperAPI:

# A GET request could look something like this:
curl -k \
-x http://realtime.oxylabs.io:60000 \
-U user:pass1 \
"https://stackoverflow.com/questions/tagged/python" \
-H "X-OxySERPs-Session-Id: 1234567890abcdef" \
-H "X-OxySERPs-Geo-Location: India" \
-H "Accept-Language: en-US" \
-H "Content-Type: application/octet-stream" \
-H "Custom-Header: custom header content" \
-H "Cookie: NID=1234567890; 1P_JAR=0987654321" \
-H "X-Status-Code: 303, 808, 909"

# A POST request would have the same structure but contain a parameter specifying that it is a POST request:
curl -X POST \
-k \
-x http://realtime.oxylabs.io:60000 \
-U user:pass1 "https://stackoverflow.com/questions/tagged/python" \
-H "X-OxySERPs-Session-Id: 1234567890abcdef" \
-H "X-OxySERPs-Geo-Location: India" \
-H "Custom-Header: custom header content" \
-H "Cookie: NID=1234567890; 1P_JAR=0987654321" \
-H "X-Status-Code: 303, 808, 909"

Parameter Values

Geo_Location

The full list of supported geo-locations can be found in CSV format here.

"United Arab Emirates",
"Albania",
"Armenia",
"Angola",
"Argentina",
"Australia",
...
"Uruguay",
"Uzbekistan",
"Venezuela Bolivarian Republic of",
"Viet Nam",
"South Africa",
"Zimbabwe"

HTTP_Method

Universal Crawler supports two HTTP(S) methods: GET (default) and POST.

"GET",
"POST"

Render

Universal Crawler can render JavaScript and return either a rendered HTML document or a PNG screenshot of the web page.

"html",
"png"

User_Agent_Type

Download the full list of user_agent_type values in JSON here.

[
  {
    "user_agent_type":"desktop",
    "description":"Random desktop browser User-Agent"
  },
  {
    "user_agent_type":"desktop_firefox",
    "description":"Random User-Agent of one of the latest versions of desktop Firefox"
  },
  {
    "user_agent_type":"desktop_chrome",
    "description":"Random User-Agent of one of the latest versions of desktop Chrome"
  },
  {
    "user_agent_type":"desktop_opera",
    "description":"Random User-Agent of one of the latest versions of desktop Opera"
  },
  {
    "user_agent_type":"desktop_edge",
    "description":"Random User-Agent of one of the latest versions of desktop Edge"
  },
  {
    "user_agent_type":"desktop_safari",
    "description":"Random User-Agent of one of the latest versions of desktop Safari"
  },
  {
    "user_agent_type":"mobile",
    "description":"Random mobile browser User-Agent"
  },
  {
    "user_agent_type":"mobile_android",
    "description":"Random User-Agent of one of the latest versions of Android browsers"
  },
  {
    "user_agent_type":"mobile_ios",
    "description":"Random User-Agent of one of the latest versions of iPhone browsers"
  },
  {
    "user_agent_type":"tablet",
    "description":"Random tablet browser User-Agent"
  },
  {
    "user_agent_type":"tablet_android",
    "description":"Random User-Agent of one of the latest versions of Android tablets"
  },
  {
    "user_agent_type":"tablet_ios",
    "description":"Random User-Agent of one of the latest versions of iPad tablets"
  }
]

Account Status

Usage Statistics

You can find your usage statistics by querying the following endpoint:

GET https://data.oxylabs.io/v1/stats

By default, the API will return all-time usage statistics. Adding group_by=month will return monthly statistics, while group_by=day will return daily numbers.

This query will return all-time statistics. You can get your daily or monthly usage by adding either group_by=day or group_by=month:

curl --user user:pass1 'https://data.oxylabs.io/v1/stats'
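For example, a sketch in Python with the requests library, asking for daily numbers:

import requests

# Fetch daily usage statistics; drop `params` for all-time numbers,
# or use {"group_by": "month"} for monthly statistics.
stats = requests.get(
    "https://data.oxylabs.io/v1/stats",
    auth=("USERNAME", "PASSWORD"),
    params={"group_by": "day"},
)
print(stats.json())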

Example output:

{
    "data": {
        "sources": [
            {
                "realtime_results_count": "90",
                "results_count": "10",
                "title": "universal"
            }
        ]
    },
    "meta": {
        "group_by": null
    }
}

Limits

The following endpoint will return your monthly commitment information, as well as how much of it has already been used:

GET https://data.oxylabs.io/v1/stats/limits
curl --user user:pass1 'https://data.oxylabs.io/v1/stats/limits'

Example output:

{
    "monthly_requests_commitment":4500000,
    "used_requests":985000
}

Response Codes

Code | Status | Description
204 | No Content | You are trying to retrieve a job that has not been completed yet.
400 | Multiple error messages | Bad request structure; it could be a misspelled parameter or an invalid value. The response body will have a more specific error message.
401 | "Authorization header not provided" / "Invalid authorization header" / "Client not found" | Missing authorization header or incorrect login credentials.
403 | Forbidden | Your account does not have access to this resource.
404 | Not Found | The job ID you are looking for is no longer available.
429 | Too Many Requests | Rate limit exceeded. Please contact your account manager to increase the limit.
500 | Unknown Error | Service unavailable.
524 | Timeout | Service unavailable.
612 | Undefined Internal Error | Something went wrong and we failed the job you submitted. You can try again at no extra cost, as we do not charge you for faulted jobs. If that does not work, please get in touch with us.
613 | Faulted After Too Many Retries | We tried scraping the job you submitted, but gave up after reaching our retry limit. You can try again at no extra cost, as we do not charge you for faulted jobs. If that does not work, please get in touch with us.



Last updated on May 16, 2022
