How to Use Zenscrape API

Want to use the Zenscrape API for web scraping? This article is a detailed guide to its endpoints, parameters, authentication, and common use cases.

Documentation

Pro Tip: Register your free API key, and all code snippets in the official documentation will contain your private API key. If you have already registered, log in before viewing the documentation.

Postman Collection

To provide you with the best developer experience possible, we also created a Postman collection covering all of our endpoints, including plenty of examples.


Credit Costs & Failed Requests

The number of credits counted towards your quota depends on the request configuration and on the status code that the API endpoint returns. Hence, a request can cost between 1 and 25 credits. You can configure your request with the request builder inside your dashboard, which generates code snippets for the most common programming languages. You can find the list of error codes in the Error Codes section.

premium | render | Cost in credits
false | false | 1
false | true | 5
true | false | 10
true | true | 25
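As a rough illustration, the cost table can be expressed as a lookup keyed on the two flags. The names `CREDIT_COSTS`, `credit_cost`, and `estimate_total` are hypothetical. The mapping assumes the rows read (premium, render) = (false, false) → 1, (false, true) → 5, (true, false) → 10, (true, true) → 25.

```python
# Hypothetical helper: credits per request, assuming the table rows map
# (premium, render) -> credits as stated in the paragraph above.
CREDIT_COSTS = {
    (False, False): 1,
    (False, True): 5,
    (True, False): 10,
    (True, True): 25,
}


def credit_cost(premium: bool = False, render: bool = False) -> int:
    """Return the assumed number of credits a single request consumes."""
    return CREDIT_COSTS[(bool(premium), bool(render))]


def estimate_total(requests_plan):
    """Sum the assumed credits for an iterable of (premium, render) pairs."""
    return sum(credit_cost(premium, render) for premium, render in requests_plan)


if __name__ == "__main__":
    # Two plain requests plus one rendered request through premium proxies.
    print(estimate_total([(False, False), (False, False), (True, True)]))
```

A helper like this can be useful for budgeting a crawl before it runs, since failed requests and status codes may also affect what is actually charged.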

Basic Usage

This endpoint allows you to fetch the content of a website. For basic usage, only one parameter is required in addition to your apikey.

GET POST /get

Adding the url parameter to your request makes Zenscrape fetch the HTML content from the target website. This request configuration uses standard proxies and counts as 1 credit towards your monthly limit.

curl "https://app.zenscrape.com/api/v1/get?url=http://httpbin.org/ip" \
    -H "apikey: YOUR-APIKEY"

This request generates the following response:

<html>
    <head>
    </head>
    <body>
        <pre style="word-wrap: break-word; white-space: pre-wrap;">
            {
                "origin": "80.102.66.13"
            }
        </pre>
    </body>
</html>
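Because the response above is HTML that wraps the JSON payload in a pre element, a caller usually has to unwrap it before using the data. The sketch below shows one way to do that with the standard library; the function name `extract_pre_json` is hypothetical, and it assumes the payload sits in the first pre element, as in the sample response.

```python
import html
import json
import re

# Matches the first <pre ...>...</pre> element, across line breaks.
PRE_BLOCK = re.compile(r"<pre[^>]*>(.*?)</pre>", re.DOTALL | re.IGNORECASE)


def extract_pre_json(markup: str):
    """Return the JSON payload inside the first <pre> element, or None."""
    match = PRE_BLOCK.search(markup)
    if match is None:
        return None
    try:
        # Unescape HTML entities first, in case quotes were encoded.
        return json.loads(html.unescape(match.group(1)))
    except json.JSONDecodeError:
        return None


SAMPLE = """
<html>
    <head>
    </head>
    <body>
        <pre style="word-wrap: break-word; white-space: pre-wrap;">
            {
                "origin": "80.102.66.13"
            }
        </pre>
    </body>
</html>
"""

if __name__ == "__main__":
    print(extract_pre_json(SAMPLE))
```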

Web Scraping API

GET POST /get

See Demo Response:

<html> 
   <head></head> 
     <body> 
       <pre> 
          { 
            "origin": "223.233.44.142" 
          } 
      </pre> 
    </body> 
</html>

This endpoint accepts the following parameters:

Parameter | Type | Description
url | required | Target site you want to scrape
premium | optional, boolean, counts as 20 credits towards your quota | Uses residential proxies; unlocks sites that are hard to scrape
location | optional, default: worldwide | If premium=false, possible locations are 'na' (North America) and 'eu' (Europe). If premium=true, you can choose a location from our list of 230+ countries
keep_headers | optional, boolean | Passes your request headers (e.g. user agents, cookies) through to the target site
device_type | optional, string | By default, a desktop user agent is set. When set to 'mobile', an iPhone or Android user agent is used
render | optional, boolean, counts as 5 credits towards your quota | Uses a headless browser to fetch content that relies on JavaScript
wait_for | optional, integer, max value: 15 | Only works together with render=true. Number of seconds the browser waits for content to render before it scrapes the HTML markup
wait_for_css | optional, string | Only works together with render=true. Waits until the element matching the CSS selector becomes visible
session | optional, string | A random string that lets you reuse an IP, for example session=kdQ1VeQE
scroll_to_bottom | optional, boolean | Only works together with render=true. Scrolls to the bottom of the page before returning the page content
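The constraints in the parameter table (the wait_for maximum, the options that only work with render=true, and the location values allowed without premium) can be checked on the client before a request spends credits. The sketch below is one way to do that; `build_get_url` and `RENDER_ONLY` are hypothetical names, and the checks encode only what the table states. It follows the table in calling the parameter location, although the request examples in this guide pass country.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://app.zenscrape.com/api/v1/get"

# Parameters the table marks as "only works together with render=true".
RENDER_ONLY = ("wait_for", "wait_for_css", "scroll_to_bottom")


def build_get_url(url, **options):
    """Validate options against the parameter table and return the request URL."""
    if not url:
        raise ValueError("url is required")
    render = bool(options.get("render", False))
    for name in RENDER_ONLY:
        if name in options and not render:
            raise ValueError(f"{name} only works together with render=true")
    if "wait_for" in options and int(options["wait_for"]) > 15:
        raise ValueError("wait_for has a maximum value of 15 seconds")
    if not options.get("premium") and options.get("location") not in (None, "na", "eu"):
        raise ValueError("without premium=true, location must be 'na' or 'eu'")
    params = {"url": url}
    for name, value in options.items():
        # Booleans are sent as lowercase strings, as in the examples in this guide.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_get_url("https://httpbin.org/ip", render=True, wait_for=5))
```

The API key is deliberately left out of the URL here so that it can be sent in the apikey header instead.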

Zenscrape is a REST API and accepts HTTP requests from any programming language. The following example connects to the URL https://httpbin.org/ip through a proxy and renders the content inside a browser before it returns the markup to you.

Browser:

https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fhttpbin.org%2Fip&premium=true&country=de&render=true

cURL:

curl "https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fhttpbin.org%2Fip&premium=true&country=de&render=true"

Python:

import requests

headers = {"apikey": "YOUR-APIKEY"}
params = (
    ("url", "https://httpbin.org/ip"),
    ("premium", "true"),
    ("country", "de"),
    ("render", "true"),
)
response = requests.get("https://app.zenscrape.com/api/v1/get", headers=headers, params=params)
print(response.text)

Node.js:

var request = require('request');

var options = {
    url: 'https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https://httpbin.org/ip&premium=true&country=de&render=true'
};

function callback(error, response, body) {
    if (!error && response.statusCode == 200) {
        console.log(body);
    }
}

request(options, callback);

PHP:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$data = [
    "url" => "https://httpbin.org/ip",
    "premium" => "true",
    "country" => "de",
    "render" => "true",
];
curl_setopt($ch, CURLOPT_URL, "https://app.zenscrape.com/api/v1/get?" . http_build_query($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Content-Type: application/json",
    "apikey: YOUR-APIKEY",
));
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
var_dump($json);

Proxy Mode

GET POST 

See Demo Response:

<html> 
   <head></head> 
     <body> 
       <pre> 
          { 
            "origin": "223.233.44.142" 
          } 
      </pre> 
    </body> 
</html>

In addition to the REST API, Zenscrape also provides an HTTP proxy interface, so you can integrate any application that already relies on proxies. Simply use your API key as the username and supply the parameters you would usually pass as the password.

The HTTP proxy will return HTTP/1.1 407 Proxy Authentication Required in case your credentials are invalid.

cURL:

curl -k -x "http://YOUR-APIKEY:render=true&wait_for_css=.author@proxy-server.zenscrape.com:8282" https://quotes.toscrape.com/js

Python:

import requests

proxy = {
    "http": "http://YOUR-APIKEY:render=true&wait_for_css=.author@proxy-server.zenscrape.com:8282",
    "https": "http://YOUR-APIKEY:render=true&wait_for_css=.author@proxy-server.zenscrape.com:8282",
}
response = requests.get("https://quotes.toscrape.com/js", proxies=proxy, verify=False)
print(response.text)

PHP:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_URL, "https://quotes.toscrape.com/js");
curl_setopt($ch, CURLOPT_PROXY, "proxy-server.zenscrape.com:8282");
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "YOUR-APIKEY:render=true&wait_for_css=.author");
$response = curl_exec($ch);
curl_close($ch);
var_dump($response);
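The username and password convention described for proxy mode can be wrapped in a small helper that assembles the proxy URL. This is a minimal sketch: `build_proxies` is a hypothetical name, the host and port are taken from the PHP snippet, and it assumes parameter values contain no characters such as ":" or "@" that would need percent-encoding inside a URL.

```python
# Host and port as they appear in the PHP snippet of this section.
PROXY_HOST = "proxy-server.zenscrape.com:8282"


def build_proxies(api_key, **params):
    """Return a requests-style proxies dict: API key as username,
    request parameters (joined like a query string) as password."""
    password = "&".join(
        f"{name}={str(value).lower() if isinstance(value, bool) else value}"
        for name, value in params.items()
    )
    proxy_url = f"http://{api_key}:{password}@{PROXY_HOST}"
    return {"http": proxy_url, "https": proxy_url}


if __name__ == "__main__":
    print(build_proxies("YOUR-APIKEY", render=True, wait_for_css=".author"))
```

The returned dictionary can be passed as the proxies argument of requests.get, together with verify=False as in the Python snippet of this section.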

Premium Proxy Location List

The following locations can be used for the location parameter if premium is set to true.

North America
  • ai – Anguilla
  • ag – Antigua and Barbuda
  • aw – Aruba
  • bs – Bahamas
  • bb – Barbados
  • bz – Belize
  • bm – Bermuda
  • bq – Bonaire Saint Eustatius and Saba
  • vg – British Virgin Islands
  • ca – Canada
  • ky – Cayman Islands
  • cr – Costa Rica
  • cu – Cuba
  • cw – Curacao
  • dm – Dominica
  • do – Dominican Republic
  • sv – El Salvador
  • gd – Grenada
  • gp – Guadeloupe
  • gt – Guatemala
  • ht – Haiti
  • hn – Honduras
  • jm – Jamaica
  • mq – Martinique
  • mx – Mexico
  • ni – Nicaragua
  • pa – Panama
  • pr – Puerto Rico
  • bl – Saint Barthelemy
  • kn – Saint Kitts and Nevis
  • lc – Saint Lucia
  • mf – Saint Martin
  • pm – Saint Pierre and Miquelon
  • vc – Saint Vincent and the Grenadines
  • sx – Sint Maarten
  • tt – Trinidad and Tobago
  • tc – Turks and Caicos Islands
  • us – United States
  • vi – United States Virgin Islands
Europe
  • ax – Aland Islands
  • al – Albania
  • ad – Andorra
  • at – Austria
  • by – Belarus
  • be – Belgium
  • ba – Bosnia and Herzegovina
  • bg – Bulgaria
  • hr – Croatia
  • cy – Cyprus
  • cz – Czech Republic
  • dk – Denmark
  • ee – Estonia
  • fo – Faroe Islands
  • fi – Finland
  • fr – France
  • de – Germany
  • gi – Gibraltar
  • gr – Greece
  • gl – Greenland
  • gg – Guernsey
  • hu – Hungary
  • is – Iceland
  • ie – Ireland
  • im – Isle of Man
  • it – Italy
  • je – Jersey
  • xk – Kosovo
  • lv – Latvia
  • li – Liechtenstein
  • lt – Lithuania
  • lu – Luxembourg
  • mk – Macedonia
  • mt – Malta
  • md – Moldova
  • mc – Monaco
  • me – Montenegro
  • nl – Netherlands
  • no – Norway
  • pl – Poland
  • pt – Portugal
  • ro – Romania
  • sm – San Marino
  • rs – Serbia
  • sk – Slovakia
  • si – Slovenia
  • es – Spain
  • se – Sweden
  • ch – Switzerland
  • tr – Turkey
  • ua – Ukraine
  • uk – United Kingdom
  • va – Vatican
Asia
  • af – Afghanistan
  • am – Armenia
  • az – Azerbaijan
  • bh – Bahrain
  • bd – Bangladesh
  • bt – Bhutan
  • bn – Brunei
  • kh – Cambodia
  • cn – China
  • ge – Georgia
  • hk – Hong Kong
  • in – India
  • mm – Myanmar
  • np – Nepal
  • om – Oman
  • pk – Pakistan
  • ph – Philippines
  • id – Indonesia
  • ir – Iran
  • iq – Iraq
  • il – Israel
  • jp – Japan
  • jo – Jordan
  • kz – Kazakhstan
  • kw – Kuwait
  • kg – Kyrgyzstan
  • la – Laos
  • lb – Lebanon
  • mo – Macao
  • my – Malaysia
  • mv – Maldives
  • mn – Mongolia
  • qa – Qatar
  • ru – Russia
  • sa – Saudi Arabia
  • sg – Singapore
  • kr – South Korea
  • lk – Sri Lanka
  • sy – Syria
  • tw – Taiwan
  • tj – Tajikistan
  • th – Thailand
  • tm – Turkmenistan
  • ae – United Arab Emirates
  • uz – Uzbekistan
  • vn – Vietnam
  • ye – Yemen
South America
  • ar – Argentina
  • bo – Bolivia
  • br – Brazil
  • cl – Chile
  • co – Colombia
  • ms – Montserrat
  • ec – Ecuador
  • fk – Falkland Islands
  • gf – French Guiana
  • gy – Guyana
  • py – Paraguay
  • pe – Peru
  • sr – Suriname
  • uy – Uruguay
  • ve – Venezuela
Australia & Oceania
  • au – Australia
  • cx – Christmas Island
  • cc – Cocos Islands
  • ck – Cook Islands
  • tl – East Timor
  • fj – Fiji
  • pf – French Polynesia
  • gu – Guam
  • ki – Kiribati
  • mh – Marshall Islands
  • fm – Micronesia
  • nr – Nauru
  • nc – New Caledonia
  • nz – New Zealand
  • mp – Northern Mariana Islands
  • pw – Palau
  • pg – Papua New Guinea
  • ws – Samoa
  • sb – Solomon Islands
  • to – Tonga
  • tv – Tuvalu
  • vu – Vanuatu
  • vf – Wallis and Futuna
Africa
  • dz – Algeria
  • ao – Angola
  • bj – Benin
  • bw – Botswana
  • bf – Burkina Faso
  • bi – Burundi
  • cm – Cameroon
  • cv – Cape Verde
  • cf – Central African Republic
  • td – Chad
  • km – Comoros
  • dj – Djibouti
  • eg – Egypt
  • gq – Equatorial Guinea
  • er – Eritrea
  • et – Ethiopia
  • ga – Gabon
  • gm – Gambia
  • gh – Ghana
  • gn – Guinea
  • gw – Guinea-Bissau
  • ci – Ivory Coast
  • ke – Kenya
  • ls – Lesotho
  • lr – Liberia
  • ly – Libya
  • mg – Madagascar
  • mw – Malawi
  • ml – Mali
  • mr – Mauritania
  • mu – Mauritius
  • yt – Mayotte
  • ma – Morocco
  • mz – Mozambique
  • na – Namibia
  • ne – Niger
  • ng – Nigeria
  • cg – Republic of the Congo
  • re – Reunion
  • rw – Rwanda
  • st – Sao Tome and Principe
  • sn – Senegal
  • sc – Seychelles
  • sl – Sierra Leone
  • so – Somalia
  • za – South Africa
  • ss – South Sudan
  • sd – Sudan
  • sz – Swaziland
  • tz – Tanzania
  • tg – Togo
  • tn – Tunisia
  • ug – Uganda
  • eh – Western Sahara
  • zm – Zambia
  • zw – Zimbabwe

Authentication & API Key Information

Zenscrape uses API keys to allow access to the API. You can register a new API key at our developer portal. The /status route returns the number of remaining credits.

To authorize, you can use any of the following methods:

GET POST /status

Zenscrape looks for the API key in a header like the following (recommended, works with all requests):

curl "https://app.zenscrape.com/api/v1/status" \
    -H "apikey: YOUR-APIKEY"

Alternatively, pass the API key as a query parameter:

curl "https://app.zenscrape.com/api/v1/status?apikey=YOUR-APIKEY"

Or send it as form data:

curl "https://app.zenscrape.com/api/v1/status" -F "apikey=YOUR-APIKEY"
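The recommended header-based authentication can be sketched in Python with the standard library only. `build_status_request` and `fetch_status` are hypothetical names. Since the format of the /status response is not described here, the sketch returns the raw body as text rather than assuming a structure.

```python
import urllib.request

STATUS_URL = "https://app.zenscrape.com/api/v1/status"


def build_status_request(api_key):
    """Prepare (but do not send) a GET request for /status that carries
    the API key in the apikey header."""
    return urllib.request.Request(STATUS_URL, headers={"apikey": api_key}, method="GET")


def fetch_status(api_key, timeout=10):
    """Send the prepared request and return the raw response body as text."""
    with urllib.request.urlopen(build_status_request(api_key), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    request = build_status_request("YOUR-APIKEY")
    print(request.get_method(), request.full_url)
```

Keeping the key in a header rather than in the query string also keeps it out of URLs that may end up in logs.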

Error Codes

The Zenscrape API uses the following error codes:

HTTP Error Code | Meaning
403 | Forbidden: the API key is wrong, you don't have enough credits, or you don't have sufficient rights to access the resource.
404 | Not Found: no results were found.
429 | Too Many Requests: you have reached the concurrency limit. Please wait or upgrade.
500 | Internal Server Error

The API returns errors in this template:

{ 
   "errors": [{ 
      "url": "missing" 
   }] 
}
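A client can combine the status code with the details in the "errors" list to produce a readable message. The sketch below shows one possible approach; `describe_error` and `should_retry` are hypothetical names, and the retry rule (retry only 429 and 500) is an assumption for illustration, not documented Zenscrape behaviour.

```python
import json

# Short meanings copied from the error code table above.
ERROR_MEANINGS = {
    403: "Forbidden",
    404: "Not Found",
    429: "Too many requests",
    500: "Internal Server Error",
}


def describe_error(status_code, body):
    """Combine the status meaning with any details from the "errors" list."""
    meaning = ERROR_MEANINGS.get(status_code, "Unknown error")
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        payload = {}
    details = []
    if isinstance(payload, dict):
        for entry in payload.get("errors", []):
            if isinstance(entry, dict):
                details.extend(f"{field}: {problem}" for field, problem in entry.items())
    suffix = f" ({'; '.join(details)})" if details else ""
    return f"{status_code} {meaning}{suffix}"


def should_retry(status_code):
    """Assumption: only concurrency (429) and server (500) errors are retried."""
    return status_code in (429, 500)


if __name__ == "__main__":
    print(describe_error(403, '{"errors": [{"url": "missing"}]}'))
```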

Common Use Cases

Using Premium Proxies

Zenscrape offers a large pool of premium proxies, which are the preferred choice for sites that are difficult to scrape. To use the pool, simply set premium=true. In addition, you may specify a location using the location parameter. We have chosen 'se' (Sweden) for this example. You can see all supported locations in the Premium Proxy Location List above.

Browser:

https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fhttpbin.org%2Fip&premium=true&country=se

cURL:

curl "https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fhttpbin.org%2Fip&premium=true&country=se"

Python:

import requests

headers = {"apikey": "YOUR-APIKEY"}
params = (
    ("url", "https://httpbin.org/ip"),
    ("premium", "true"),
    ("country", "se"),
)
response = requests.get("https://app.zenscrape.com/api/v1/get", headers=headers, params=params)
print(response.text)

Node.js:

var request = require('request');

var options = {
    url: 'https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https://httpbin.org/ip&premium=true&country=se'
};

function callback(error, response, body) {
    if (!error && response.statusCode == 200) {
        console.log(body);
    }
}

request(options, callback);

PHP:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$data = [
    "url" => "https://httpbin.org/ip",
    "premium" => "true",
    "country" => "se",
];
curl_setopt($ch, CURLOPT_URL, "https://app.zenscrape.com/api/v1/get?" . http_build_query($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Content-Type: application/json",
    "apikey: YOUR-APIKEY",
));
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
var_dump($json);

Setting a Custom Header

Setting a custom header to avoid being blocked is not necessary, since we manage headers on our end. If you still want to set a custom header, you can do so by setting keep_headers=true. In this example we set a custom user-agent.

Browser:

https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fhttpbin.org%2Fheaders&keep_headers=true&country=us

cURL:

curl -H "User-Agent: 123" \
    "https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fhttpbin.org%2Fheaders&keep_headers=true&country=us"

Python:

import requests

headers = {
    "apikey": "YOUR-APIKEY",
    "User-Agent": "123",
}
params = (
    ("url", "https://httpbin.org/headers"),
    ("keep_headers", "true"),
    ("country", "us"),
)
response = requests.get("https://app.zenscrape.com/api/v1/get", headers=headers, params=params)
print(response.text)

Node.js:

var request = require('request');

var headers = {
    'User-Agent': '123'
};

var options = {
    url: 'https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https://httpbin.org/headers&keep_headers=true&country=us',
    headers: headers
};

function callback(error, response, body) {
    if (!error && response.statusCode == 200) {
        console.log(body);
    }
}

request(options, callback);

PHP:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$data = [
    "url" => "https://httpbin.org/headers",
    "keep_headers" => "true",
    "country" => "us",
];
curl_setopt($ch, CURLOPT_URL, "https://app.zenscrape.com/api/v1/get?" . http_build_query($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Content-Type: application/json",
    "apikey: YOUR-APIKEY",
    "User-Agent: 123"
));
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
var_dump($json);

Enable JS Rendering

Many websites use front-end frameworks such as Vue or React. To extract content that requires JavaScript, set render=true.

Browser:

https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fquotes.toscrape.com%2Fjs&render=true

cURL:

curl "https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fquotes.toscrape.com%2Fjs&render=true"

Python:

import requests

headers = {"apikey": "YOUR-APIKEY"}
params = (
    ("url", "https://quotes.toscrape.com/js"),
    ("render", "true"),
)
response = requests.get("https://app.zenscrape.com/api/v1/get", headers=headers, params=params)
print(response.text)

Node.js:

var request = require('request');

var options = {
    url: 'https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https://quotes.toscrape.com/js&render=true'
};

function callback(error, response, body) {
    if (!error && response.statusCode == 200) {
        console.log(body);
    }
}

request(options, callback);

PHP:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$data = [
    "url" => "https://quotes.toscrape.com/js",
    "render" => "true",
];
curl_setopt($ch, CURLOPT_URL, "https://app.zenscrape.com/api/v1/get?" . http_build_query($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Content-Type: application/json",
    "apikey: YOUR-APIKEY",
));
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
var_dump($json);

Getting around Cloudflare DDoS Protection

Quite a few websites with interesting content sit behind Cloudflare DDoS protection. Zenscrape automatically detects when the Cloudflare DDoS protection page appears and returns the page content once the protection layer has disappeared. Hence, Cloudflare DDoS protection is handled automatically and does not require any action on your end.


Blocking Particular Resources

To increase speed or to suppress a certain page behaviour, it can be useful to block certain resources from loading. In the following example, we block stylesheets, images, and other media from loading. Keep in mind that block_resources only works in combination with render=true.

Browser:

https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fquotes.toscrape.com%2Fjs&render=true&block_resources=stylesheet%2Cimage%2Cmedia

cURL:

curl "https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fquotes.toscrape.com%2Fjs&render=true&block_resources=stylesheet%2Cimage%2Cmedia"

Python:

import requests

headers = {"apikey": "YOUR-APIKEY"}
params = (
    ("url", "https://quotes.toscrape.com/js"),
    ("render", "true"),
    ("block_resources", "stylesheet,image,media"),
)
response = requests.get("https://app.zenscrape.com/api/v1/get", headers=headers, params=params)
print(response.text)

Node.js:

var request = require('request');

var options = {
    url: 'https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https://quotes.toscrape.com/js&render=true&block_resources=stylesheet,image,media'
};

function callback(error, response, body) {
    if (!error && response.statusCode == 200) {
        console.log(body);
    }
}

request(options, callback);

PHP:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$data = [
    "url" => "https://quotes.toscrape.com/js",
    "render" => "true",
    "block_resources" => "stylesheet,image,media",
];
curl_setopt($ch, CURLOPT_URL, "https://app.zenscrape.com/api/v1/get?" . http_build_query($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Content-Type: application/json",
    "apikey: YOUR-APIKEY",
));
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
var_dump($json);

Cookies can also be passed to the request using keep_headers=true. The header then simply needs to contain the cookie name and value.

Browser:

https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fquotes.toscrape.com%2Fcookies&keep_headers=true

cURL:

curl -H "Cookie: SESSIONID=27382738" \
    "https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https%3A%2F%2Fquotes.toscrape.com%2Fcookies&keep_headers=true"

Python:

import requests

headers = {
    "apikey": "YOUR-APIKEY",
    "Cookie": "SESSIONID=27382738",
}
params = (
    ("url", "https://quotes.toscrape.com/cookies"),
    ("keep_headers", "true"),
)
response = requests.get("https://app.zenscrape.com/api/v1/get", headers=headers, params=params)
print(response.text)

Node.js:

var request = require('request');

var headers = {
    'User-Agent': '123',
    'Cookie': 'SESSIONID=27382738'
};

var options = {
    url: 'https://app.zenscrape.com/api/v1/get?apikey=YOUR-APIKEY&url=https://quotes.toscrape.com/cookies&keep_headers=true',
    // Attach the headers so the cookie is actually sent with the request.
    headers: headers
};

function callback(error, response, body) {
    if (!error && response.statusCode == 200) {
        console.log(body);
    }
}

request(options, callback);

PHP:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$data = [
    "url" => "https://quotes.toscrape.com/cookies",
    "keep_headers" => "true",
];
curl_setopt($ch, CURLOPT_URL, "https://app.zenscrape.com/api/v1/get?" . http_build_query($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Content-Type: application/json",
    "apikey: YOUR-APIKEY",
    "Cookie: SESSIONID=27382738"
));
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
var_dump($json);
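When several cookies need to be forwarded, they are sent in a single Cookie header as name=value pairs separated by "; ". The sketch below builds that header from a dictionary; `cookie_header` and `build_headers` are hypothetical names, and it assumes cookie names and values are already safe to send without further encoding.

```python
def cookie_header(cookies):
    """Serialize a dict of cookies into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_headers(api_key, cookies=None, user_agent=None):
    """Headers for a keep_headers=true request: the apikey header
    authenticates the call, the remaining headers are the ones to forward."""
    headers = {"apikey": api_key}
    if cookies:
        headers["Cookie"] = cookie_header(cookies)
    if user_agent:
        headers["User-Agent"] = user_agent
    return headers


if __name__ == "__main__":
    print(build_headers("YOUR-APIKEY", cookies={"SESSIONID": "27382738"}))
```

The resulting dictionary can be passed as the headers argument of requests.get alongside keep_headers=true in the request parameters.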

Disclaimer: This content comes mainly from the vendor. If the vendor does not want it shown on this website, please contact us to have it removed.

Last updated on May 15, 2022