Scraper API - Proxy API for Web Scraping in 2024

Last updated: Oct 13, 2024

Scraper API provides a proxy service designed for web scraping. With over 20 million residential IPs across 12 countries, plus software that handles JavaScript rendering and CAPTCHA solving, it lets you complete large scraping jobs quickly with far less risk of being blocked by the target servers.

Implementation is extremely simple, and they offer unlimited bandwidth. Proxies are automatically rotated, but users can choose to maintain sessions if required. All you need to do is call the API with the URL that you want to scrape, and it will return the raw HTML. With Scraper API, you just focus on parsing the data, and they’ll handle the rest.

By their own numbers, they handle 5 billion API requests per month for over 1,500 businesses and developers around the world.

Having built many web scrapers, we repeatedly went through the tiresome process of finding proxies, setting up headless browsers, and handling CAPTCHAs. That's why we decided to start Scraper API, it handles all of this for you so you can scrape any page with a simple API call!

- Scraper API Story

One of the most frustrating parts of automated web scraping is constantly dealing with IP blocks and CAPTCHAs. Scraper API handles both well. You can customize request headers, request type, IP geo-location and more. They automatically prune slow proxies from their pools, and guarantee unlimited bandwidth with speeds up to 100Mb/s, which suits fast web crawlers.

Features loaded:

  • Over 20 million residential IPs in the pool
  • Simple dashboard to manage usage and billing
  • Geo-targeting: target 12+ countries around the world
  • Free plan with 1,000 requests & all features
  • Seven-day, no-questions-asked refund policy
  • 24/7 support and great customer service
  • Rotating and sticky IP sessions
  • Easy setup
  • Able to render JavaScript pages
  • Custom browser headers
  • Premium proxy pools
  • Auto-extraction of data from popular sites

Implementation

When you sign up for Scraper API you are given an access key. All you need to do is call the API with your key and the URL that you want to scrape, and you will receive the raw HTML of the page as a result. It’s as simple as:

curl "https://api.scraperapi.com/?api_key=XYZ&url=https://httpbin.org/ip"

On the back end, when Scraper API receives your request, their service accesses the URL via one of their proxy servers, gets the data, and then sends it back to you.
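
One detail the curl call glosses over: if the page you want to scrape has its own query string, its "&" and "?" characters need to be percent-encoded, or they will be read as parameters of the API call. Here is a minimal sketch of building the request URL safely. The helper name buildScrapeUrl is hypothetical; only the api_key and url parameters come from the description above.

```javascript
// Hypothetical helper: builds the GET URL for a scrape request.
// Only the api_key and url parameters come from the article.
const API_BASE = "https://api.scraperapi.com/";

function buildScrapeUrl(apiKey, targetUrl) {
  const requestUrl = new URL(API_BASE);
  // searchParams.set percent-encodes the value, so "&" and "?" inside
  // targetUrl cannot be mistaken for parameters of the API call itself.
  requestUrl.searchParams.set("api_key", apiKey);
  requestUrl.searchParams.set("url", targetUrl);
  return requestUrl.toString();
}

// Builds the URL only; nothing is sent over the network here.
const example = buildScrapeUrl("XYZ", "https://httpbin.org/get?a=1&b=2");
console.log(example);
```

The resulting string can be passed to curl or to any HTTP client.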

Basic Usage

Scraper API exposes a single API endpoint. Simply send a GET request to https://api.scraperapi.com with two query string parameters: api_key, which contains your API key, and url, which contains the URL you would like to scrape.

/* Node.js */
// remember to install the library: npm install scraperapi-sdk
const scraperapiClient = require("scraperapi-sdk")("XYZ");

// CommonJS has no top-level await, so wrap the call in an async function
(async () => {
  const response = await scraperapiClient.get("https://httpbin.org/ip");
  console.log(response);
})();

/* Java */
// remember to install the library: https://search.maven.org/artifact/com.scraperapi/sdk/1.0
import com.scraperapi.*;

ScraperApiClient client = new ScraperApiClient("XYZ");
client.get("https://httpbin.org/ip")
  .result();

Result

<html>
  <head> </head>
  <body>
    <pre style="word-wrap: break-word; white-space: pre-wrap;">
      {"origin":"176.12.80.34"}
    </pre>
  </body>
</html>
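
Since the API returns raw HTML, parsing is left to you. For a response shaped like the one above, a rough sketch of pulling the JSON out of the pre element could look like this. The helper name extractPreJson is made up for illustration, and the regular expression only suits this fixed shape.

```javascript
// Sample response copied from the result above.
const sampleHtml = `<html>
  <head> </head>
  <body>
    <pre style="word-wrap: break-word; white-space: pre-wrap;">
      {"origin":"176.12.80.34"}
    </pre>
  </body>
</html>`;

// Hypothetical helper: pulls the JSON text out of the first <pre> element.
// A regular expression is enough for this fixed shape; a real HTML parser
// is the safer choice for arbitrary pages.
function extractPreJson(html) {
  const match = html.match(/<pre[^>]*>([\s\S]*?)<\/pre>/i);
  if (!match) {
    throw new Error("No <pre> element found in the response");
  }
  return JSON.parse(match[1].trim());
}

const parsed = extractPreJson(sampleHtml);
console.log(parsed.origin);
```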

Geographic Location

To ensure your requests come from the United States, use the country_code parameter (e.g. country_code=us).

    curl "https://api.scraperapi.com/?api_key=XYZ&url=https://httpbin.org/ip&country_code=us"
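
If you build these URLs in code, it can help to validate the country code before sending anything. The sketch below assumes codes are two lowercase letters, as in the "us" example; the function name buildGeoScrapeUrl is hypothetical, and the list of supported countries is not checked here.

```javascript
// Hypothetical helper; only api_key, url and country_code come from the article.
function buildGeoScrapeUrl(apiKey, targetUrl, countryCode) {
  // Assumption: codes are two letters and lowercase, as in the "us" example.
  const normalized = String(countryCode).trim().toLowerCase();
  if (!/^[a-z]{2}$/.test(normalized)) {
    throw new Error(`Expected a two-letter country code, got "${countryCode}"`);
  }
  const requestUrl = new URL("https://api.scraperapi.com/");
  requestUrl.searchParams.set("api_key", apiKey);
  requestUrl.searchParams.set("url", targetUrl);
  requestUrl.searchParams.set("country_code", normalized);
  return requestUrl.toString();
}

// Builds the URL only; nothing is sent over the network here.
console.log(buildGeoScrapeUrl("XYZ", "https://httpbin.org/ip", "US"));
```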

POST/PUT Requests

Some advanced users will want to issue POST/PUT requests to scrape forms and API endpoints directly.

# Replace POST with PUT to send a PUT request instead
# JSON body
curl -H 'Content-Type: application/json' \
-d '{"foo":"bar"}' \
-X POST \
"https://api.scraperapi.com/?api_key=XYZ&url=https://httpbin.org/anything"

# For form data (-d sends application/x-www-form-urlencoded; -F would send multipart/form-data)
curl -H 'Content-Type: application/x-www-form-urlencoded' \
-d 'foo=bar' \
-X POST \
"https://api.scraperapi.com/?api_key=XYZ&url=https://httpbin.org/anything"

Result

{
  "args": {},
  "data": "{\"foo\":\"bar\"}",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "application/json",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "13",
    "Content-Type": "application/json; charset=utf-8",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
  },
  "json": {
    "foo": "bar"
  },
  "method": "POST",
  "origin": "191.101.82.154, 191.101.82.154",
  "url": "https://httpbin.org/anything"
}
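
The result above reports a JSON body of 13 bytes. As a rough sketch, the same request could be prepared from Node.js as follows. The helper name buildJsonPost is hypothetical, and it only returns the arguments you would hand to fetch, so nothing is sent until you call fetch yourself.

```javascript
// Hypothetical helper: returns the arguments for fetch() without sending anything.
function buildJsonPost(apiKey, targetUrl, payload, method = "POST") {
  if (method !== "POST" && method !== "PUT") {
    throw new Error(`Unsupported method: ${method}`);
  }
  const requestUrl = new URL("https://api.scraperapi.com/");
  requestUrl.searchParams.set("api_key", apiKey);
  requestUrl.searchParams.set("url", targetUrl);
  return {
    url: requestUrl.toString(),
    init: {
      method,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

const post = buildJsonPost("XYZ", "https://httpbin.org/anything", { foo: "bar" });
console.log(post.init.method, post.init.body);

// To send it (Node 18+ ships a global fetch):
//   const response = await fetch(post.url, post.init);
//   console.log(await response.text());
```

Passing "PUT" as the fourth argument prepares a PUT request instead.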

Account Information

When you log into your Scraper API account, you will be presented with a dashboard that will show you how many requests you have used, how many requests you have left for the month, and the number of failed requests (which do not count towards your request limit).

If you would like to monitor your account usage and limits programmatically (how many concurrent requests you’re using, how many requests you’ve made, etc.) you may use the /account endpoint, which returns JSON.

curl "https://api.scraperapi.com/account?api_key=XYZ"

Result

{
  "concurrentRequests": 553,
  "requestCount": 6655888,
  "failedRequestCount": 1118,
  "requestLimit": 10000000,
  "concurrencyLimit": 1000
}
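
To turn that response into something actionable, you might compute how much headroom is left. The sketch below works on the sample payload above; the helper name summarizeUsage is hypothetical, and it assumes requestCount is the figure that counts against requestLimit.

```javascript
// Sample payload copied from the /account result above.
const account = {
  concurrentRequests: 553,
  requestCount: 6655888,
  failedRequestCount: 1118,
  requestLimit: 10000000,
  concurrencyLimit: 1000,
};

// Hypothetical helper. Assumption: requestCount is what counts against
// requestLimit (the article says failed requests are not billed).
function summarizeUsage(info) {
  return {
    requestsRemaining: Math.max(0, info.requestLimit - info.requestCount),
    concurrencyAvailable: Math.max(0, info.concurrencyLimit - info.concurrentRequests),
    percentUsed: Number(((info.requestCount / info.requestLimit) * 100).toFixed(2)),
  };
}

console.log(summarizeUsage(account));
// → { requestsRemaining: 3344112, concurrencyAvailable: 447, percentUsed: 66.56 }
```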

Ending Note

Scraper API is one of the best proxy API services for web scraping on the market today. It is easy to integrate and can accommodate scraping projects of all sizes. If you have any serious scraping projects, then Scraper API is definitely worth looking into. Even if you’re a casual user, you may benefit from using the free plan.
