scrapingpros

Ban handling API for web scraping: Extract data from any site

Our ban handling API is the ultimate solution for advanced, efficient, and seamless web scraping, built on unique ban-handling technology. Designed by and for data professionals, it empowers you to extract the data you need while overcoming bans, IP blocks, and captchas.

Kickstart your data scraping journey with 10,000 free credits. Test our API and discover how it can transform your data extraction processes.

Smart techniques to bypass blocks and rate limits:

  • Ban handling
  • Headless browser
  • Residential proxies
  • Flexible requests
  • Scalable

Designed for seamless integration into your operations, the API offers a dependable and scalable data stream. Scraping Pros stands out in the field of web scraping with comprehensive solutions that tackle the common challenges of data extraction and ban handling.

Advanced Ban Handling

Scrape without interruptions. Our API combines intelligent proxy rotation with sophisticated anti-ban strategies to bypass blocks and keep data collection uninterrupted.

Headless Browser Support

Handle JavaScript-heavy websites effortlessly with our headless browser and rendering capabilities, ensuring compatibility with modern, dynamic websites.

Data Accuracy

Our API delivers real-time updates. Receive validated, structured, and high-quality data.

Security & Compliance

Your data and privacy are our priorities. We comply with privacy regulations, meeting industry standards for ethical data use.

Scalable & Customizable

Adapt the API to your needs. Tailor scraping settings to different website complexities with custom endpoints and flexible integration.

What sets us apart: the Scraping Pros differential

Feature             | Scraping Pros API                           | Other tools
Customization Level | High: tailor requests, headers, and proxies | Limited, predefined settings
Anti-Block Measures | Advanced anti-blocking with captcha solving | Basic measures
Scalability         | Easily handles millions of requests         | Often suited only for small tasks
Integration Ease    | RESTful API, clear docs, and examples       | Complex or limited options

API packages for every project

All plans include: headless browser, residential proxies, 1 month data retention, and support & help desk.

  • API Trial (Free): 10,000 API credits, 5 GB storage
  • Hobby ($49/month): 100,000 credits, 10 GB storage
  • Startup ($149/month): 1,000,000 credits, 30 GB storage
  • Business ($299/month): 3,000,000 credits, 100 GB storage
  • Enterprise (custom pricing): unlimited credits, premium features, dedicated support

Explore some use cases

E-commerce Optimization

Monitor competitor pricing with automated, scalable scraping.

Market Research

Aggregate data from multiple sources to analyze trends.

Content Aggregation

Pull metadata (titles, descriptions) for news articles or blogs.

Price monitoring

Set up dynamic scraping pipelines to adjust pricing strategies in real time.

Academic Research

Gather data from research papers and academic databases.

Job Boards

Collect job postings from various job boards and company career pages.

Brand Monitoring

Monitor online mentions across news and review platforms.

Real Estate Analysis

Gather data on property listings from real estate websites.

Financial Data

Collect financial news, stock prices, and market data from financial sites.

Product Research

Gather customer reviews and competitor product information.

Competitive Intelligence

Gather data on competitors' products, pricing, and strategies.

Content Curation

Find relevant articles, images, and other content for your website.

Technical Documentation

Dive into our API Docs for everything you need to get started.

Quick start guide

Create a project

  • Endpoint: POST /projects
  • Description: Creates a new project with a specified name and priority.
  • Request:
{
  "name": "string",
  "priority": "integer",
  "description": "string"
}
  • Response:
{
    "message": "string",
    "project": {
        "id": "integer",
        "client_id": "integer",
        "name": "string",
        "description": "string",
        "cost": "integer",
        "priority": "integer",
        "status": "string",
        "created_at": "string",
        "updated_at": "string"
    }
}
  • Example:
curl -X POST https://api.scrapingpros.com/projects \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer Your-api-token' \
  -d '{
    "name": "New Project",
    "priority": 5,
    "description": "Description"
  }'
  • Example Response:
{
    "message": "Project successfully created.",
    "project": {
        "id": "Project assigned id",
        "client_id": "Your client ID",
        "name": "New Project",
        "description": "Description",
        "cost": null,
        "priority": 5,
        "status": "A",
        "created_at": "2024-09-23T14:17:57.000Z",
        "updated_at": "2024-09-23T14:17:57.000Z"
    }
}
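
The project id returned here is what the batch-creation step below expects in its "project" field. As a minimal sketch (assuming a POSIX shell with jq installed), you can capture it directly:

PROJECT_ID=$(curl -s -X POST https://api.scrapingpros.com/projects \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer Your-api-token' \
  -d '{"name": "New Project", "priority": 5, "description": "Description"}' \
  | jq -r '.project.id')
echo "Created project $PROJECT_ID"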

Create a batch

  • Endpoint: POST /batches
  • Description: Creates an empty batch for a project; jobs are added later with the append-jobs endpoint.
  • Request:
{
    "project": "integer",
    "name": "string",
    "status": "string",
    "priority": "integer",
    "max_requests": "integer"
}
  • Response:
{
    "message": "string",
    "batch": {
        "id": "integer",
        "client_id": "integer",
        "project_id": "integer",
        "name": "string",
        "cost": "integer",
        "priority": "integer",
        "status": "string",
        "max_requests": "integer",
        "created_at": "string",
        "updated_at": "string"
    }
}
  • Example:
curl -X POST https://api.scrapingpros.com/batches \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer Your-api-token' \
  -d '{
    "project": 32,
    "name": "First batch",
    "priority": 5,
    "max_requests": 1000
  }'
  • Example Response:
{
    "message": "ok",
    "batch": {
        "id": 105,
        "client_id": 16,
        "project_id": 32,
        "name": "First batch",
        "cost": 0,
        "priority": 5,
        "status": "A",
        "max_requests": 1000,
        "created_at": "2024-09-23T15:32:41.000Z",
        "updated_at": "2024-09-23T15:32:41.000Z"
    }
}

Append jobs to batch

  • Endpoint: POST /batches/[batch_id]/append-jobs
  • Description: Adds one or more jobs to an existing batch. Each job specifies a URL and a scrap mode.
  • Request:
{
    "jobs": [
        {
            "url": "string",         // Required: the URL to scrape.
            "scrap_mode": "integer", // Required: the scraping mode ID. See Scrap Modes Available.
            "arguments": {
                "action": "ScrollToBottom"
            }                        // Optional: additional parameters for the job. This one tells Puppeteer to scroll the page.
        }
    ]
}
  • Response:
{
    "response": {
        "total_jobs": "integer",
        "total_invalid_jobs": "integer",
        "invalid_jobs": "Array"
    }
}
  • Actions

You can pass actions in the "arguments" part of a job to perform different interactions while scraping; a sketch follows the list below. Currently, the available actions are:

  1. ScrollToBottom. Keeps scrolling until reaching the end of the page.
  2. delayms. Tells the scraper to wait a given number of milliseconds, which can help bypass human-detection checks.
  3. Clickxpath. Clicks the element identified by an XPath.
  4. TypeOnInputxpath. Types text into the input field identified by an XPath.
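
Actions pair naturally with the browser-based scrap modes listed below. As a minimal sketch grounded in the request format above, the following append-jobs call asks Puppeteer (scrap_mode 7; see the table below) to scroll each page to the bottom before capturing it:

curl -X POST https://api.scrapingpros.com/batches/105/append-jobs \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer Your-api-token' \
  -d '{
    "jobs": [
      {
        "url": "http://example.com/page1",
        "scrap_mode": 7,
        "arguments": { "action": "ScrollToBottom" }
      }
    ]
  }'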
  • Scrap Modes Available

Here are the available scrap modes along with their corresponding IDs and the credits each request consumes. For example, a batch of 1,000 jobs using HTTP Requests consumes 1,000 credits, while the same batch using Puppeteer consumes 5,000.

Scrap Mode    | ID | Credits
HTTP Requests | 2  | 1
Selenium      | 5  | 5
Puppeteer     | 7  | 5
  • Example:
This example uses scrap_mode 2 (HTTP Requests):
curl -X POST https://api.scrapingpros.com/batches/105/append-jobs \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer Your-api-token' \
  -d '{
    "jobs": [
      {
        "url": "http://example.com/page1",
        "scrap_mode": 2,
        "arguments": {}
      }
      ]
  }'
  • Example Response:
{
    "response": {
        "total_jobs": 1,
        "total_invalid_jobs": 0,
        "invalid_jobs": []
    }
}

Run the batch

  • Endpoint: POST /batches/[batch_id]/run
  • Description: Sets all jobs to run for a specified batch.
  • Response:
{
    "message": "string",
    "batch": {
        "id": "integer",
        "client_id": "integer",
        "project_id": "integer",
        "name": "string",
        "cost": "integer",
        "priority": "integer",
        "status": "string",
        "max_requests": "integer",
        "created_at": "string",
        "updated_at": "string"
    }
}
  • Example:
curl -X POST https://api.scrapingpros.com/batches/105/run \
  -H 'Authorization: Bearer Your-api-token'
  • Example Response:
{
    "message": "batch id: 105 was set to run successfully. 45 jobs set to run",
    "batch": {
        "id": 105,
        "client_id": 16,
        "project_id": 32,
        "name": "First batch",
        "cost": 0,
        "priority": 5,
        "status": "A",
        "max_requests": 1000,
        "created_at": "2024-09-23T15:32:41.000Z",
        "updated_at": "2024-09-23T15:32:41.000Z"
    }
}
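
Putting it all together

Here is a minimal end-to-end sketch of the four steps above (create a project, create a batch, append a job, run the batch). It assumes a POSIX shell with curl and jq available, and the response shapes shown in this guide:

#!/bin/sh
API_TOKEN='Your-api-token'
BASE='https://api.scrapingpros.com'

# 1. Create a project and capture its id.
PROJECT_ID=$(curl -s -X POST "$BASE/projects" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $API_TOKEN" \
  -d '{"name": "New Project", "priority": 5, "description": "Description"}' \
  | jq -r '.project.id')

# 2. Create an empty batch inside the project.
BATCH_ID=$(curl -s -X POST "$BASE/batches" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $API_TOKEN" \
  -d "{\"project\": $PROJECT_ID, \"name\": \"First batch\", \"priority\": 5, \"max_requests\": 1000}" \
  | jq -r '.batch.id')

# 3. Append a job (scrap_mode 2 = HTTP Requests, 1 credit per job).
curl -s -X POST "$BASE/batches/$BATCH_ID/append-jobs" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $API_TOKEN" \
  -d '{"jobs": [{"url": "http://example.com/page1", "scrap_mode": 2, "arguments": {}}]}'

# 4. Set every job in the batch to run.
curl -s -X POST "$BASE/batches/$BATCH_ID/run" \
  -H "Authorization: Bearer $API_TOKEN"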

Specifications and security measures

As a leading web scraping company in the USA, Scraping Pros is committed to integrating seamlessly into your existing workflows while maintaining high standards of security and performance.

Performance Metrics

Experience low response times, high throughput, and reliable uptime.

Access Control

Control access, manage activity, and monitor usage from your dashboard.

API Endpoints

Detailed list of endpoints with methods, parameters, and examples to facilitate smooth integration.

Get started today

Stop worrying about blocked requests and messy data. Extract clean, structured data effortlessly with Scraping Pros API.