How to scrape Amazon prices with Python

Learn how to scrape Amazon product prices using Python and the ShoppingScraper API. This tutorial covers single product lookups, batch scraping, price history tracking, and buy box monitoring across 16 Amazon country domains.

By Tim Hagebols

Why scrape Amazon prices?

Amazon is the world's largest online marketplace, and its pricing data is invaluable for e-commerce businesses. Whether you are a brand monitoring unauthorized resellers, a retailer tracking competitor prices, or a data analyst researching market trends, programmatic access to Amazon pricing data gives you a decisive edge.

Common use cases for Amazon price scraping include:

  • Competitor monitoring — Track how competitors price the same products across Amazon domains
  • Buy box tracking — Monitor which seller holds the buy box and at what price
  • Price history analysis — Build historical price charts to identify trends and seasonal patterns
  • Dynamic pricing — Feed real-time Amazon prices into your own repricing engine
  • Multi-country comparison — Compare prices for the same product across Amazon.nl, Amazon.de, Amazon.fr, and more

Scraping Amazon directly is fragile and prone to blocks. The ShoppingScraper API handles the scraping infrastructure for you and returns clean, structured JSON data. This tutorial shows you how to use it with Python.

Prerequisites

Before you start, make sure you have:

  1. Python 3.9+ installed
  2. httpx — a modern async HTTP client for Python
  3. A ShoppingScraper API key — sign up at shoppingscraper.com for 100 free credits

Install httpx if you do not have it yet:

pip install httpx

Set your API key as an environment variable so it stays out of your source code:

export SHOPPINGSCRAPER_API_KEY="your-api-key-here"
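
If the variable is missing, `os.environ["SHOPPINGSCRAPER_API_KEY"]` fails with a bare `KeyError`. A small helper can raise a clearer message instead. This is a minimal sketch: `load_api_key` is a hypothetical helper written for this tutorial, not part of the ShoppingScraper API.

```python
import os
from collections.abc import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or raise a readable error.

    load_api_key is a hypothetical helper, not part of any library.
    """
    key = env.get("SHOPPINGSCRAPER_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SHOPPINGSCRAPER_API_KEY is not set. "
            'Run: export SHOPPINGSCRAPER_API_KEY="your-api-key-here"'
        )
    return key
```

Accepting the environment as a parameter keeps the helper easy to test without touching the real process environment.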

Single product lookup

The simplest use case is looking up offers for a single product by its EAN code. The /offers endpoint returns all sellers, prices, and availability data from a specific Amazon domain.

import os
import httpx
 
API_KEY = os.environ["SHOPPINGSCRAPER_API_KEY"]
BASE_URL = "https://api.shoppingscraper.com"
 
def get_offers(ean: str, site: str = "amazon.nl") -> dict:
    """Fetch all offers for a product on a specific Amazon domain."""
    response = httpx.get(
        f"{BASE_URL}/offers",
        params={"site": site, "ean": ean},
        headers={"x-api-key": API_KEY},
        timeout=httpx.Timeout(connect=5.0, read=60.0, write=5.0, pool=5.0),
    )
    response.raise_for_status()
    return response.json()
 
# Look up a product on Amazon Netherlands
result = get_offers(ean="0194253397052", site="amazon.nl")
 
print(f"Product: {result.get('title', 'N/A')}")
print(f"Number of offers: {len(result.get('offers', []))}")
 
for offer in result.get("offers", []):
    print(f"  Seller: {offer.get('seller', 'N/A')}")
    print(f"  Price: EUR {offer.get('price', 'N/A')}")
    print(f"  Condition: {offer.get('condition', 'N/A')}")
    print()
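
When debugging, it can help to see the URL that results from the `params` dictionary. The sketch below reproduces it with the standard library only. `build_offers_url` is a hypothetical helper; the endpoint and parameter names are the ones used in the example above. The API key is not part of the URL because it travels in the `x-api-key` header.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.shoppingscraper.com"


def build_offers_url(ean: str, site: str = "amazon.nl") -> str:
    """Return the /offers URL with site and ean as query parameters.

    build_offers_url is a hypothetical helper for illustration.
    """
    query = urlencode({"site": site, "ean": ean})
    return f"{BASE_URL}/offers?{query}"


print(build_offers_url("0194253397052"))
# https://api.shoppingscraper.com/offers?site=amazon.nl&ean=0194253397052
```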

Example response from the API:

{
  "title": "Apple AirPods Pro (2nd generation)",
  "ean": "0194253397052",
  "url": "https://www.amazon.nl/dp/B0BDHWDR12",
  "image": "https://m.media-amazon.com/images/I/61SUj2aKoEL._AC_SL1500_.jpg",
  "offers": [
    {
      "seller": "Amazon.nl",
      "price": 279.0,
      "currency": "EUR",
      "condition": "New",
      "delivery": "Free delivery",
      "prime": true
    },
    {
      "seller": "ElectroShop NL",
      "price": 284.95,
      "currency": "EUR",
      "condition": "New",
      "delivery": "EUR 3.99 delivery",
      "prime": false
    }
  ]
}
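
Given a response shaped like the one above, picking the cheapest offer takes a few lines. This is a sketch that assumes the `offers`, `price`, and `condition` fields shown in the example; `cheapest_offer` is a hypothetical helper and the sample data is copied from the example response.

```python
from typing import Optional


def cheapest_offer(response: dict, condition: str = "New") -> Optional[dict]:
    """Return the lowest-priced offer with the given condition, or None."""
    candidates = [
        offer
        for offer in response.get("offers", [])
        if offer.get("condition") == condition and offer.get("price") is not None
    ]
    return min(candidates, key=lambda offer: offer["price"], default=None)


# Sample data taken from the example response above
sample = {
    "title": "Apple AirPods Pro (2nd generation)",
    "offers": [
        {"seller": "Amazon.nl", "price": 279.0, "currency": "EUR", "condition": "New"},
        {"seller": "ElectroShop NL", "price": 284.95, "currency": "EUR", "condition": "New"},
    ],
}

best = cheapest_offer(sample)
print(best["seller"], best["price"])  # Amazon.nl 279.0
```

Note that this compares the item price only. Delivery costs arrive as free text in the `delivery` field and are not included.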

Batch scraping multiple products

When you need to monitor prices for a catalog of products, loop over a list of EANs. Use httpx.AsyncClient with a semaphore to control concurrency and avoid overwhelming the API.

import asyncio
import os
import httpx
 
API_KEY = os.environ["SHOPPINGSCRAPER_API_KEY"]
BASE_URL = "https://api.shoppingscraper.com"
 
async def get_offers_async(
    client: httpx.AsyncClient,
    semaphore: asyncio.Semaphore,
    ean: str,
    site: str = "amazon.nl",
) -> dict:
    """Fetch offers for a single product with concurrency control."""
    async with semaphore:
        response = await client.get(
            f"{BASE_URL}/offers",
            params={"site": site, "ean": ean},
            timeout=httpx.Timeout(connect=5.0, read=60.0, write=5.0, pool=5.0),
        )
        response.raise_for_status()
        return {"ean": ean, "data": response.json()}
 
async def batch_scrape(eans: list[str], site: str = "amazon.nl") -> list[dict]:
    """Scrape prices for multiple EANs concurrently."""
    semaphore = asyncio.Semaphore(25)
    async with httpx.AsyncClient(
        headers={"x-api-key": API_KEY},
    ) as client:
        tasks = [get_offers_async(client, semaphore, ean, site) for ean in eans]
        results = await asyncio.gather(*tasks, return_exceptions=True)
 
    successful = []
    for result in results:
        if isinstance(result, Exception):
            print(f"Error: {result}")
        else:
            successful.append(result)
 
    return successful
 
# Example: scrape 5 products
eans = [
    "0194253397052",  # AirPods Pro
    "8710103990741",  # Philips shaver
    "4006381333931",  # Braun trimmer
    "8719743441361",  # JBL speaker
    "0889842747461",  # Xbox controller
]
 
results = asyncio.run(batch_scrape(eans))
 
print(f"\nSuccessfully scraped {len(results)} out of {len(eans)} products\n")
for item in results:
    data = item["data"]
    offers = data.get("offers", [])
    lowest = min((o["price"] for o in offers if o.get("price")), default=None)
    print(f"EAN {item['ean']}: {data.get('title', 'N/A')} — Lowest: EUR {lowest}")
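
To see what the semaphore does in isolation, the sketch below replaces the HTTP call with a short sleep and records how many simulated requests are in flight at once. Everything here is illustrative: `run_with_limit` and `fake_request` are hypothetical names, and no network traffic is involved.

```python
import asyncio


async def run_with_limit(num_tasks: int, limit: int) -> int:
    """Run num_tasks simulated requests and return the peak concurrency seen."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_request() -> None:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # Stand-in for the network round trip
            in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(num_tasks)))
    return peak


peak = asyncio.run(run_with_limit(num_tasks=10, limit=3))
print(f"Peak concurrent requests: {peak}")  # Peak concurrent requests: 3
```

All ten tasks are created up front, but only three hold the semaphore at any moment. The same applies to the batch scraper above, where the limit is 25.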

Price history tracking

To track prices over time, schedule your script with cron and store results in a SQLite database. This builds a price history you can use for trend analysis and alerts.

import os
import sqlite3
from datetime import datetime, timezone
 
import httpx
 
API_KEY = os.environ["SHOPPINGSCRAPER_API_KEY"]
BASE_URL = "https://api.shoppingscraper.com"
DB_PATH = "price_history.db"
 
def init_db() -> None:
    """Create the price history table if it does not exist."""
    conn = sqlite3.connect(DB_PATH)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS price_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            ean TEXT NOT NULL,
            site TEXT NOT NULL,
            title TEXT,
            seller TEXT,
            price REAL,
            currency TEXT,
            is_buybox INTEGER DEFAULT 0,
            scraped_at TEXT NOT NULL
        )
    """)
    conn.execute("""
        CREATE INDEX IF NOT EXISTS idx_ean_site_date
        ON price_history (ean, site, scraped_at)
    """)
    conn.commit()
    conn.close()
 
def store_offers(ean: str, site: str, data: dict) -> int:
    """Store all offers in the database. Returns the number of rows inserted."""
    conn = sqlite3.connect(DB_PATH)
    now = datetime.now(timezone.utc).isoformat()
    title = data.get("title", "")
    offers = data.get("offers", [])
    rows = 0
 
    for i, offer in enumerate(offers):
        conn.execute(
            """INSERT INTO price_history
               (ean, site, title, seller, price, currency, is_buybox, scraped_at)
               VALUES (?, ?, ?, ?, ?, ?, ?, ?)""",
            (
                ean,
                site,
                title,
                offer.get("seller", ""),
                offer.get("price"),
                offer.get("currency", "EUR"),
                1 if i == 0 else 0,  # First offer is typically the buy box winner
                now,
            ),
        )
        rows += 1
 
    conn.commit()
    conn.close()
    return rows
 
def scrape_and_store(ean: str, site: str = "amazon.nl") -> None:
    """Scrape offers and store them in the database."""
    response = httpx.get(
        f"{BASE_URL}/offers",
        params={"site": site, "ean": ean},
        headers={"x-api-key": API_KEY},
        timeout=httpx.Timeout(connect=5.0, read=60.0, write=5.0, pool=5.0),
    )
    response.raise_for_status()
    data = response.json()
    rows = store_offers(ean, site, data)
    print(f"[{datetime.now(timezone.utc).isoformat()}] Stored {rows} offers for EAN {ean}")
 
# Initialize the database
init_db()
 
# Scrape and store
eans_to_track = ["0194253397052", "8710103990741"]
for ean in eans_to_track:
    scrape_and_store(ean)
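
Once a few runs have been stored, the history can be read back with a grouped query. The sketch below returns the lowest price per scrape for one product. It assumes the `price_history` columns created above; `lowest_price_per_scrape` is a hypothetical helper, and the demo rows are made up purely to show the shape of the result.

```python
import sqlite3


def lowest_price_per_scrape(
    conn: sqlite3.Connection, ean: str, site: str
) -> list[tuple[str, float]]:
    """Return (scraped_at, lowest price) pairs ordered by timestamp."""
    cursor = conn.execute(
        """SELECT scraped_at, MIN(price)
           FROM price_history
           WHERE ean = ? AND site = ? AND price IS NOT NULL
           GROUP BY scraped_at
           ORDER BY scraped_at""",
        (ean, site),
    )
    return cursor.fetchall()


# Demo with an in-memory database and made-up rows (not real prices)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price_history "
    "(ean TEXT, site TEXT, seller TEXT, price REAL, scraped_at TEXT)"
)
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?, ?, ?)",
    [
        ("0194253397052", "amazon.nl", "Amazon.nl", 279.0, "2024-01-01T10:00:00+00:00"),
        ("0194253397052", "amazon.nl", "ElectroShop NL", 284.95, "2024-01-01T10:00:00+00:00"),
        ("0194253397052", "amazon.nl", "Amazon.nl", 274.5, "2024-01-01T11:00:00+00:00"),
    ],
)

history = lowest_price_per_scrape(conn, "0194253397052", "amazon.nl")
for scraped_at, price in history:
    print(scraped_at, price)
conn.close()
```

Sorting on `scraped_at` as text works here because every timestamp is written in the same UTC ISO 8601 format.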

Schedule this script with cron to run every hour:

# Edit your crontab
crontab -e
 
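# Add this line to run every hour. Cron runs jobs with /bin/sh and a minimal
# environment, so call the virtualenv's Python directly and pass the API key inline.
0 * * * * cd /path/to/project && SHOPPINGSCRAPER_API_KEY="your-api-key-here" .venv/bin/python price_tracker.py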

Buy box monitoring

The buy box is the most important position on an Amazon product page. The /buybox endpoint gives you direct access to the current buy box winner, price, and seller information.

import os
import httpx
 
API_KEY = os.environ["SHOPPINGSCRAPER_API_KEY"]
BASE_URL = "https://api.shoppingscraper.com"
 
def get_buybox(ean: str, site: str = "amazon.nl") -> dict:
    """Fetch the current buy box data for a product."""
    response = httpx.get(
        f"{BASE_URL}/buybox",
        params={"site": site, "ean": ean},
        headers={"x-api-key": API_KEY},
        timeout=httpx.Timeout(connect=5.0, read=60.0, write=5.0, pool=5.0),
    )
    response.raise_for_status()
    return response.json()
 
# Check the buy box on Amazon Netherlands
buybox = get_buybox(ean="0194253397052", site="amazon.nl")
 
print(f"Product: {buybox.get('title', 'N/A')}")
print(f"Buy Box Seller: {buybox.get('seller', 'N/A')}")
print(f"Buy Box Price: EUR {buybox.get('price', 'N/A')}")
print(f"Prime: {buybox.get('prime', False)}")
print(f"In Stock: {buybox.get('in_stock', 'N/A')}")

Example response:

{
  "title": "Apple AirPods Pro (2nd generation)",
  "ean": "0194253397052",
  "seller": "Amazon.nl",
  "price": 279.0,
  "currency": "EUR",
  "prime": true,
  "in_stock": true,
  "url": "https://www.amazon.nl/dp/B0BDHWDR12"
}

You can combine buy box monitoring with the price history tracker from the previous section to build a complete view of buy box ownership over time.
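
One way to use two consecutive buy box snapshots is to report what changed between them. The sketch below assumes the `seller` and `price` fields from the example response; `describe_buybox_change` is a hypothetical helper and the second snapshot is made up for illustration.

```python
from typing import Optional


def describe_buybox_change(previous: dict, current: dict) -> Optional[str]:
    """Return a short message if the seller or price changed, else None."""
    changes = []
    if previous.get("seller") != current.get("seller"):
        changes.append(f"seller {previous.get('seller')} -> {current.get('seller')}")
    if previous.get("price") != current.get("price"):
        changes.append(f"price {previous.get('price')} -> {current.get('price')}")
    return ", ".join(changes) if changes else None


# First snapshot mirrors the example response; the second is invented
earlier = {"seller": "Amazon.nl", "price": 279.0}
later = {"seller": "ElectroShop NL", "price": 274.95}

print(describe_buybox_change(earlier, later))
# seller Amazon.nl -> ElectroShop NL, price 279.0 -> 274.95
```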

Multi-country price comparison

One of the most powerful features of the ShoppingScraper API is the ability to scrape the same product across multiple Amazon country domains. This enables cross-border pricing analysis.

import asyncio
import os
import httpx
 
API_KEY = os.environ["SHOPPINGSCRAPER_API_KEY"]
BASE_URL = "https://api.shoppingscraper.com"
 
AMAZON_DOMAINS = [
    "amazon.nl",
    "amazon.de",
    "amazon.fr",
    "amazon.es",
    "amazon.it",
    "amazon.co.uk",
    "amazon.com",
]
 
async def compare_prices_across_countries(ean: str) -> list[dict]:
    """Compare prices for the same product across multiple Amazon domains."""
    semaphore = asyncio.Semaphore(25)
 
    async def fetch(client: httpx.AsyncClient, site: str) -> dict:
        async with semaphore:
            try:
                response = await client.get(
                    f"{BASE_URL}/offers",
                    params={"site": site, "ean": ean},
                    timeout=httpx.Timeout(connect=5.0, read=60.0, write=5.0, pool=5.0),
                )
                response.raise_for_status()
                data = response.json()
                offers = data.get("offers", [])
                lowest = min((o["price"] for o in offers if o.get("price")), default=None)
                return {
                    "site": site,
                    "title": data.get("title", "N/A"),
                    "lowest_price": lowest,
                    "currency": offers[0].get("currency", "N/A") if offers else "N/A",
                    "num_sellers": len(offers),
                }
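            except httpx.HTTPStatusError as e:
                # Any non-2xx status ends up here, not only "product not found"
                return {"site": site, "error": f"HTTP {e.response.status_code}"}
            except httpx.RequestError as e:
                # Timeouts and connection errors; uncaught, one failure would abort gather()
                return {"site": site, "error": f"Request failed: {type(e).__name__}"}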
 
    async with httpx.AsyncClient(headers={"x-api-key": API_KEY}) as client:
        tasks = [fetch(client, site) for site in AMAZON_DOMAINS]
        return await asyncio.gather(*tasks)
 
# Compare AirPods Pro prices across Amazon domains
results = asyncio.run(compare_prices_across_countries("0194253397052"))
 
print("Cross-border price comparison for EAN 0194253397052\n")
print(f"{'Domain':<20} {'Lowest Price':<15} {'Sellers':<10}")
print("-" * 45)
 
for r in results:
    if "error" in r:
        print(f"{r['site']:<20} {'N/A':<15} {'N/A':<10}")
    else:
        price_str = f"{r['currency']} {r['lowest_price']}" if r["lowest_price"] is not None else "N/A"
        print(f"{r['site']:<20} {price_str:<15} {r['num_sellers']:<10}")

Example output:

Cross-border price comparison for EAN 0194253397052

Domain               Lowest Price    Sellers
---------------------------------------------
amazon.nl            EUR 279.0       4
amazon.de            EUR 269.0       12
amazon.fr            EUR 274.99      8
amazon.es            EUR 269.0       6
amazon.it            EUR 274.99      7
amazon.co.uk         GBP 229.0       9
amazon.com           USD 249.99      15
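
Prices in EUR, GBP, and USD cannot be ranked against each other without exchange rates, which this tutorial does not cover. A simpler first step is to find the cheapest domain within each currency. The sketch below assumes the result dictionaries returned by `compare_prices_across_countries`; `cheapest_per_currency` is a hypothetical helper, and the rows are copied from the example output above.

```python
def cheapest_per_currency(results: list[dict]) -> dict[str, dict]:
    """Map each currency to the row with the lowest price in that currency."""
    best: dict[str, dict] = {}
    for row in results:
        if "error" in row or row.get("lowest_price") is None:
            continue
        currency = row["currency"]
        # Keep the first row on ties by replacing only on a strictly lower price
        if currency not in best or row["lowest_price"] < best[currency]["lowest_price"]:
            best[currency] = row
    return best


# Rows copied from the example output above
rows = [
    {"site": "amazon.nl", "lowest_price": 279.0, "currency": "EUR"},
    {"site": "amazon.de", "lowest_price": 269.0, "currency": "EUR"},
    {"site": "amazon.fr", "lowest_price": 274.99, "currency": "EUR"},
    {"site": "amazon.es", "lowest_price": 269.0, "currency": "EUR"},
    {"site": "amazon.it", "lowest_price": 274.99, "currency": "EUR"},
    {"site": "amazon.co.uk", "lowest_price": 229.0, "currency": "GBP"},
    {"site": "amazon.com", "lowest_price": 249.99, "currency": "USD"},
]

for currency, row in cheapest_per_currency(rows).items():
    print(f"{currency}: {row['site']} at {row['lowest_price']}")
```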

Full working script

Here is a complete script that combines the techniques above. It takes EANs from the command line or from a text file with one EAN per line, scrapes offers on the Amazon domains you specify, stores the results in SQLite, and prints a summary.

"""
Amazon price scraper using ShoppingScraper API.
 
Setup:
    source .venv/bin/activate && pip install httpx
 
Usage:
    export SHOPPINGSCRAPER_API_KEY="your-key"
    python amazon_scraper.py --eans 0194253397052,8710103990741 --sites amazon.nl,amazon.de
    python amazon_scraper.py --file products.txt --sites amazon.nl
"""
 
import argparse
import asyncio
import os
import sqlite3
import sys
from datetime import datetime, timezone
 
import httpx
 
API_KEY = os.environ.get("SHOPPINGSCRAPER_API_KEY", "")
BASE_URL = "https://api.shoppingscraper.com"
DB_PATH = "amazon_prices.db"
 
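def validate_ean(ean: str) -> str:
    """Validate and normalize an EAN to 13 digits."""
    ean = ean.strip()
    if not ean.isdigit():
        raise ValueError(f"EAN must be numeric, got: {ean}")
    if len(ean) > 13:
        raise ValueError(f"EAN must be at most 13 digits, got: {ean}")
    # Left-pad shorter codes (for example 12-digit UPCs) with zeros
    return ean.zfill(13)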
 
def init_db() -> None:
    """Initialize the SQLite database."""
    conn = sqlite3.connect(DB_PATH)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS scrape_results (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            ean TEXT NOT NULL,
            site TEXT NOT NULL,
            title TEXT,
            seller TEXT,
            price REAL,
            currency TEXT,
            condition TEXT,
            prime INTEGER DEFAULT 0,
            scraped_at TEXT NOT NULL
        )
    """)
    conn.execute("""
        CREATE INDEX IF NOT EXISTS idx_ean_site
        ON scrape_results (ean, site, scraped_at)
    """)
    conn.commit()
    conn.close()
 
async def fetch_offers(
    client: httpx.AsyncClient,
    semaphore: asyncio.Semaphore,
    ean: str,
    site: str,
) -> dict:
    """Fetch offers for one EAN on one site."""
    async with semaphore:
        try:
            response = await client.get(
                f"{BASE_URL}/offers",
                params={"site": site, "ean": ean},
                timeout=httpx.Timeout(connect=5.0, read=60.0, write=5.0, pool=5.0),
            )
            response.raise_for_status()
            return {"ean": ean, "site": site, "data": response.json(), "ok": True}
        except httpx.HTTPStatusError as e:
            return {"ean": ean, "site": site, "error": str(e), "ok": False}
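        except httpx.TimeoutException:
            return {"ean": ean, "site": site, "error": "Timeout", "ok": False}
        except httpx.RequestError as e:
            # Connection and other transport errors; uncaught, they would abort gather()
            return {"ean": ean, "site": site, "error": f"Request failed: {type(e).__name__}", "ok": False}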
 
def store_results(results: list[dict]) -> int:
    """Store successful results in the database."""
    conn = sqlite3.connect(DB_PATH)
    now = datetime.now(timezone.utc).isoformat()
    total_rows = 0
 
    for result in results:
        if not result["ok"]:
            continue
        data = result["data"]
        for offer in data.get("offers", []):
            conn.execute(
                """INSERT INTO scrape_results
                   (ean, site, title, seller, price, currency, condition, prime, scraped_at)
                   VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)""",
                (
                    result["ean"],
                    result["site"],
                    data.get("title", ""),
                    offer.get("seller", ""),
                    offer.get("price"),
                    offer.get("currency", ""),
                    offer.get("condition", ""),
                    1 if offer.get("prime") else 0,
                    now,
                ),
            )
            total_rows += 1
 
    conn.commit()
    conn.close()
    return total_rows
 
async def main(eans: list[str], sites: list[str]) -> None:
    """Run the scraper."""
    if not API_KEY:
        print("Error: SHOPPINGSCRAPER_API_KEY environment variable not set")
        sys.exit(1)
 
    # Validate EANs
    validated_eans = []
    for ean in eans:
        try:
            validated_eans.append(validate_ean(ean))
        except ValueError as e:
            print(f"Skipping invalid EAN: {e}")
 
    if not validated_eans:
        print("No valid EANs to scrape")
        sys.exit(1)
 
    print(f"Scraping {len(validated_eans)} EANs across {len(sites)} Amazon domains...")
    print(f"Total requests: {len(validated_eans) * len(sites)}\n")
 
    init_db()
 
    semaphore = asyncio.Semaphore(25)
    async with httpx.AsyncClient(headers={"x-api-key": API_KEY}) as client:
        tasks = [
            fetch_offers(client, semaphore, ean, site)
            for ean in validated_eans
            for site in sites
        ]
        results = await asyncio.gather(*tasks)
 
    # Store results
    rows_stored = store_results(results)
 
    # Print summary
    successful = [r for r in results if r["ok"]]
    failed = [r for r in results if not r["ok"]]
 
    print("Finished scraping")
    print(f"  Successful: {len(successful)}/{len(results)}")
    print(f"  Failed: {len(failed)}/{len(results)}")
    print(f"  Rows stored: {rows_stored}")
    print(f"  Database: {DB_PATH}\n")
 
    # Print price summary
    for ean in validated_eans:
        ean_results = [r for r in successful if r["ean"] == ean]
        if not ean_results:
            continue
        title = ean_results[0]["data"].get("title", "Unknown")
        print(f"  {title} (EAN: {ean})")
        for r in ean_results:
            offers = r["data"].get("offers", [])
            lowest = min((o["price"] for o in offers if o.get("price")), default=None)
            if lowest is not None:
                currency = offers[0].get("currency", "")
                print(f"    {r['site']}: {currency} {lowest} ({len(offers)} sellers)")
            else:
                print(f"    {r['site']}: No offers found")
        print()
 
    if failed:
        print("Failed requests:")
        for f in failed:
            print(f"  {f['ean']} on {f['site']}: {f['error']}")
 
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Scrape Amazon prices via ShoppingScraper API")
    parser.add_argument("--eans", type=str, help="Comma-separated list of EANs")
    parser.add_argument("--file", type=str, help="Path to a text file with one EAN per line")
    parser.add_argument(
        "--sites",
        type=str,
        default="amazon.nl,amazon.de",
        help="Comma-separated Amazon domains (default: amazon.nl,amazon.de)",
    )
    args = parser.parse_args()
 
    if args.eans:
        ean_list = [e.strip() for e in args.eans.split(",") if e.strip()]
    elif args.file:
        with open(args.file) as f:
            ean_list = [line.strip() for line in f if line.strip()]
    else:
        print("Provide --eans or --file")
        sys.exit(1)
 
    site_list = [s.strip() for s in args.sites.split(",") if s.strip()]
    asyncio.run(main(ean_list, site_list))
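
The `validate_ean` function in the script only checks that the input is numeric. EAN-13 codes also carry a check digit, so a typo can be caught before it costs an API credit. The sketch below applies the standard GTIN-13 check digit rule; `has_valid_ean13_checksum` is a hypothetical helper and is not part of the script above.

```python
def has_valid_ean13_checksum(ean: str) -> bool:
    """Check the GTIN-13 check digit of a 13-digit numeric string."""
    if len(ean) != 13 or not (ean.isascii() and ean.isdigit()):
        return False
    digits = [int(ch) for ch in ean]
    # Weights alternate 1, 3, 1, 3, ... over the first 12 digits, left to right
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


print(has_valid_ean13_checksum("0194253397052"))  # True
print(has_valid_ean13_checksum("0194253397053"))  # False
```

Whether to skip codes that fail the check or only log a warning is a design choice; this sketch does not know how the API treats such codes.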

Next steps

You now have everything you need to scrape Amazon prices with Python. Here are some ways to go further:

  • Explore the full API — The API documentation covers all 34 endpoints including search, page scraping, and AI tools
  • Set up automated monitoring — Use ShoppingScraper's built-in schedulers to run recurring price checks without managing your own cron jobs
  • Try the API Playground — Test any endpoint interactively in your browser at the API Playground
  • Monitor the buy box — Read more about buy box monitoring strategies
  • Compare across marketplaces — ShoppingScraper supports 65+ marketplaces beyond Amazon, including Google Shopping, Bol.com, and dozens of European retailers
  • Connect with Claude Desktop — Use the MCP integration to query pricing data directly from your AI assistant

Tim Hagebols

CTO & Co-founder

Full-stack engineer specializing in web scraping, API design, and AI applications for e-commerce. Built ShoppingScraper's infrastructure processing 1M+ daily product lookups.
