Published: May 2, 2026 · 18 min read

How to build a competitor price monitoring bot

Production-grade price scraper with Playwright, proxy rotation, Slack alerts and change history. Step by step, with full code.

Intermediate · Playwright · Node.js / TypeScript · PostgreSQL · Slack webhooks · Residential proxy

Competitor price monitoring is the bread and butter of e-commerce. In this tutorial we build a production-ready bot that fetches prices for 200-500 SKUs from 3-5 competitor stores, detects price changes, saves history, and sends Slack alerts when a price drops by more than a set threshold.

We say NO to solutions that die after 2 days: a single fetch() loop, one IP, no retry, no history. We say YES: idempotency, proxy rotation, schema validation, dead-letter queue.

What you need
  • Node.js 20+ or Python 3.11+
  • Basic JS/TS or Python knowledge
  • Access to a residential proxy (Smartproxy or Bright Data, from $7/GB)
  • Slack workspace (optional, for alerts)
  • PostgreSQL or SQLite for price history
Steps
  1. 01

    Project setup and dependencies

    Initialize a TypeScript project with Playwright and a few helpers:

    mkdir price-monitor && cd price-monitor
    npm init -y
    npm i playwright pg zod pino
    npm i -D typescript @types/node tsx
    npx playwright install chromium

    Why these dependencies:

    • playwright: browser automation with auto-wait
    • pg: PostgreSQL client (better than SQLite if you plan to scale)
    • zod: runtime schema validation (catches parser drift)
    • pino: structured logging (JSON-line for observability)
  2. 02

    Schema definition and target config

    Each target (store) has its own structure — we define it declaratively:

    // config/targets.ts
    export const targets = [
      {
        id: 'amazon-de',
        baseUrl: 'https://www.amazon.de/dp/',
        selectors: {
          price: '#corePrice_feature_div .a-offscreen',
          title: '#productTitle',
          availability: '#availability',
        },
        waitFor: '#productTitle',
        proxyTier: 'residential',
      },
      // ...
    ];
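
    To show how a declarative target might be consumed, here is a minimal sketch. The Target interface and the productUrl helper are illustrative names that do not appear in the original config, the 'datacenter' tier is an assumed second value, and the sketch assumes a SKU is the path segment appended to baseUrl (for Amazon, the ASIN).

```typescript
// Shape of one entry in config/targets.ts (hypothetical type name).
interface Target {
  id: string;
  baseUrl: string;
  selectors: { price: string; title: string; availability: string };
  waitFor: string;
  proxyTier: 'residential' | 'datacenter'; // 'datacenter' is an assumed option
}

// Hypothetical helper: build the product page URL for one SKU.
// encodeURIComponent keeps unusual characters in a SKU from altering the path.
function productUrl(target: Target, sku: string): string {
  return target.baseUrl + encodeURIComponent(sku);
}

const amazonDe: Target = {
  id: 'amazon-de',
  baseUrl: 'https://www.amazon.de/dp/',
  selectors: {
    price: '#corePrice_feature_div .a-offscreen',
    title: '#productTitle',
    availability: '#availability',
  },
  waitFor: '#productTitle',
  proxyTier: 'residential',
};

console.log(productUrl(amazonDe, 'B000000000')); // https://www.amazon.de/dp/B000000000
```

    Typing the config this way lets the compiler flag a target that is missing a selector before the scraper ever runs.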

    Schema for the result (Zod):

    import { z } from 'zod';
    export const PriceData = z.object({
      sku: z.string(),
      targetId: z.string(),
      price: z.number().positive(),
      currency: z.enum(['PLN', 'EUR', 'USD']),
      available: z.boolean(),
      scrapedAt: z.date(),
    });

    Schema validation is critical — when the store changes layout and a selector returns garbage, validation fails instead of pushing bad data into the database.
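
    The schema expects price as a positive number, while a selector returns text such as "1.299,99 €". A minimal sketch of that conversion follows. The parsePrice helper is a hypothetical name, and it rests on one assumption: a "." or "," followed by one or two digits at the very end is the decimal separator, and every other separator is grouping.

```typescript
// Hypothetical helper: turn raw price text into a number, or null if no price is present.
// Assumption: a trailing separator followed by 1-2 digits marks the decimal part.
function parsePrice(raw: string): number | null {
  // Keep digits and separators only, then drop any trailing separator ("12.99.").
  const cleaned = raw.replace(/[^\d.,]/g, '').replace(/[.,]+$/, '');
  if (!/\d/.test(cleaned)) return null;

  const match = cleaned.match(/^(.*?)[.,](\d{1,2})$/);
  const whole = (match ? match[1] : cleaned).replace(/[.,]/g, '');
  const fraction = match ? match[2] : '0';

  const value = Number(`${whole || '0'}.${fraction}`);
  return Number.isFinite(value) ? value : null;
}

console.log(parsePrice('1.299,99 €')); // 1299.99
console.log(parsePrice('$1,299.99')); // 1299.99
console.log(parsePrice('Currently unavailable')); // null
```

    Returning null instead of 0 matters here: z.number().positive() then rejects the record, which is the failure you want when a selector drifts.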

  3. 03

    Browser context and proxy rotation

    Playwright uses browser contexts — like separate incognito sessions with isolated cookies/storage. For each SKU we create a new context with a different proxy:

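    // scraper/browser.ts
    import { chromium } from 'playwright';

    // Playwright expects proxy credentials as separate fields, not embedded in the URL.
    const proxies = [
      { server: 'http://proxy1.smartproxy.com:7000', username: 'user', password: 'pass' },
      { server: 'http://proxy2.smartproxy.com:7000', username: 'user', password: 'pass' },
      // ...
    ];

    // Illustrative pool; keep it in sync with current browser releases.
    const userAgents = [
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36',
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36',
    ];
    const pick = <T>(items: T[]): T => items[Math.floor(Math.random() * items.length)];
    const getRandomUserAgent = () => pick(userAgents);

    export async function scrapeWithRotation(url: string, selectors: Record<string, string>) {
      const browser = await chromium.launch({ headless: true });
      try {
        // New context per call: isolated cookies/storage and its own proxy.
        const context = await browser.newContext({
          proxy: pick(proxies),
          userAgent: getRandomUserAgent(),
          viewport: { width: 1920, height: 1080 },
          locale: 'en-US',
        });
        // ... scrape logic
      } finally {
        // Close even when the scrape throws, so failed runs do not leak Chromium processes.
        await browser.close();
      }
    }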
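
    One way the "scrape logic" placeholder could be filled is sketched below. It is an assumption about the missing part, not the original implementation. PageLike is a hypothetical structural subset of Playwright's Page (goto, waitForSelector and textContent are real Page methods), extractFields is an illustrative name, and the timeouts are arbitrary.

```typescript
// Structural subset of Playwright's Page, so the sketch can also run against a fake.
interface PageLike {
  goto(url: string, options?: { waitUntil?: 'domcontentloaded'; timeout?: number }): Promise<unknown>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
  textContent(selector: string): Promise<string | null>;
}

// Hypothetical helper: load the page, wait for the anchor element, read every selector.
async function extractFields(
  page: PageLike,
  url: string,
  waitFor: string,
  selectors: Record<string, string>,
): Promise<Record<string, string | null>> {
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30_000 });
  // Wait for the anchor element so selectors are not read from a half-rendered page.
  await page.waitForSelector(waitFor, { timeout: 15_000 });

  const result: Record<string, string | null> = {};
  for (const [field, selector] of Object.entries(selectors)) {
    const text = await page.textContent(selector);
    result[field] = text === null ? null : text.trim();
  }
  return result;
}
```

    With a real Playwright page, a selector that never appears makes textContent throw on timeout, which surfaces as an error for the caller to retry. The raw strings still need parsing and schema validation before they are stored.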

    For protected sites (Amazon, eBay) add playwright-extra + puppeteer-extra-plugin-stealth to hide headless flags.

  4. 04

    Retry logic and dead-letter queue

    Every scrape can fail: timeout, captcha, network error, parser drift. Implement exponential backoff:

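    // logger (pino), db (pg Pool) and scrape() are assumed to be defined elsewhere in the project.
    const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

    async function scrapeWithRetry(target, sku, maxAttempts = 5) {
      let lastError;
      for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
          const data = await scrape(target, sku);
          return PriceData.parse(data); // validate
        } catch (err) {
          lastError = err instanceof Error ? err : new Error(String(err));
          logger.warn({ sku, attempt, err: lastError.message });
          if (attempt === maxAttempts) break; // no point sleeping after the final attempt
          // Exponential backoff: 2s, 4s, 8s, 16s, capped at 30s
          const delay = Math.min(2 ** attempt * 1000, 30000);
          await sleep(delay);
        }
      }
      // All attempts failed → dead-letter queue
      await db.query(
        'INSERT INTO scrape_dlq (sku, target, error, failed_at) VALUES ($1, $2, $3, NOW())',
        [sku, target.id, lastError.message]
      );
      throw lastError;
    }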
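
    The delay formula from the retry loop can be worked out per attempt, which makes the worst-case wait per SKU easy to see. In this sketch backoffDelay reproduces the loop's formula. backoffWithJitter is an addition that the original loop does not have, shown as one common way to keep parallel workers from retrying in lockstep. Both names are illustrative.

```typescript
// Same formula as the retry loop: 2^attempt seconds, capped at 30 s.
function backoffDelay(attempt: number, capMs = 30_000): number {
  return Math.min(2 ** attempt * 1000, capMs);
}

// Optional "full jitter" variant (not in the original loop):
// pick a random delay between 0 and the backoff value.
function backoffWithJitter(attempt: number, random: () => number = Math.random): number {
  return Math.floor(random() * backoffDelay(attempt));
}

const schedule = [1, 2, 3, 4, 5].map((attempt) => backoffDelay(attempt));
console.log(schedule); // [2000, 4000, 8000, 16000, 30000]
```

    Waiting only between attempts, a SKU that fails all five tries spends 2 + 4 + 8 + 16 = 30 seconds in backoff on top of the scrape timeouts themselves.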

    Pair the DLQ with a Slack alert that fires when the queue grows past a threshold, so you know something broke before the client asks "where are my reports".
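
    A minimal sketch of that threshold check, assuming the rows were already loaded from scrape_dlq for a recent time window. DlqRow, dlqAlertText and the default threshold of 20 are illustrative choices, not values from the original.

```typescript
// One row from scrape_dlq, limited to the columns the message needs.
interface DlqRow {
  sku: string;
  target: string;
  error: string;
}

// Returns Slack message text when failures reach the threshold, otherwise null.
function dlqAlertText(rows: DlqRow[], threshold = 20): string | null {
  if (rows.length < threshold) return null;

  // Count failures per target so the message shows where the problem sits.
  const perTarget = new Map<string, number>();
  for (const row of rows) {
    perTarget.set(row.target, (perTarget.get(row.target) ?? 0) + 1);
  }

  const breakdown = Array.from(perTarget.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([target, count]) => `${target}: ${count}`)
    .join(', ');

  return `DLQ holds ${rows.length} failed scrapes (${breakdown})`;
}
```

    A breakdown dominated by one target usually points at a layout change or a block on that store, while an even spread suggests a proxy or network problem.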

  5. 05

    Storage and price change detection

    Each scrape is a new row in the history table. Price change detection = comparison vs previous:

    CREATE TABLE price_history (
      id BIGSERIAL PRIMARY KEY,
      sku TEXT NOT NULL,
      target_id TEXT NOT NULL,
      price NUMERIC(10,2) NOT NULL,
      currency TEXT NOT NULL,
      available BOOLEAN NOT NULL,
      scraped_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
    );
    CREATE INDEX idx_price_lookup ON price_history (sku, target_id, scraped_at DESC);

    Query for drop detection:

    SELECT current.sku, current.target_id, current.price, prev.price as prev_price,
           ((current.price - prev.price) / prev.price * 100) as pct_change
    FROM price_history current
    JOIN LATERAL (
      SELECT price FROM price_history
      WHERE sku = current.sku AND target_id = current.target_id
        AND scraped_at < current.scraped_at
      ORDER BY scraped_at DESC LIMIT 1
    ) prev ON true
    WHERE current.scraped_at > NOW() - INTERVAL '1 hour'
      AND ((current.price - prev.price) / prev.price) < -0.05;

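    Each row in the result is a drop of more than 5% against the previous scrape and triggers a Slack alert. Note that the query looks back one hour while the job runs every 30 minutes, so the same drop can show up in two consecutive runs; narrow the window to match the schedule or record which rows have already been alerted.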
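
    The same comparison can be expressed over in-memory rows, which is handy for unit-testing the threshold without a database. This is a simplified sketch: it compares only the latest row per SKU and target with the one before it, whereas the SQL checks every row from the last hour. HistoryRow, Drop and findDrops are illustrative names.

```typescript
interface HistoryRow {
  sku: string;
  targetId: string;
  price: number;
  scrapedAt: Date;
}

interface Drop {
  sku: string;
  targetId: string;
  price: number;
  prevPrice: number;
  pctChange: number;
}

// Flags a pair when the latest price is more than 5% below the previous one.
function findDrops(rows: HistoryRow[], threshold = -0.05): Drop[] {
  // Group history by (sku, target), mirroring the WHERE clause of the lateral join.
  const byKey = new Map<string, HistoryRow[]>();
  for (const row of rows) {
    const key = `${row.sku}|${row.targetId}`;
    const group = byKey.get(key) ?? [];
    group.push(row);
    byKey.set(key, group);
  }

  const drops: Drop[] = [];
  for (const group of Array.from(byKey.values())) {
    if (group.length < 2) continue; // no previous price to compare against
    const sorted = [...group].sort((a, b) => a.scrapedAt.getTime() - b.scrapedAt.getTime());
    const current = sorted[sorted.length - 1];
    const prev = sorted[sorted.length - 2];
    const change = (current.price - prev.price) / prev.price;
    if (change < threshold) {
      drops.push({
        sku: current.sku,
        targetId: current.targetId,
        price: current.price,
        prevPrice: prev.price,
        pctChange: change * 100,
      });
    }
  }
  return drops;
}
```

    For example, a price going from 100 to 94 is a change of -6% and is flagged, while 100 to 96 is -4% and is not.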

  6. 06

    Slack alerts and scheduling

    Slack webhook (incoming webhook URL from Slack App):

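    async function sendSlackAlert(changes) {
      const res = await fetch(process.env.SLACK_WEBHOOK_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          text: `Price drops detected: ${changes.length} SKUs`,
          blocks: changes.map(c => ({
            type: 'section',
            text: { type: 'mrkdwn',
              // pg returns NUMERIC columns as strings, so convert before formatting
              text: `• ${c.sku} (${c.target_id}): ${c.prev_price} → ${c.price} (${Number(c.pct_change).toFixed(1)}%)`
            }
          }))
        })
      });
      // fetch() does not reject on HTTP errors, so check the status explicitly
      if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
    }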
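
    Slack accepts at most 50 blocks per message, so a run that detects more drops than that needs to be split across several webhook calls. A minimal sketch of the batching follows; chunk is an illustrative helper name.

```typescript
// Slack's limit on blocks in a single message.
const MAX_BLOCKS_PER_MESSAGE = 50;

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError('size must be at least 1');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage sketch: one webhook call per batch.
// for (const batch of chunk(changes, MAX_BLOCKS_PER_MESSAGE)) await sendSlackAlert(batch);
```

    Sending the batches sequentially keeps the messages in order in the channel.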

    Scheduling: cron on a VPS, or BullMQ + Redis for a distributed queue. Every 30 min:

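    // cron: */30 * * * *
    import { targets } from './config/targets';
    import { scrapeAll } from './scraper';
    import { detectChanges } from './detect';
    import { sendSlackAlert } from './alerts';
    import { getSkuList } from './skus'; // assumed module; the SKU source is not shown in this tutorial

    (async () => {
      await scrapeAll(targets, getSkuList());
      const changes = await detectChanges();
      if (changes.length > 0) await sendSlackAlert(changes);
    })().catch((err) => {
      // Exit non-zero so cron or the process supervisor registers the failed run
      console.error(err);
      process.exit(1);
    });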
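
    The tutorial imports scrapeAll without showing it. One possible shape is sketched here, with bounded concurrency and per-SKU failure isolation. The scrapeOne callback (standing in for scrapeWithRetry), the default concurrency of 5 and the RunSummary return value are all assumptions. The extra parameters mean the signature differs from the two-argument call above; they exist so the sketch can run without a browser.

```typescript
interface TargetRef {
  id: string;
}

interface RunSummary {
  ok: number;
  failed: number;
}

async function scrapeAll<T extends TargetRef>(
  targets: T[],
  skus: string[],
  scrapeOne: (target: T, sku: string) => Promise<unknown>,
  concurrency = 5,
): Promise<RunSummary> {
  // Flatten to one job per (target, sku) pair.
  const jobs: Array<{ target: T; sku: string }> = [];
  for (const target of targets) {
    for (const sku of skus) jobs.push({ target, sku });
  }

  const summary: RunSummary = { ok: 0, failed: 0 };
  let next = 0;

  // Each worker pulls the next job until the list is exhausted.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const job = jobs[next++];
      try {
        await scrapeOne(job.target, job.sku);
        summary.ok++;
      } catch {
        // One failed SKU must not abort the batch; scrapeOne is expected
        // to have logged the error and written it to the DLQ already.
        summary.failed++;
      }
    }
  }

  const workerCount = Math.min(concurrency, jobs.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return summary;
}
```

    The returned counts give the scheduler something to log per run, which is the raw material for success rate monitoring.
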
What it costs to run

Run cost for 500 SKUs, scrape every 30 min (24/7):

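  • Proxy (residential): ~$150-180/month at Smartproxy for ~22 GB/month. That volume corresponds to roughly 30 KB transferred per page load (30 KB × 500 SKUs × 48 runs/day × 30 days ≈ 21.6 GB). An unfiltered ~3 MB page load would instead come to about 2,160 GB/month, so block images, fonts and media and measure real transfer before budgeting.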
  • VPS (Hetzner CX22, 2vCPU/4GB): €5/month
  • Postgres (Supabase Free tier suffices up to 500MB): $0
  • Slack: $0 (free tier)

Total: ~$160-185/month operating cost for 500 SKUs. Scales linearly with proxy bandwidth.
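
The proxy estimate is a product of page weight, SKU count and run frequency, so it is worth recomputing with measured numbers. A minimal sketch follows; monthlyProxyGb is an illustrative helper, and decimal units (1 GB = 1,000,000 KB) are assumed.

```typescript
// Monthly proxy traffic in GB, using decimal units (1 GB = 1,000,000 KB).
function monthlyProxyGb(kbPerPage: number, skus: number, runsPerDay: number, days = 30): number {
  return (kbPerPage * skus * runsPerDay * days) / 1_000_000;
}

const lean = monthlyProxyGb(30, 500, 48); // 21.6 GB at ~30 KB per page load
const heavy = monthlyProxyGb(3000, 500, 48); // 2160 GB at ~3 MB per page load

console.log(`${lean} GB, about $${Math.round(lean * 7)} at $7/GB`);
console.log(`${heavy} GB, about $${Math.round(heavy * 7)} at $7/GB`);
```

The hundredfold gap between the two results is why measuring actual bytes per page load matters more than any other input to the budget.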

Common pitfalls
  • No schema validation: when the store changes layout, the scraper starts writing null as the price. Your analytics breaks silently.
  • Single IP: after 50-200 requests Cloudflare/Akamai will block you. Proxy rotation is required, not optional.
  • No retry logic: transient errors (timeout, network) blow up the whole batch instead of a single SKU.
  • Hard-coded selectors in code: when they change, you must redeploy. Better to keep them in a config file or database.
  • No success rate monitoring: you do not know that 30% of scrapes return garbage. Log success/fail per target.
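
For the last pitfall, a minimal sketch of per-target success tracking. ScrapeOutcome, successRateByTarget, unhealthyTargets and the 0.9 floor are illustrative choices rather than part of the original stack.

```typescript
// One record per scrape attempt that finished, successful or not.
interface ScrapeOutcome {
  targetId: string;
  ok: boolean;
}

// Share of successful scrapes per target, as a value between 0 and 1.
function successRateByTarget(outcomes: ScrapeOutcome[]): Map<string, number> {
  const totals = new Map<string, { ok: number; all: number }>();
  for (const outcome of outcomes) {
    const entry = totals.get(outcome.targetId) ?? { ok: 0, all: 0 };
    entry.all += 1;
    if (outcome.ok) entry.ok += 1;
    totals.set(outcome.targetId, entry);
  }

  const rates = new Map<string, number>();
  totals.forEach((entry, targetId) => rates.set(targetId, entry.ok / entry.all));
  return rates;
}

// Targets whose success rate fell below the floor (0.9 is an arbitrary example value).
function unhealthyTargets(outcomes: ScrapeOutcome[], floor = 0.9): string[] {
  const bad: string[] = [];
  successRateByTarget(outcomes).forEach((rate, targetId) => {
    if (rate < floor) bad.push(targetId);
  });
  return bad;
}
```

Feeding the result into the same Slack webhook turns a silent 30% failure rate into an alert.
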
Build yourself or hire?

The above stack handles 500-2000 SKUs from 3-5 stores without issues. For 10k+ SKUs you add: queue system (BullMQ), distributed workers, observability (Grafana + Prometheus), more advanced anti-bot (mobile proxy for the hardest targets).

If your use-case needs scale, compliance (GDPR), 99.9% uptime SLA, or you just do not have a developer who will own this — talk to us. We run this for 14+ clients in production.

Frequently asked questions
Is competitor price monitoring legal?
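Generally yes for publicly available price data, with caveats. In hiQ Labs v. LinkedIn (2022) the US Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act, but that is a US ruling on one statute, and the same litigation later found hiQ in breach of LinkedIn's user agreement. Scraping usually does violate store ToS (Amazon, eBay, etc.), so the most common consequence is IP or account blocking, though contract claims are possible. Compliance-wise: do not scrape behind login walls, do not overload servers, respect robots.txt for SEO-related crawling.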
How many SKUs can I monitor with one bot?
500-2000 SKUs from 3-5 stores on one VPS (Hetzner CX22, 2 vCPU/4 GB RAM) without issues. Above 5000 SKUs you add a distributed queue (BullMQ + Redis), workers across multiple machines, and a dedicated proxy per worker. The limit is not CPU/RAM but proxy bandwidth.
What happens when Amazon or eBay changes their page layout?
Your selectors break and the scrape returns null/undefined. That is why step 2 (schema validation with Zod) is critical: validation fails fast instead of pushing garbage to the database. Under retainer we fix it in 24-48h. Without a retainer you must monitor the success rate and fix it yourself.
Are residential proxies really necessary?
For Amazon, eBay and other major e-commerce sites, yes. Datacenter proxies get blocked by Cloudflare/Akamai/DataDome within 50-200 requests. Residential proxies from Smartproxy or Bright Data cost ~$7-12/GB, or $50-200 per month for a typical setup. For smaller stores without aggressive anti-bot protection, datacenter proxies suffice.