Tracking 100 SKUs twice a day, 1400 total requests, 0 IP bans.
The challenge
We monitored 100 Amazon product pages every 12 hours for a full week, collecting every price, stock, and rating change – 1400 scrape attempts against one of the hardest sites on the web. Success meant two things:
- Stay invisible. Evade TLS, token, and behavioural checks.
- Stay consistent. Capture every change despite layout shifts.
Methodology (quick stats)
<table class="GeneratedTable">
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Products tracked</td>
<td>100</td>
</tr>
<tr>
<td>Requests</td>
<td>1400</td>
</tr>
<tr>
<td>Duration</td>
<td>7 days</td>
</tr>
<tr>
<td>Proxy type</td>
<td>Massive residential</td>
</tr>
<tr>
<td>HTTP client</td>
<td><em>curl_cffi</em> (Python)</td>
</tr>
</tbody>
</table>
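The client setup in the table above can be sketched roughly as follows. This is a minimal illustration, not the exact code used in the test: the proxy URL is a placeholder, the `/dp/<ASIN>` URL pattern and the `"chrome"` impersonation target are assumptions, and `build_request_kwargs` and `fetch_product_page` are hypothetical helper names.

```python
# Placeholder gateway: substitute your provider's real endpoint and credentials.
PROXY_URL = "http://USERNAME:PASSWORD@proxy.example.com:8000"


def build_request_kwargs(proxy_url: str, timeout: float = 30.0) -> dict:
    """Assemble keyword arguments for curl_cffi's requests.get."""
    return {
        # Present a Chrome-like TLS fingerprint instead of the default curl one.
        "impersonate": "chrome",
        # Route both schemes through the same rotating residential gateway.
        "proxies": {"http": proxy_url, "https": proxy_url},
        "timeout": timeout,
    }


def fetch_product_page(asin: str, proxy_url: str = PROXY_URL) -> str:
    """Fetch one product page and return its HTML (raises on HTTP errors)."""
    # Imported lazily so the helper above works without curl_cffi installed.
    from curl_cffi import requests

    url = f"https://www.amazon.com/dp/{asin}"  # assumed URL pattern
    response = requests.get(url, **build_request_kwargs(proxy_url))
    response.raise_for_status()
    return response.text
```

Keeping the impersonation target fixed across a run also keeps the TLS fingerprint and the headers it sends consistent with each other, which matters for the checks described in the next section.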

Key findings
- 41.9% – biggest weekly price jump (guitar)
- 14% of SKUs changed price at least once
- 22% of SKUs showed a visible change (price, rating, or stock)
- 0 IP bans with rotating residential proxies + TLS impersonation
Amazon’s defence stack
- TLS fingerprint gate. Every request’s JA3 and JA4 hashes are checked against allowed Chrome/Firefox patterns; mismatches are scored or blocked before headers are even parsed.
- Encrypted browser token. A silent JavaScript challenge issues an aws-waf-token that bundles canvas, WebGL, timezone, and touch-event entropy; traffic without a fresh, valid token is challenged or dropped.
- AWS WAF Bot Control (ML-driven). Real-time machine-learning models watch click-paths and request cadence; anomalous sessions are forced through CAPTCHA or rate-limited automatically.
- Adaptive rate limiting. Limits aren’t just “N requests per IP”; Amazon can throttle on composite keys such as JA3 + method or ZIP + ASIN, stopping residential proxy swarms that rotate slowly.
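To make the composite-key idea concrete, here is a small sliding-window limiter keyed on a tuple of request attributes rather than on the IP alone. It is a conceptual sketch only: the class name, thresholds, and key layout are invented for illustration and say nothing about how Amazon actually implements its limits.

```python
from collections import defaultdict, deque


class CompositeKeyLimiter:
    """Sliding-window limiter keyed on a tuple of request attributes."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key: tuple, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Hypothetical key: same TLS fingerprint + method, whatever the source IP.
limiter = CompositeKeyLimiter(max_requests=2, window_seconds=60)
key = ("ja3-hash-abc", "GET")
results = [limiter.allow(key, now=t) for t in (0, 1, 2, 61)]
print(results)  # [True, True, False, True]
```

The third request is refused even if it arrives from a fresh IP, because the key never included the IP in the first place. That is why rotating proxies alone does not help when the fingerprint stays identical.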
Note: Avoid generic fake User-Agent libraries: they pull random UAs from public lists, and roughly half of the pool is mobile or Linux. If you build selectors against the Windows or Mac desktop page but the next request goes out as iPhone Safari, you’ll land on the mobile DOM, and your selectors will miss.
Key discoveries from the data
See the Amazon testing results chart below for a visual breakdown:

The table below gives more detail.
<table class="GeneratedTable">
<thead>
<tr>
<th>Insight</th>
<th>Detail</th>
</tr>
</thead>
<tbody>
<tr>
<td>Price</td>
<td>14% of SKUs repriced; top jump +41.9%.</td>
</tr>
<tr>
<td>Inventory</td>
<td>2% of SKUs toggled between “Only 1 left” and normal stock.</td>
</tr>
<tr>
<td>Rating</td>
<td>6% of SKUs shifted 0.1–0.3 stars.</td>
</tr>
</tbody>
</table>
Overall, 22% of SKUs changed price, rating, or stock at least once during the 7-day window.
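Percentages like these can be derived from the raw snapshots with a few lines of code. The sketch below assumes each scrape is stored as a `Snapshot` record (a hypothetical structure) and measures the "jump" as first versus last observed price in the window, which is one possible definition. The sample values are made up and are not the study's data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    price: Optional[float]
    rating: Optional[float]
    stock: Optional[str]


def summarize(history: Dict[str, List[Snapshot]]) -> dict:
    """history maps SKU -> snapshots in chronological order."""
    repriced, changed = set(), set()
    biggest_jump = 0.0
    for sku, snaps in history.items():
        prices = [s.price for s in snaps if s.price is not None]
        if len(set(prices)) > 1:
            repriced.add(sku)
        if len(prices) >= 2 and prices[0] > 0:
            # Assumed definition: first vs last observed price, in percent.
            jump = (prices[-1] - prices[0]) / prices[0] * 100.0
            biggest_jump = max(biggest_jump, jump)
        # Any field differing from the first snapshot counts as a visible change.
        if any(s != snaps[0] for s in snaps[1:]):
            changed.add(sku)
    total = len(history) or 1  # avoid division by zero on empty input
    return {
        "repriced_pct": 100.0 * len(repriced) / total,
        "changed_pct": 100.0 * len(changed) / total,
        "biggest_jump_pct": biggest_jump,
    }


# Made-up sample data for four SKUs, two snapshots each.
sample = {
    "SKU-A": [Snapshot(10.0, 4.5, "In Stock"), Snapshot(12.5, 4.5, "In Stock")],
    "SKU-B": [Snapshot(30.0, 4.0, "In Stock"), Snapshot(30.0, 4.0, "In Stock")],
    "SKU-C": [Snapshot(20.0, 4.5, "In Stock"), Snapshot(20.0, 4.4, "In Stock")],
    "SKU-D": [Snapshot(15.0, 3.9, "In Stock"), Snapshot(15.0, 3.9, "Only 1 left")],
}
summary = summarize(sample)
print(summary)
```

In this sample one SKU of four reprices (25%), three of four show some visible change (75%), and the largest jump is +25%.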
Lessons & best practices
- Residential proxies for tough targets – datacenter proxies are fine for low-risk sites, but against Amazon-class defences the extra retries cost more than the cheaper IPs save.
- Quality over volume – a curated pool of clean, high-reputation IPs outperforms thousands of mystery addresses.
- Behavioural mimicry beats speed – human-paced requests, short browsing sessions, and realistic fingerprints reduce blocks far more than brute-force frequency.
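Human-paced requests and short sessions can be approximated with randomised delays and small, shuffled batches. The sketch below is one way to do it; the 8-second base delay, the ±50% jitter, and the five-page session size are illustrative assumptions, not values measured in this test, and both function names are hypothetical.

```python
import random


def human_delays(n_requests, base_seconds=8.0, jitter=0.5, rng=None):
    """Return one pause per request, each base_seconds scaled by a
    random factor drawn uniformly from [1 - jitter, 1 + jitter]."""
    rng = rng or random.Random()
    return [base_seconds * rng.uniform(1 - jitter, 1 + jitter)
            for _ in range(n_requests)]


def split_into_sessions(asins, max_per_session=5, rng=None):
    """Shuffle the ASINs and cut them into short browsing sessions."""
    rng = rng or random.Random()
    shuffled = list(asins)
    rng.shuffle(shuffled)  # avoid hitting products in the same order every run
    return [shuffled[i:i + max_per_session]
            for i in range(0, len(shuffled), max_per_session)]


# Example: plan one run over 12 placeholder ASINs.
asins = [f"ASIN{i:03d}" for i in range(12)]
sessions = split_into_sessions(asins, rng=random.Random(0))
delays = human_delays(len(asins), rng=random.Random(0))
```

In a real run, each session would use its own proxy exit and the scraper would sleep for the corresponding delay between page fetches.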
Broader applications
These same tactics solve other high-defence scenarios:
- Shopify Plus stores that run flash-sale bot protection.
- Regional marketplaces with location-based pricing rules.
- Booking engines and finance portals that gate content by geography.

I am the co-founder and CEO of Massive. In addition to working on startups, I am a musician, athlete, mentor, event host, and volunteer.





