May 2026 · Top-100 e-commerce audit live

Make your site bot-readable. Without rewriting it.

AI shopping assistants answer queries from whatever HTML your server returns first, and most JavaScript apps return an empty shell. PrerenderProxy is the open, edge-hosted prerender layer that gives every bot the same fully formed page your customers see. Built for the AI-crawler era that followed Google's deprecation of dynamic rendering.

What we found — top 100 e-commerce sites, May 2026

62 / 100
block at least one AI crawler outright
14 / 100
are fully AI-ready — every bot gets parseable HTML
30×
more pre-rendered bytes Amazon UK serves Googlebot vs a real user
4 / 6
of the sites that block only ClaudeBot are eBay properties

Raw data, screenshots, methodology, per-site cards → audit/2026-05-ecommerce-100

01 · What it is

One HTML response. Every crawler. No rewrite.

PrerenderProxy runs at the edge of your existing CDN. It detects legitimate search and AI crawlers (Google, Bing, GPTBot, ChatGPT-User, ClaudeBot, Claude-User, Perplexity, Applebot) and serves them a Puppeteer-rendered HTML snapshot of the same page your customers see. Your SPA stays exactly as it is, and every bot gets the same single HTML response.

Verified bots only

Reverse-DNS and IP-range validation against each vendor's published bot list. User-agent strings are trivially spoofed, so UA-only matching is a footgun, and we don't ship it.

Same content as your users

The rendered HTML is exactly what your hydrated SPA would show: no bot-only content, no drift. Compatible with Google's 2024 guidance and AI crawlers' 2026 reality.

Edge-cached snapshots

Snapshots warm proactively from your sitemap and live in your existing Fastly or Cloudflare cache. Crawlers get edge-cache-speed responses on hit; origin load drops; no SSR migration required.
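As a rough illustration of sitemap-driven warming, the sketch below pulls page URLs out of a standard sitemap and renders any that are not yet cached. The `render` callable and the dict-like `cache` are hypothetical stand-ins for the Puppeteer service and the edge cache; this is not PrerenderProxy's actual warmer.

```python
import xml.etree.ElementTree as ET
from typing import Callable, MutableMapping

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap <urlset>."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def warm(urls: list[str],
         render: Callable[[str], str],
         cache: MutableMapping[str, str]) -> list[str]:
    """Render each URL missing from the cache; return the URLs rendered."""
    warmed = []
    for url in urls:
        if url not in cache:
            cache[url] = render(url)  # hypothetical prerender call
            warmed.append(url)
    return warmed
```

Sitemap index files (a sitemap that lists other sitemaps) would need one extra level of fetching, which this sketch leaves out.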

Granular per-bot policy

Block training bots, allow live-retrieval bots, allow search-index bots: each decision is independent and per bot, with the bot's identity confirmed by reverse DNS. The right policy for AI shopping visibility in 2026.
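One way to picture a per-bot policy is a small lookup table keyed by what each crawler is used for. The sketch below is illustrative only: the purpose assigned to each bot is our reading of the vendors' public descriptions, and the `decide` function, the `"treat_as_user"` outcome and the override mechanism are hypothetical names, not PrerenderProxy configuration.

```python
from enum import Enum
from typing import Mapping, Optional


class Purpose(Enum):
    TRAINING = "training"
    LIVE_RETRIEVAL = "live_retrieval"
    SEARCH_INDEX = "search_index"


# Assumed mapping of bot name to purpose; check each vendor's docs before use.
BOT_PURPOSE = {
    "GPTBot": Purpose.TRAINING,
    "ClaudeBot": Purpose.TRAINING,
    "ChatGPT-User": Purpose.LIVE_RETRIEVAL,
    "Claude-User": Purpose.LIVE_RETRIEVAL,
    "Googlebot": Purpose.SEARCH_INDEX,
    "Bingbot": Purpose.SEARCH_INDEX,
    "Applebot": Purpose.SEARCH_INDEX,
}

# Default policy from the text: block training, allow the other two.
DEFAULT_POLICY = {
    Purpose.TRAINING: "block",
    Purpose.LIVE_RETRIEVAL: "allow",
    Purpose.SEARCH_INDEX: "allow",
}


def decide(bot_name: str,
           verified: bool,
           overrides: Optional[Mapping[str, str]] = None) -> str:
    """Return "allow", "block", or "treat_as_user" for one request."""
    # Unverified or unknown bots never get bot treatment.
    if not verified or bot_name not in BOT_PURPOSE:
        return "treat_as_user"
    # A per-bot override wins over the purpose-level default.
    if overrides and bot_name in overrides:
        return overrides[bot_name]
    return DEFAULT_POLICY[BOT_PURPOSE[bot_name]]
```

A brand that wants to be present in training data could pass `overrides={"GPTBot": "allow"}` without touching the defaults for other training bots.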

Observability built in

Every render emits a structured event. Hit rate, error rate, render time, and per-bot success — straight into Elasticsearch / Grafana / whatever you already run.
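To make "structured event" concrete, here is a minimal sketch of one JSON line per render plus a summary over a batch of lines. The field names (`bot`, `render_ms`, `cache_hit`, and so on) are assumptions for illustration, not the schema PrerenderProxy ships.

```python
import json
import time
from typing import Iterable, Optional


def render_event(bot: str, url: str, status: int, render_ms: float,
                 cache_hit: bool, now: Optional[float] = None) -> str:
    """Serialize one render as a single JSON log line."""
    event = {
        "ts": time.time() if now is None else now,
        "bot": bot,
        "url": url,
        "status": status,
        "render_ms": render_ms,
        "cache_hit": cache_hit,
        "ok": 200 <= status < 400,  # treat 2xx and 3xx as success
    }
    return json.dumps(event, sort_keys=True)


def summarize(lines: Iterable[str]) -> dict:
    """Compute cache hit rate and error rate from JSON log lines."""
    events = [json.loads(line) for line in lines]
    total = len(events)
    if total == 0:
        return {"hit_rate": 0.0, "error_rate": 0.0}
    hits = sum(1 for e in events if e["cache_hit"])
    errors = sum(1 for e in events if not e["ok"])
    return {"hit_rate": hits / total, "error_rate": errors / total}
```

Because each line is plain JSON, a log shipper can forward it unchanged to whichever store you already run.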

AI-friendly source

All config is plain JSON + VCL + Markdown. No bespoke DSL, no opaque admin UI. Designed for iterative AI-led development: any change is one PR, one diff, one human review.

02 · How it works

One request flow. Two outcomes. Same content.

A request hits your existing CDN. Crawlers are routed to a Puppeteer service that runs your real JavaScript; everyone else gets your SPA shell as normal. Both paths produce the same content — they just produce it in different places.

STEP 01

Edge detect

VCL classifies the request: real user → SPA path; verified bot → prerender path. Verification is rDNS + IP-range, never UA-only.
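The production classifier lives in VCL; the Python sketch below only illustrates the verification logic behind it. It shows a published-range check and forward-confirmed reverse DNS (resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP). Treating either check as sufficient is an assumption made here, since some vendors publish only one of the two. The function names are hypothetical, and the suffixes shown are the ones Google documents for Googlebot.

```python
import ipaddress
import socket
from typing import Callable, Iterable, Sequence

# Reverse-DNS hostname suffixes Google documents for Googlebot.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def ip_in_ranges(ip: str, cidrs: Iterable[str]) -> bool:
    """True if ip falls inside any CIDR block from a vendor's published list."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


def _reverse(ip: str) -> str:
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> list[str]:
    # IPv4 only; an IPv6-aware version would use socket.getaddrinfo.
    return socket.gethostbyname_ex(host)[2]


def verify_by_rdns(ip: str,
                   suffixes: Sequence[str],
                   reverse: Callable[[str], str] = _reverse,
                   forward: Callable[[str], list] = _forward) -> bool:
    """Forward-confirmed reverse DNS: IP -> hostname -> same IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False
    if not host.endswith(tuple(suffixes)):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False


def route(ip: str, cidrs: Iterable[str], suffixes: Sequence[str],
          **resolvers) -> str:
    """Return "prerender" for a verified bot, "spa" for everyone else."""
    verified = ip_in_ranges(ip, cidrs) or verify_by_rdns(ip, suffixes, **resolvers)
    return "prerender" if verified else "spa"
```

The resolver functions are injectable so the logic can be tested without live DNS. Note that the user-agent string never appears in the decision.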

STEP 02

Render or cache

Cache hit → edge-cache response, no compute. Cache miss → Puppeteer renders the page using your real SPA + JS bundle, then caches the HTML for next time.
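The hit-or-miss branch can be sketched as a small read-through cache. In this illustration an in-process dict stands in for the edge cache and the `render` callable stands in for the Puppeteer service; the class name and the 24-hour default TTL are assumptions, not product defaults.

```python
import time
from typing import Callable


class SnapshotCache:
    """Read-through cache: return a fresh snapshot or render and store one."""

    def __init__(self,
                 render: Callable[[str], str],
                 ttl_seconds: float = 86400.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._render = render
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict[str, tuple[str, float]] = {}

    def get(self, url: str) -> tuple[str, bool]:
        """Return (html, cache_hit) for url."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0], True   # hit: no render compute
        html = self._render(url)    # miss or expired: render once
        self._store[url] = (html, now)
        return html, False
```

The second element of the returned tuple is what a render event would record as its cache-hit flag.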

STEP 03

Serve fully-formed HTML

The bot gets the same DOM your customers would see after hydration — title, meta, JSON-LD, Product / Offer schema, full body — in one fetch.

STEP 04

Warm + measure

A nightly crawler refreshes every snapshot against your sitemap. Every render emits a metric line. Drift between bot HTML and user HTML raises an alert; it is never tolerated.
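A drift check could look something like the sketch below, which compares the visible text of the bot snapshot with the visible text of the hydrated user page and flags the pair when they diverge past a threshold. It assumes both HTML strings were captured after hydration; the 5% threshold and the function names are hypothetical, and script content (including JSON-LD) is deliberately ignored here and would need its own comparison.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script, style and noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(" ".join(data.split()))  # normalize whitespace


def visible_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def drift_ratio(bot_html: str, user_html: str) -> float:
    """0.0 means identical visible text; 1.0 means nothing in common."""
    matcher = SequenceMatcher(None, visible_text(bot_html), visible_text(user_html))
    return 1.0 - matcher.ratio()


def has_drifted(bot_html: str, user_html: str, threshold: float = 0.05) -> bool:
    return drift_ratio(bot_html, user_html) > threshold
```

Comparing text rather than raw markup keeps harmless differences, such as wrapper elements or whitespace, from triggering the alert.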

03 · Why now

The 2026 reality the deprecation didn't foresee.

Google deprecated dynamic rendering in 2024 on the assumption that everyone would migrate to SSR. Most didn't — and AI crawlers showed up that don't execute JavaScript at all. The technique came back under new names. We just write down honestly what the trade-offs are.

04 · Writing

Recent posts

PRACTICE

Stuck with legacy — fix Product schema, canonicals, soft 404s at the edge

When the CMS can't ship modern SEO + AI metadata, the prerender layer can. Per-bot response shaping with the cloaking-risk safety rails worked out.

12 min · 2026-05-19

OBSERVABILITY

What to log when you serve bots — schema + Combot.ai integration

Structured-log schema, Vector → ES → Combot pipeline, five anomaly alerts with thresholds, GDPR PII handling.

13 min · 2026-05-19

ENGINEERING

Reverse-DNS bot verification — recipes for seven platforms

nginx, Cloudflare, Fastly, Vercel, AWS, Apache, and OpenResty Lua. The three-step protocol, the vendor IP-range JSON sources, the operational tips most posts skip.

14 min · 2026-05-19

POSITION

Don't block GPTBot if you're a brand

The block-training framework was written for publishers. For brands, the math inverts. The parametric-recall mechanism, the 2026 numbers, the strongest counter-argument addressed.

11 min · 2026-05-19

DEEP DIVE

The strange afterlife of dynamic rendering, 2018–2026

Google deprecated it. The AI crawlers brought it back. A history of a technique that refused to die — with the 2027 forecast.

Long read · 2026-05-19

RESEARCH

62 of 100 — what we learned auditing the world's largest stores

The single biggest finding from our May 2026 audit: most large e-commerce sites have voluntarily made themselves invisible to AI shopping.

Findings · 2026-05-19

DECISION

Should you block AI bots? A 3-step decision framework

Training vs live retrieval vs search index. They're not the same thing. How to write a robots.txt that blocks the first and welcomes the second.

How-to · 2026-05-19

BEHIND THE SCENES

Building PrerenderProxy in 2026

Why we chose Puppeteer over Playwright, what Fastly VCL lets us do that Cloudflare doesn't, and the operating cost of a self-hosted prerender layer.

Engineering · 2026-05-19


05 · Get in touch

Want the full audit + a same-site benchmark?

We send our full 100-site audit dataset (every site × every bot, raw HTML, screenshots) and a one-page benchmark of your own site against the cohort. No pricing pages yet — we're picking the first ten partners by hand.