Building PrerenderProxy in 2026

People sometimes ask why we built this in-house in 2025 instead of running Prerender.io or migrating the underlying sites to Next.js SSR. The boring answer is "we already had the infrastructure". The interesting answer is that the choices we made then are still defensible eighteen months in, and a few of them are non-obvious enough to write down.

The stack, in one diagram

The dotted line is the user path (unchanged). The solid green path is the crawler path: edge classifies, hits cache, falls back to a Puppeteer render of the real SPA when needed. A nightly warmer keeps the cache fresh.
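
To make the crawler path concrete, here is a minimal TypeScript sketch of the decision the edge makes. It is not the production VCL: the `handleRequest` function, the `SnapshotCache` interface, the `CRAWLER_UA` pattern, and the `render` callback are all illustrative names assumed for this example.

```typescript
// Sketch of the crawler path: classify, try the snapshot cache, fall back to a render.
// All names are hypothetical; none come from the PrerenderProxy codebase.

type Route = "user-passthrough" | "snapshot-hit" | "rendered-on-miss";

interface SnapshotCache {
  get(url: string): string | undefined;
  set(url: string, html: string): void;
}

// Assumed UA pattern for illustration only; UA matching alone is spoofable.
const CRAWLER_UA = /googlebot|bingbot|perplexitybot|chatgpt-user/i;

function handleRequest(
  url: string,
  userAgent: string,
  cache: SnapshotCache,
  render: (url: string) => string, // stands in for the Puppeteer render fleet
): { route: Route; html?: string } {
  if (!CRAWLER_UA.test(userAgent)) {
    // User path: untouched, the SPA is served as usual.
    return { route: "user-passthrough" };
  }
  const cached = cache.get(url);
  if (cached !== undefined) {
    return { route: "snapshot-hit", html: cached };
  }
  // Cache miss: render the real SPA once and store the snapshot.
  const html = render(url);
  cache.set(url, html);
  return { route: "rendered-on-miss", html };
}
```

In this sketch a second crawler request for the same URL is answered from the cache without invoking `render` again, which is the behaviour the nightly warmer relies on.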

Why Fastly VCL and not Cloudflare Workers

Cloudflare Workers is the cleaner runtime for new edge code in 2026 — Wasm-native, V8 isolates, a sane KV story. We did not use it. Three reasons, in order of importance:

Restart-based debugging. Fastly VCL is text. You edit it, you push it, and you can read the diff in your inbox. Debugging a Cloudflare Worker means shipping instrumented JavaScript into the Worker itself, and our sites are run by ops teams who do not want to ship JavaScript to debug a routing decision. VCL is the worst language we will tolerate, and for the edge-routing problem it is still better than another Worker.

Shielding by default. Fastly shields are first-class — one edge POP fetches from origin, the rest fetch from that POP. For a prerender layer this matters: a snapshot rendered for Frankfurt should not be re-rendered for Helsinki on cache miss. The shield model maps directly onto our cache-warming model. Workers can be wired to do the same thing with Cache API + Durable Objects but you have to build it.

Customer constraint. Most of our customers already had Fastly contracts. The cost of training their ops teams on a new edge runtime exceeded the cost of writing slightly more VCL.

Why Puppeteer and not Playwright

This was the closest call in the stack. Playwright is the better-engineered framework on every axis except one: in 2025 Puppeteer had the more mature ecosystem of stealth plugins (puppeteer-extra-plugin-stealth and friends), which mattered because some target sites fingerprint headless browsers and would have flagged our render fleet otherwise. Playwright has caught up in 2026; if we were starting fresh today the choice would tip the other way.

The bigger architecture decision was running a fleet of long-lived containers rather than spawning Chrome per request. Cold-starting a headless Chrome process costs hundreds of milliseconds; keeping a small pool of containers warm with several page slots per container is what lets you serve cache-miss renders inside a normal SEO-crawl timeout window. Chromium's memory growth under continuous use is real — the operating practice we settled on is a periodic restart on a combined render-count and elapsed-time schedule, tuned per host once you see your actual memory curve.
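
The restart schedule can be sketched as a small pure function. This is one reading of "combined render-count and elapsed-time", namely "whichever limit is reached first", and it adds a drain step so in-flight renders finish before the restart. The type names, the `nextAction` function, and the threshold values are assumptions for illustration, not tuned numbers.

```typescript
// Hypothetical recycle policy for a long-lived render container.
interface RecyclePolicy {
  maxRenders: number; // restart after this many renders...
  maxAgeMs: number;   // ...or after this much uptime, whichever comes first
}

interface ContainerStats {
  rendersServed: number;
  startedAtMs: number;
  activeSlots: number; // page slots currently rendering
}

type ContainerAction = "serve" | "drain" | "restart";

function nextAction(
  stats: ContainerStats,
  policy: RecyclePolicy,
  nowMs: number,
): ContainerAction {
  const overCount = stats.rendersServed >= policy.maxRenders;
  const overAge = nowMs - stats.startedAtMs >= policy.maxAgeMs;
  if (!overCount && !overAge) return "serve";
  // Past a limit: stop taking new work, restart once the slots are idle.
  return stats.activeSlots > 0 ? "drain" : "restart";
}

// Placeholder values; tune per host from the observed memory curve.
const examplePolicy: RecyclePolicy = { maxRenders: 500, maxAgeMs: 6 * 60 * 60 * 1000 };
```

Keeping the decision pure (stats in, action out) makes it easy to replay against recorded memory curves when tuning the limits per host.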

Cache warming is the unglamorous lever

The single biggest improvement we made in 2025 was a small sitemap crawler called ppcrawl. It walks each customer's sitemap on a 24-hour cycle, requests every URL through our edge with an internal bot header, and pre-populates the snapshot cache. The economics flip when your hit rate goes from 60 % to 95 %:

cache hit  →  edge-cache response, $0 marginal compute
cache miss →  Puppeteer render path, fractional cents in compute
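
As a worked example of that flip: moving from a 60 % to a 95 % hit rate cuts the miss rate from 40 % to 5 %, so the render fleet sees one eighth of the traffic. The sketch below does the arithmetic; the per-render price is a made-up placeholder, not a measured figure.

```typescript
// Expected render load and compute cost for a given cache hit rate.
// costPerRenderUsd is a hypothetical input; substitute your own measurement.
function renderLoad(
  crawlerRequests: number,
  hitRate: number,
  costPerRenderUsd: number,
): { misses: number; costUsd: number } {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError("hitRate must be between 0 and 1");
  }
  const misses = Math.round(crawlerRequests * (1 - hitRate));
  return { misses, costUsd: misses * costPerRenderUsd };
}

const before = renderLoad(1_000_000, 0.60, 0.002); // 400,000 renders
const after = renderLoad(1_000_000, 0.95, 0.002);  // 50,000 renders
```

Under the placeholder price of $0.002 per render, that is roughly $800 versus $100 per million crawler requests; the ratio of 8 holds whatever the real price is.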

The warmer also produces a built-in canary: if a sitemap URL renders empty, if the title changes overnight, if a Cloudflare rule starts rejecting our renderer's IP — we know about it before Googlebot does. Drift between bot HTML and user HTML is the single failure mode that gets dynamic rendering banned. Our policy is to alarm on it rather than tolerate it.
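
A minimal sketch of the canary comparison, assuming the warmer records a small summary per URL on each run. The `PageSnapshot` shape, the `DriftAlarm` reasons, the `minTextLength` threshold, and the choice to treat HTTP 403 as "renderer blocked" are all assumptions made for this example.

```typescript
// Hypothetical per-URL summary recorded by the warmer on each nightly run.
interface PageSnapshot {
  url: string;
  status: number;         // HTTP status the renderer received
  title: string;
  bodyTextLength: number; // visible text length after render
}

interface DriftAlarm {
  url: string;
  reason: "renderer-blocked" | "empty-render" | "title-changed";
}

function detectDrift(
  previous: PageSnapshot | undefined,
  current: PageSnapshot,
  minTextLength = 1, // assumed threshold below which a render counts as empty
): DriftAlarm[] {
  const alarms: DriftAlarm[] = [];
  if (current.status === 403) {
    // Assumption: a 403 means a WAF rule is rejecting the renderer.
    alarms.push({ url: current.url, reason: "renderer-blocked" });
    return alarms; // the body of a blocked response is not worth comparing
  }
  if (current.bodyTextLength < minTextLength) {
    alarms.push({ url: current.url, reason: "empty-render" });
  }
  if (previous !== undefined && previous.title !== current.title) {
    alarms.push({ url: current.url, reason: "title-changed" });
  }
  return alarms;
}
```

A title change is not always a fault, so in practice this kind of check feeds an alarm for a human to triage rather than an automatic rollback.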

What we'd do differently

Three things we got wrong the first time around.

We over-trusted UA-only matching for too long. The first six months of production used straight UA regex in VCL. A single customer being abused by scrapers identifying as Googlebot moved us to reverse-DNS verification. We should have shipped rDNS from day one. Every UA-only match is a footgun; we now treat each one as a code smell.
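
For reference, reverse-DNS verification is usually done as a forward-confirmed lookup: resolve the IP to a hostname, check the hostname belongs to the crawler's domain, then resolve that hostname back and confirm it returns the same IP. The sketch below is a minimal IPv4-only version. The `DnsResolver` interface is an assumption that mirrors the `reverse` and `resolve4` methods of Node's `dns.promises`, and the suffix list follows Google's published guidance as we understand it; where this check runs in the stack is left open.

```typescript
// Minimal resolver interface; Node's dns.promises satisfies it structurally.
interface DnsResolver {
  reverse(ip: string): Promise<string[]>;
  resolve4(hostname: string): Promise<string[]>;
}

// Hostname suffixes assumed for Google crawlers; verify against current docs.
const GOOGLE_SUFFIXES = [".googlebot.com", ".google.com", ".googleusercontent.com"];

async function isVerifiedGooglebot(ip: string, resolver: DnsResolver): Promise<boolean> {
  let hostnames: string[];
  try {
    hostnames = await resolver.reverse(ip); // step 1: PTR lookup
  } catch {
    return false; // no PTR record: treat as unverified
  }
  for (const host of hostnames) {
    const name = host.toLowerCase().replace(/\.$/, "");
    // step 2: the PTR name must sit under a known crawler domain
    if (!GOOGLE_SUFFIXES.some((suffix) => name.endsWith(suffix))) continue;
    try {
      // step 3: the forward lookup must return the original IP
      const addresses = await resolver.resolve4(name);
      if (addresses.indexOf(ip) !== -1) return true;
    } catch {
      // forward lookup failed: try the next hostname, if any
    }
  }
  return false;
}
```

Step 3 is what defeats a scraper who controls their own PTR record: they can make their IP claim to be `crawl.googlebot.com`, but they cannot make Google's zone point back at them. Results are worth caching per IP, since two DNS round trips per request would be expensive.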

The default viewport was too tall. The original Puppeteer config used a ~10 000-px viewport on the theory that "Googlebot does this to trigger lazy-loaded content". That was a documented practice in the 2018 era and has become steadily less necessary in 2024–2026: Google's WRS handles IntersectionObserver and ordinary lazy-load patterns. The default we now recommend is a modest viewport (1280–1920 px wide × 4 000–8 000 px tall), with explicit scroll-step automation only on routes that genuinely need it. Render time falls noticeably; cache hit rates do not change.
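
One way to express "modest default, scroll steps only where needed" is a per-route profile plus a helper that plans the scroll positions. This is a sketch under assumptions: the `RenderProfile` type, the `profileFor` lookup, and the idea of listing scroll routes as regular expressions are hypothetical. In a real render the viewport would be applied with Puppeteer's `page.setViewport` and each offset visited via `page.evaluate`.

```typescript
interface RenderProfile {
  width: number;
  height: number;
  scrollSteps: boolean; // whether to step through the page after load
}

// Default taken from the low end of the recommended range.
const DEFAULT_PROFILE: RenderProfile = { width: 1280, height: 4000, scrollSteps: false };

// Hypothetical lookup: only routes matching a pattern get scroll automation.
function profileFor(path: string, scrollRoutes: RegExp[]): RenderProfile {
  const needsScroll = scrollRoutes.some((pattern) => pattern.test(path));
  return { ...DEFAULT_PROFILE, scrollSteps: needsScroll };
}

// Scroll positions needed so every part of the page enters the viewport once.
function scrollOffsets(contentHeight: number, viewportHeight: number): number[] {
  if (viewportHeight <= 0) {
    throw new RangeError("viewportHeight must be positive");
  }
  const offsets: number[] = [];
  for (let y = viewportHeight; y < contentHeight; y += viewportHeight) {
    offsets.push(y);
  }
  return offsets;
}
```

For example, a 10 000-px page in a 4 000-px viewport needs two extra stops, `scrollOffsets(10000, 4000)` returning `[4000, 8000]`, while any page shorter than the viewport needs none.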

We undersold the value of structured data. When AI shopping queries started showing up in 2025, the sites with strong JSON-LD on every PDP outperformed sites with the same content but no schema by a meaningful margin. The lesson: rendering the HTML is necessary; rendering the right HTML matters more. The default Puppeteer render now explicitly validates that Product / Offer JSON-LD is present in the output.
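
A presence check of this kind can be sketched without a DOM: pull the `application/ld+json` script bodies out of the rendered HTML, parse them, and look for a `Product` node that carries an `Offer`. The function names below are illustrative, and the regex-based extraction is a simplification that assumes well-formed script tags; it checks presence only, not schema validity.

```typescript
type LdNode = Record<string, unknown>;

// Extract and parse every JSON-LD script body; malformed blocks are skipped.
function extractJsonLd(html: string): unknown[] {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: unknown[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(html)) !== null) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // invalid JSON in this block: ignore it and keep scanning
    }
  }
  return blocks;
}

// @type may be a string or an array of strings.
function typesOf(node: LdNode): string[] {
  const t = node["@type"];
  if (Array.isArray(t)) return t.map(String);
  return typeof t === "string" ? [t] : [];
}

// Flatten top-level arrays and @graph containers into a list of nodes.
function flattenNodes(block: unknown): LdNode[] {
  if (Array.isArray(block)) {
    let out: LdNode[] = [];
    for (const item of block) out = out.concat(flattenNodes(item));
    return out;
  }
  if (block !== null && typeof block === "object") {
    const node = block as LdNode;
    return [node].concat(flattenNodes(node["@graph"]));
  }
  return [];
}

function hasProductWithOffer(html: string): boolean {
  for (const block of extractJsonLd(html)) {
    for (const node of flattenNodes(block)) {
      if (typesOf(node).indexOf("Product") === -1) continue;
      for (const offer of flattenNodes(node["offers"])) {
        const isOffer = typesOf(offer).some((t) => t === "Offer" || t === "AggregateOffer");
        if (isOffer) return true;
      }
    }
  }
  return false;
}
```

A render that fails this check is a candidate for the same alarm path as an empty render, since the page may look fine to a human while being far less useful to a shopping query.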

The operating cost

The honest answer for self-hosted: infrastructure cost runs in the ~$50–$200 / month range for a small-to-medium site (the same ballpark our internal docs settle on), driven primarily by the headless-Chrome compute fleet, a modest CDN bill for the snapshot cache, and storage for the archive. Commercial alternatives like Prerender.io start at $90 / month for entry-level plans and scale up with page-count tiers. Self-hosting is the cheaper option only when the team and the CDN contract already exist; for customers with neither, Prerender.io or a Vercel / Cloudflare edge-rendering integration is the correct first choice — fewer moving parts to debug.

What we're working on next

Three things on the roadmap that came directly out of the May 2026 audit:

A managed "AI-bot policy" config block that splits training / search / live retrieval automatically — so teams don't replicate the WAF mistake we found on 62 of the 100 audited sites.
Per-bot rendering profiles: Googlebot still wants the JSON-LD-rich version; PerplexityBot answers shopping queries better with breadcrumbs prominent; ChatGPT-User benefits from Open Graph cards being inline. The rendered HTML can adapt to who is asking while carrying the same content as the user version.
A continuous "did anything change today" canary across the 100-site audit cohort. If Cloudflare ships a managed-rule update overnight, we want our customers to know within hours, not weeks.
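
To illustrate the first roadmap item, here is a rough sketch of what a policy block that separates training, search, and live-retrieval bots could look like. Nothing here is the shipped design: the `BotClass` and `BotAction` types, the example policy, and the `decide` function are hypothetical, and the user-agent tokens reflect our reading of vendor documentation, which should be re-checked before use.

```typescript
type BotClass = "training" | "search" | "live-retrieval";
type BotAction = "allow-prerendered" | "allow-raw" | "block";

// UA tokens and their assumed purpose; verify against each vendor's docs.
const AI_BOT_CLASSES: [string, BotClass][] = [
  ["GPTBot", "training"],
  ["ClaudeBot", "training"],
  ["CCBot", "training"],
  ["OAI-SearchBot", "search"],
  ["PerplexityBot", "search"],
  ["ChatGPT-User", "live-retrieval"],
  ["Perplexity-User", "live-retrieval"],
];

function classifyAiBot(userAgent: string): BotClass | undefined {
  const ua = userAgent.toLowerCase();
  for (const [token, botClass] of AI_BOT_CLASSES) {
    if (ua.indexOf(token.toLowerCase()) !== -1) return botClass;
  }
  return undefined; // not a known AI bot
}

// Placeholder policy: each class gets its own decision instead of one blanket rule.
const examplePolicy: Record<BotClass, BotAction> = {
  training: "block",
  search: "allow-prerendered",
  "live-retrieval": "allow-prerendered",
};

function decide(
  userAgent: string,
  policy: Record<BotClass, BotAction>,
): BotAction | "not-an-ai-bot" {
  const botClass = classifyAiBot(userAgent);
  return botClass === undefined ? "not-an-ai-bot" : policy[botClass];
}
```

The point of the split is that a site can decline training crawls without also disappearing from AI search and live retrieval. As with any UA-based rule, this classification would need to sit behind IP or rDNS verification to be trustworthy.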

The thread running through all three: rendering HTML is the easy part. Knowing what HTML to render — and when to alarm because the answer changed — is the work.

If you want to compare your site's bot-readability against the 100-site cohort: request the audit.