GPTBot — User-agent, IPs, robots.txt · PrerenderProxy Bot Directory

Collects public web pages to train future OpenAI foundation models. Distinct from OAI-SearchBot (search index) and ChatGPT-User (live retrieval) — blocking GPTBot opts you out of training only, not out of ChatGPT visibility.

Specs

Vendor	OpenAI
Category	MEMORY
robots.txt token	`GPTBot`
Renders JavaScript	No (HTTP fetch only)
Honors robots.txt	yes
Reverse-DNS pattern	`*.openai.com`
IP-range source	https://openai.com/gptbot.json
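
To confirm that a request claiming to be GPTBot really originates from a published range, the address can be tested against the CIDR prefixes in the IP-range file. The sketch below assumes the file is a JSON object with a `prefixes` array whose entries carry an `ipv4Prefix` or `ipv6Prefix` key; that layout is an assumption, so inspect the live file before relying on it. The `SAMPLE` data uses reserved documentation ranges, not real OpenAI addresses, and the helper names are hypothetical.

```python
import ipaddress

def extract_prefixes(doc: dict) -> list[str]:
    """Pull CIDR strings out of a parsed range file.

    Assumed layout (not confirmed by this page):
    {"prefixes": [{"ipv4Prefix": "..."}, {"ipv6Prefix": "..."}]}
    """
    out = []
    for entry in doc.get("prefixes", []):
        for key in ("ipv4Prefix", "ipv6Prefix"):
            if key in entry:
                out.append(entry[key])
    return out

def ip_in_prefixes(ip: str, prefixes: list[str]) -> bool:
    """Return True if `ip` falls inside any CIDR prefix in `prefixes`."""
    addr = ipaddress.ip_address(ip)
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix, strict=False)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return True
    return False

# Hypothetical sample built from reserved documentation ranges.
SAMPLE = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv6Prefix": "2001:db8::/32"},
    ]
}
```

In practice the parsed JSON from the IP-range source would replace `SAMPLE`, ideally cached and refreshed on a schedule rather than fetched per request.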

User-Agent string

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot
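
A minimal sketch of matching this header in server-side code, assuming the product token always takes the form `GPTBot/<major>.<minor>`. The function name `gptbot_version` is hypothetical. A User-Agent header is trivially spoofed, so a match here is a hint rather than proof; checking the source IP against the published ranges is the stronger test.

```python
import re

# Matches the product token "GPTBot/<major>.<minor>" anywhere in the header.
# The match is case-sensitive, so the lowercase "gptbot" in the info URL is ignored.
GPTBOT_RE = re.compile(r"\bGPTBot/(\d+)\.(\d+)\b")

def gptbot_version(user_agent: str):
    """Return (major, minor) if the header carries a GPTBot token, else None."""
    m = GPTBOT_RE.search(user_agent)
    return (int(m.group(1)), int(m.group(2))) if m else None

UA = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
      "compatible; GPTBot/1.1; +https://openai.com/gptbot")
```

Matching on the token rather than the full string keeps the check working across version bumps.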

Considerations

Does not execute JavaScript. A single-page app that renders only on the client (CSR) is invisible to GPTBot, regardless of how rich the hydrated DOM is.
GPTBot/1.1 rate-limits itself slightly more conservatively than the original 1.0 release. Both UA versions are still in circulation, so match on the `GPTBot` token rather than on a specific version.
Block this UA if you do not want your content used as training data. Allow it if you want your content to shape future models.
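
To illustrate the first consideration, the sketch below approximates what a crawler that does not run JavaScript can read: it collects the text nodes of the raw HTML response and skips `<script>` and `<style>` bodies. It is a rough stand-in built on Python's standard `html.parser`, not a description of how GPTBot actually parses pages, and the two sample documents are hypothetical.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect text nodes, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip = 0      # depth inside script/style elements
        self.chunks = []    # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def static_text(html: str) -> str:
    """Return the text visible in `html` without executing any script."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

# Hypothetical client-rendered shell: content exists only after JS runs.
CSR_SHELL = ('<html><body><div id="root"></div>'
             '<script>render("Pricing")</script></body></html>')
# Hypothetical server-rendered page: content is already in the markup.
SSR_PAGE = ('<html><body><div id="root"><h1>Pricing</h1></div>'
            '<script>hydrate()</script></body></html>')
```

An empty result for a page's raw response suggests its content depends on client-side rendering and would need server-side rendering or prerendering to be readable.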

robots.txt recipe

User-agent: GPTBot
Disallow: /
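
One way to sanity-check the recipe before deploying it is to run it through Python's standard `urllib.robotparser`. The sketch below assumes that parser's matching behaviour, which may differ in detail from OpenAI's own, and uses a hypothetical helper `can_fetch` with `example.com` as a placeholder site.

```python
from urllib.robotparser import RobotFileParser

# The recipe from this page, verbatim.
RECIPE = """\
User-agent: GPTBot
Disallow: /
"""

def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`.

    Pass the robots.txt token (e.g. "GPTBot"), not the full User-Agent header.
    """
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)
```

With this recipe, `can_fetch(RECIPE, "GPTBot", "https://example.com/page")` is false while the same call for `OAI-SearchBot` or `ChatGPT-User` is true, which matches the training-only opt-out described above.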

Sources: OpenAI · GPTBot reference · OpenAI · Crawler overview
