GPTBot
Collects public web pages to train future OpenAI foundation models. Distinct from OAI-SearchBot (search index) and ChatGPT-User (live retrieval) — blocking GPTBot opts you out of training only, not out of ChatGPT visibility.
Specs
| Vendor | OpenAI |
| Category | MEMORY |
| robots.txt token | GPTBot |
| Renders JavaScript | No (HTTP fetch only) |
| Honors robots.txt | yes |
| Reverse-DNS pattern | *.openai.com |
| IP-range source | https://openai.com/gptbot.json |
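The IP-range source in the table can be used to check whether a request claiming to be GPTBot comes from a published address. The sketch below is a minimal, offline illustration. It assumes the JSON document has a top-level `prefixes` list whose entries carry an `ipv4Prefix` or `ipv6Prefix` CIDR string; that layout is an assumption, so confirm it against the live file. The `SAMPLE` data uses reserved documentation ranges, not OpenAI's real ranges, and the function names are hypothetical.

```python
import ipaddress
import json


def load_prefixes(doc):
    """Collect every CIDR block listed under 'prefixes' (assumed layout)."""
    networks = []
    for entry in doc.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr, strict=False))
    return networks


def ip_in_published_ranges(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # Compare versions explicitly so IPv4 addresses are never tested
    # against IPv6 networks (and vice versa).
    return any(addr.version == net.version and addr in net for net in networks)


# Hypothetical sample using documentation-only ranges (NOT OpenAI's ranges).
SAMPLE = json.loads(
    '{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv6Prefix": "2001:db8::/32"}]}'
)

if __name__ == "__main__":
    nets = load_prefixes(SAMPLE)
    print(ip_in_published_ranges("192.0.2.10", nets))
    print(ip_in_published_ranges("198.51.100.1", nets))
```

In practice the JSON would be fetched from the URL in the table and cached, since the published ranges can change over time.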
User-Agent string
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot
Considerations
- Does not execute JavaScript. A CSR-only SPA is invisible to GPTBot regardless of how rich the hydrated DOM is.
- GPTBot/1.1 ships with slightly more polite rate-limiting than the original 1.0 release; both UAs are still in circulation.
- Block this UA if you do not want your content used as training data. Allow it if you want your content to shape future models.
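Because both the 1.0 and 1.1 UAs are in circulation, log analysis is safer when it matches on the `GPTBot` token rather than on one exact string. The following is a minimal sketch under that assumption; the function name `gptbot_version` is hypothetical. A User-Agent header is trivially spoofed, so a match is a hint, not proof of origin.

```python
import re

# Match the product token and capture its dotted version number.
# The match is case-sensitive, so the lowercase "gptbot" in the info URL is ignored.
GPTBOT_RE = re.compile(r"\bGPTBot/(\d+(?:\.\d+)*)")


def gptbot_version(user_agent):
    """Return the GPTBot version found in a User-Agent string, or None."""
    match = GPTBOT_RE.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    ua = (
        "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
        "compatible; GPTBot/1.1; +https://openai.com/gptbot"
    )
    print(gptbot_version(ua))
```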
robots.txt recipe
User-agent: GPTBot
Disallow: /
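To sanity-check the recipe before deploying it, the rules can be fed to Python's standard `urllib.robotparser`. This sketch shows that the two lines above block the `GPTBot` token while leaving `OAI-SearchBot` and `ChatGPT-User` unaffected, which matches the training-only opt-out described at the top. The URL `https://example.com/article` is a placeholder, and the standard-library parser only approximates how any real crawler interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The recipe exactly as it would appear in robots.txt.
RECIPE = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt, agent_token, url):
    """Parse robots.txt text and report whether the agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    for token in ("GPTBot", "OAI-SearchBot", "ChatGPT-User"):
        print(token, is_allowed(RECIPE, token, page))
```

Note that `can_fetch` expects the short robots.txt token, not the full User-Agent header.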
Sources: OpenAI · GPTBot reference · OpenAI · Crawler overview