PerplexityBot
| Vendor | Perplexity AI |
| Type | Search-index crawler |
| robots.txt token | PerplexityBot |
| JavaScript rendering | No — HTTP-only |
| Honors robots.txt | Partial (reported ignoring robots.txt directives in 2024; see Quirks) |
| Vendor docs | docs.perplexity.ai/guides/bots |
User-Agent strings
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
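As a rough illustration of how the string above might be recognized in server logs, here is a minimal Python sketch. The function name `perplexitybot_version` is hypothetical, and the regex assumes the `PerplexityBot/<version>` token keeps the shape shown above. A User-Agent header is trivially spoofable, so a match only says what the client claims to be.

```python
import re
from typing import Optional

# Matches the product token and its version, e.g. "PerplexityBot/1.0".
# Assumption: the token format stays as in the User-Agent string above.
_PERPLEXITYBOT_RE = re.compile(r"\bPerplexityBot/(\d+(?:\.\d+)*)")


def perplexitybot_version(user_agent: str) -> Optional[str]:
    """Return the claimed PerplexityBot version, or None if the token is absent."""
    match = _PERPLEXITYBOT_RE.search(user_agent)
    return match.group(1) if match else None
```

The function returns `None` for ordinary browser strings, so it can be used directly as a filter condition when scanning access logs.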
Purpose
Builds the index Perplexity's search-grounded answers draw from. Pages crawled by PerplexityBot are eligible to appear as cited sources in Perplexity's answer cards.
Network identity
- Hostname pattern: *.perplexity.ai
- IP list: published in Perplexity's docs; updates less frequent than OpenAI's or Anthropic's.
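One way to act on the hostname pattern is a forward-confirmed reverse DNS check, sketched below in Python. This is an assumption-laden sketch, not Perplexity's documented procedure: it assumes the crawler's IPs carry PTR records under `perplexity.ai`, and the helper names `hostname_matches` and `verify_ip` are hypothetical. Where PTR records are missing, the published IP list is the fallback.

```python
import socket

# Suffix derived from the *.perplexity.ai hostname pattern above.
PERPLEXITY_SUFFIX = ".perplexity.ai"


def hostname_matches(hostname: str) -> bool:
    """True if hostname is a subdomain of perplexity.ai."""
    host = hostname.rstrip(".").lower()
    return host.endswith(PERPLEXITY_SUFFIX) and len(host) > len(PERPLEXITY_SUFFIX)


def verify_ip(ip: str) -> bool:
    """Forward-confirmed reverse DNS for an IPv4 address.

    Assumption: PerplexityBot IPs have PTR records under perplexity.ai.
    Steps: PTR lookup, suffix check, then a forward lookup that must
    resolve back to the same IP.
    """
    try:
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)
        if not hostname_matches(hostname):
            return False
        # gethostbyname_ex resolves IPv4 only.
        _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    return ip in addresses
```

The suffix check keeps the leading dot so that look-alike domains such as `evilperplexity.ai` do not pass.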
In our audit
Blocked at 62/100 sites as part of the AI-bot cluster. Where allowed, received the same content as the other AI bots — Perplexity's bot infrastructure does not perform any special handshake.
At amazon.co.uk, shopping.yahoo.co.jp, and the other dynamic-rendering sites, PerplexityBot is not in the trusted-crawler allowlist — it receives the small shell while Googlebot/Bingbot/Applebot receive the pre-rendered version.
How to allow / block
To allow indexing:
User-agent: PerplexityBot
Allow: /
To block:
User-agent: PerplexityBot
Disallow: /
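To sanity-check rules like the ones above before deploying them, a short Python sketch using the standard library's `urllib.robotparser` can help. The helper name `allowed` and the `example.com` URL are placeholders. This shows how Python's parser reads the rules, which is not necessarily how PerplexityBot itself interprets them.

```python
from urllib.robotparser import RobotFileParser

ALLOW_RULES = """\
User-agent: PerplexityBot
Allow: /
"""

BLOCK_RULES = """\
User-agent: PerplexityBot
Disallow: /
"""


def allowed(robots_txt: str, token: str, url: str) -> bool:
    """Evaluate robots.txt text for a crawler token and URL.

    Pass the robots.txt token (e.g. "PerplexityBot"), not the full
    User-Agent header: the parser keeps only the text before the first "/".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    print(allowed(ALLOW_RULES, "PerplexityBot", url))
    print(allowed(BLOCK_RULES, "PerplexityBot", url))
```

Because the block rule names only PerplexityBot, other crawlers are unaffected by it unless they have their own group or a `*` group applies.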
Quirks
- 2024 controversy: Perplexity was reported to fetch content from sites that had disallowed PerplexityBot in robots.txt, by routing requests through residential IPs with a generic Chrome UA. Perplexity has since updated its policies, but the incident left trust friction with some publishers.
- HTTP-only for indexing. Live retrieval uses Perplexity-User (next file).