PerplexityBot
Perplexity crawler that indexes the public web.
What does PerplexityBot do?
PerplexityBot crawls and indexes web pages to power Perplexity's AI search and answer product. When users ask questions on Perplexity, the system draws on this index to generate answers with numbered, clickable citations linking back to source pages. Allowing PerplexityBot can drive direct referral traffic to your site through these citation links.
Should I allow and optimize for PerplexityBot to drive organic growth?
Every answer Perplexity generates includes numbered citation links back to source pages, so allowing PerplexityBot gives your content a chance to appear in those cited answers and receive direct referral traffic. Blocking the bot keeps your page content out of Perplexity's index, although the domain, headline, and a brief factual summary may still appear in limited form. If you want visibility in AI-powered search, keep PerplexityBot allowed.
Here's how to optimize for PerplexityBot:
- Allow PerplexityBot in your robots.txt to ensure full indexing
- Add a Sitemap directive in robots.txt so PerplexityBot can discover all your pages efficiently
- Use clear, descriptive page titles and meta descriptions since Perplexity may display these even for blocked pages
- Include structured data (JSON-LD) to help the crawler understand your content's context and relationships
- Ensure key content is in the initial HTML rather than loaded entirely via JavaScript, given partial JS rendering support
- Keep server response times fast to avoid timeouts during crawls
- Publish authoritative, well-sourced content that Perplexity's ranking system is likely to cite
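The first two items in the list above can be checked before deploying a robots.txt change. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `example.com` URLs and the `parse_robots` helper are assumptions made for illustration, and the parser only approximates how a real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site (example.com is a placeholder).
ROBOTS_TXT = """\
User-agent: PerplexityBot
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def parse_robots(text: str) -> RobotFileParser:
    """Parse robots.txt content held in a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(ROBOTS_TXT)

# True when the rules let PerplexityBot fetch the given URL.
print(parser.can_fetch("PerplexityBot", "https://example.com/guides/pricing"))

# Sitemap URLs declared in the file (site_maps() requires Python 3.8+).
print(parser.site_maps())
```

Running the same check against a draft file that contains `Disallow: /` for PerplexityBot is a quick way to catch an accidental block before it ships.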
Data Usage & Training
Content crawled by PerplexityBot is not used to pre-train foundation models. Perplexity states the data is indexed solely to support its search and answer product. Perplexity also says contractual terms prohibit third-party model vendors from training on Perplexity data. A separate agent, Perplexity-User, handles on-demand fetches triggered by individual user queries.
How PerplexityBot Accesses Content
Here's how PerplexityBot accesses your site and understands your content:
- Fetches HTML via standard HTTP requests using the user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
- Partial JavaScript rendering support
- Respects robots.txt Disallow and Allow directives
- Recognizes Sitemap directives in robots.txt
- Published IP ranges available at https://perplexity.ai/perplexitybot.json for verification
- Even when a page is blocked via robots.txt, Perplexity may still index the domain, headline, and a brief factual summary
PerplexityBot crawls continuously to build and refresh its search index. User-initiated fetches happen separately through the Perplexity-User agent when someone asks a question that triggers a live page retrieval.
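Because the index crawler and the on-demand fetcher are separate agents, it can help to label them separately in your logs. The sketch below is one possible approach, assuming each agent's name appears somewhere in its user-agent string; the `classify_perplexity_agent` function and the sample Perplexity-User string in the usage lines are hypothetical, not taken from Perplexity's documentation.

```python
import re
from typing import Optional

# Agent names as used in this article. Matching is case-insensitive as a
# precaution against small variations in the user-agent string.
_AGENT_PATTERNS = {
    "PerplexityBot": re.compile(r"perplexitybot", re.IGNORECASE),
    "Perplexity-User": re.compile(r"perplexity-user", re.IGNORECASE),
}


def classify_perplexity_agent(user_agent: str) -> Optional[str]:
    """Return the Perplexity agent name found in user_agent, or None."""
    for name, pattern in _AGENT_PATTERNS.items():
        if pattern.search(user_agent):
            return name
    return None


crawler_ua = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)"
)
print(classify_perplexity_agent(crawler_ua))
```

A user-agent match alone only shows what the client claims to be, so treat the label as a hint rather than proof of origin.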
How to Block or Control PerplexityBot
To block PerplexityBot via robots.txt:
User-agent: PerplexityBot
Disallow: /
For IP-based blocking, use the official IP ranges published at https://perplexity.ai/perplexitybot.json. Verify requests by matching both the user-agent string and source IP against these published prefixes. Be aware that blocking via robots.txt may not fully suppress your site from Perplexity results; the domain, headline, and a brief factual summary can still appear. Crawl-delay is not supported. To block user-initiated fetches (Perplexity-User), you may need authentication or additional access controls since those requests are triggered by real users.
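The verification step described above (match both the user-agent and the source IP) can be sketched with Python's standard `ipaddress` module. The `is_verified_perplexitybot` function is hypothetical, and the prefix in the usage lines is a reserved documentation range standing in for the real published prefixes, which you would load from the official JSON endpoint.

```python
import ipaddress
from typing import Iterable


def is_verified_perplexitybot(
    user_agent: str, remote_ip: str, published_prefixes: Iterable[str]
) -> bool:
    """True only if the UA names PerplexityBot AND the IP is in a published prefix."""
    if "perplexitybot" not in user_agent.lower():
        return False
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Malformed address (for example a broken forwarded header): treat as unverified.
        return False
    for prefix in published_prefixes:
        network = ipaddress.ip_network(prefix, strict=False)
        # Compare only within the same IP version (IPv4 vs IPv6).
        if addr.version == network.version and addr in network:
            return True
    return False


# Placeholder prefix from the documentation range 192.0.2.0/24, NOT a real Perplexity range.
example_prefixes = ["192.0.2.0/24"]
ua = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)"
)
print(is_verified_perplexitybot(ua, "192.0.2.10", example_prefixes))
```

Requiring both conditions means a spoofed user-agent from an unlisted IP is rejected, while traffic from a listed IP with an unrelated user-agent is not misattributed to the crawler.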
Common Issues & Troubleshooting
Watch out for these common problems when working with PerplexityBot:
- Requests occasionally arrive from IPs outside the published ranges, so stale IP-based rules can miss them. Update your IP lists regularly from the official JSON endpoint.
- User-agent string variants have been observed, so UA-only blocking may miss some requests.
- Blocked pages can still appear in Perplexity results with domain name, headline, and a brief summary.
- Cloudflare and other WAF services may need custom rules that combine UA and IP matching for reliable blocking.
- Perplexity-User (the on-demand fetcher) is a separate agent and requires its own robots.txt rules or access controls.
- Crawl-delay directives are not supported, so you cannot throttle PerplexityBot through robots.txt alone.
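To keep IP-based rules from going stale, the published list can be refreshed on a schedule. The sketch below assumes the JSON document has a top-level `prefixes` array whose entries carry `ipv4Prefix` or `ipv6Prefix` keys; that schema is an assumption, not something this page confirms, so inspect the real file before relying on it. Both function names are hypothetical.

```python
import ipaddress
import json
import urllib.request
from typing import List

# Endpoint named in this article.
IP_RANGES_URL = "https://perplexity.ai/perplexitybot.json"


def extract_prefixes(payload: str) -> List[str]:
    """Pull CIDR prefixes out of the JSON text (ASSUMED schema, see lead-in)."""
    data = json.loads(payload)
    prefixes = []
    for entry in data.get("prefixes", []):
        for key in ("ipv4Prefix", "ipv6Prefix"):
            value = entry.get(key)
            if value is None:
                continue
            try:
                # Normalize and validate; skip anything that is not a CIDR block.
                prefixes.append(str(ipaddress.ip_network(value, strict=False)))
            except ValueError:
                continue
    return prefixes


def fetch_prefixes(url: str = IP_RANGES_URL, timeout: float = 10.0) -> List[str]:
    """Download the published list and return its prefixes (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return extract_prefixes(response.read().decode("utf-8"))


# Offline example with placeholder documentation ranges, not real Perplexity prefixes.
sample = '{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv6Prefix": "2001:db8::/32"}]}'
print(extract_prefixes(sample))
```

Calling `fetch_prefixes()` from a scheduled job and regenerating firewall or WAF rules from its result is one way to apply the "update regularly" advice; if the list comes back empty, keeping the previous rules is safer than replacing them.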
Quick Reference
perplexitybot
User-agent: perplexitybot
Disallow: /



