ExaBot
Exa agent fetching pages at user request for search and research.
What does ExaBot do?
ExaBot is the web crawling agent used by Exa to fetch pages, extract structured content and dense highlights, and build the indexes behind Exa Search and the Exa Search API. These indexes power search results, agent-facing products like exa-code (web context for coding agents), and various integrations that provide grounded context to AI agents. Exa's search results include citation links and source URLs, so allowing ExaBot can drive referral traffic back to your site.
Should I allow and optimize for ExaBot to drive organic growth?
ExaBot feeds Exa Search and the Exa Search API, both of which return results with citation links and source URLs. When AI agents or users query Exa, your content can appear as a grounded source with a direct link back to your site. Exa is increasingly used as a context layer for AI coding agents and research tools, giving your content exposure across a growing ecosystem of AI-powered products. Allowing ExaBot keeps your pages indexed and eligible to appear in these results.
Here's how to optimize for ExaBot:
- Allow exabot in your robots.txt so your pages stay eligible for indexing by Exa
- Use clean, semantic HTML so ExaBot can extract structured highlights effectively
- Include descriptive meta titles and descriptions to improve how your content is summarized
- Add structured data (JSON-LD) to help ExaBot understand page content and relationships
- Keep important content in the initial HTML rather than loading it entirely via JavaScript
- Ensure fast server response times to avoid timeouts during crawls
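The advice about keeping important content in the initial HTML can be spot-checked offline. The sketch below is only an illustration: it assumes a crawler that does not execute JavaScript, and the names `VisibleTextParser` and `missing_from_initial_html` are hypothetical helpers, not part of any Exa tooling. It strips script and style content from a raw HTML string and reports which key phrases are absent from the remaining text.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text that appears outside <script> and <style> elements."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_from_initial_html(html, key_phrases):
    """Return the phrases that do not appear in the server-delivered text."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [phrase for phrase in key_phrases if phrase.lower() not in text]


SAMPLE_HTML = (
    "<html><body><h1>Pricing plans</h1><div id='app'></div>"
    "<script>render('Customer reviews')</script></body></html>"
)
```

Passing the raw response body of a page (fetched without a browser) together with the headings or product names you care about gives a rough indication of what a non-rendering fetcher would see.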
Data Usage & Training
It is unclear whether content crawled by ExaBot is used to train AI models. Exa's public terms grant broad rights to use ingested content to provide and improve services, which could include model development. However, no explicit public policy confirms or denies training use. If this concerns you, contact [email protected] for clarification.
How ExaBot Accesses Content
Here's how ExaBot accesses your site and understands your content:
- Fetches HTML pages via standard HTTP requests
- Extracts dense, token-efficient highlights and structured content from pages
- Identifies itself with the user-agent substring exabot
- No published IP ranges or canonical full user-agent string
- JavaScript rendering capability is unknown
Crawl frequency is not explicitly documented, but Exa's product materials describe a continuously updated indexing system designed for near-real-time retrieval. Expect ongoing crawl activity rather than scheduled batches.
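Since the only documented identifier is the user-agent substring exabot, server access logs are one way to observe how often and where the crawler fetches. The following is a minimal sketch, assuming logs in the common "combined" format (request line, status, size, referer, user agent); the function name `exabot_hits` and the regular expression are illustrative assumptions and may need adjusting for your server's log layout.

```python
import re
from collections import Counter

# Matches the tail of a combined-format log line:
# "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def exabot_hits(log_lines, token="exabot"):
    """Count requests per path whose user agent contains the token."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        # Case-insensitive substring match on the user-agent field.
        if match and token in match.group("ua").lower():
            hits[match.group("path")] += 1
    return hits
```

For example, `exabot_hits(open("access.log"))` would return a `Counter` mapping each requested path to the number of matching requests, where `access.log` is a placeholder path.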
How to Block or Control ExaBot
To block ExaBot via robots.txt, add a rule targeting its user-agent substring:
User-agent: exabot
Disallow: /
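Before deploying a rule like the one above, it can be sanity-checked locally. This sketch uses Python's standard `urllib.robotparser`; it shows how that particular parser interprets the rule, which is not a guarantee of how ExaBot itself parses robots.txt. The helper name `is_allowed` and the example.com URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

RULES = """User-agent: exabot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("exabot", "SomeOtherBot"):
        print(agent, is_allowed(RULES, agent, "https://example.com/page"))
```

With these rules, the parser should report that exabot is disallowed while agents without a matching group remain allowed.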
Exa does not publish a dedicated robots.txt token beyond exabot, and their own robots.txt uses User-agent: * rules. If you need finer control, filter requests by the exabot user-agent substring at the application or WAF layer. Exa does not publish IP ranges, so IP-based blocking is unreliable. For verification or allowlist requests, contact [email protected].
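Application-layer filtering by user-agent substring could look like the following WSGI middleware. It is a sketch under stated assumptions: the site is served by a Python WSGI application, the names `block_user_agent`, `demo_app`, and `status_for` are hypothetical, and a 403 response is just one possible policy. A request that does not send the exabot substring will pass through untouched.

```python
def block_user_agent(app, token="exabot"):
    """Wrap a WSGI app and return 403 when the user agent contains token."""
    token = token.lower()

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if token in user_agent:
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            return [body]
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def status_for(app, user_agent):
    """Call a WSGI app with a minimal environ and return the status line."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    environ = {
        "REQUEST_METHOD": "GET",
        "PATH_INFO": "/",
        "HTTP_USER_AGENT": user_agent,
    }
    list(app(environ, start_response))  # consume the response body
    return captured["status"]
```

Wrapping an existing application is a one-liner, for example `application = block_user_agent(demo_app)`. The same substring test can be expressed as a WAF or reverse-proxy rule if you prefer to reject requests before they reach the application.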
Common Issues & Troubleshooting
Watch out for these common problems when working with ExaBot:
- The lack of published IP ranges makes IP-based blocking or allowlisting unreliable
- Exa's robots.txt uses User-agent: * rather than a dedicated token, so site owners must add explicit exabot rules
- If the crawler rotates user agents or uses multiple fetchers, simple UA-based blocking may not catch all requests
- JavaScript-rendered content may not be fully indexed since ExaBot's rendering capabilities are unknown
- No documented Crawl-delay support, so rate limiting may require server-side configuration
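Because Crawl-delay support is not documented, throttling has to happen on the server if it is needed at all. One possible approach is a sliding-window limiter applied only to requests whose user agent contains exabot. The class and function names below (`CrawlerRateLimiter`, `should_throttle`) are hypothetical, the limits are arbitrary example values, and the state is kept in process memory, so this sketch would not coordinate across multiple workers.

```python
import time
from collections import deque


class CrawlerRateLimiter:
    """Allows at most max_requests within any window_seconds interval."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._timestamps = deque()

    def allow(self):
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False


def should_throttle(user_agent, limiter, token="exabot"):
    """Return True when a matching crawler request exceeds the limit."""
    if token not in user_agent.lower():
        return False  # other clients are not limited by this check
    return not limiter.allow()
```

A request handler could call `should_throttle(request_user_agent, limiter)` and respond with HTTP 429 when it returns True. Whether ExaBot backs off in response to 429 is not documented, so treat that as an assumption to verify against your logs.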
Quick Reference
User-agent token: exabot
robots.txt block rule:
User-agent: exabot
Disallow: /
Similar Agents & Bots
ApifyWebsiteContentCrawler
Apify actor that crawls websites and extracts text content for AI models, LLM apps, and RAG pipelines.
ChatGPT-User
OpenAI browsing agent fetching pages at user request.
Claude-User
User-initiated fetches triggered by Claude sessions.
DuckAssistBot
DuckDuckGo assistant fetching content for answers.



