Semrush On Page SEO Checker
On-page SEO checker crawler.
What does Semrush On Page SEO Checker do?
SemrushBot-SI crawls your pages to collect on-page content and SEO signals for Semrush's On Page SEO Checker and SEO Content Template reports. The data also feeds a broader suite of Semrush tools, including Site Audit, Backlink Analytics, Link Building, and SEO Writing Assistant. This can drive indirect referral traffic: Semrush surfaces your URLs, backlink data, and competitor comparisons in product reports, and paying Semrush users may click through to your site.
Should I allow and optimize for Semrush On Page SEO Checker to drive organic growth?
Allowing SemrushBot-SI helps your site appear accurately in Semrush's On Page SEO Checker, SEO Content Template, and related tools. Semrush users see your URLs in competitor analysis reports, backlink suggestions, and outreach lists, which can generate indirect referral traffic when those users click through. Blocking the bot won't hurt your Google rankings directly, but it may cause your site to show incomplete or outdated data in Semrush, reducing your visibility to marketers and SEO professionals who use the platform for research and link building.
Here's how to optimize for Semrush On Page SEO Checker:
- Allow SemrushBot-SI in your robots.txt to ensure accurate representation in Semrush reports
- Ensure your robots.txt is at the host root and returns HTTP 200 (a 4xx response is treated as no robots.txt)
- Use semantic HTML and descriptive title tags so Semrush can accurately parse on-page signals
- Include structured data markup to improve how your content is categorized in Semrush tools
- Set Crawl-delay to a reasonable value (under 10 seconds) if you need to throttle without fully blocking
- Avoid exposing sensitive parameters via GET forms, as Semrush may crawl those URLs
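As a rough way to sanity-check the robots.txt points above before deploying, the sketch below feeds a robots.txt body to Python's standard `urllib.robotparser` and reports whether SemrushBot-SI may fetch a page and what Crawl-delay it would see. The robots.txt text, the `check_policy` helper, and the example.com URL are illustrative assumptions, and Python's parser may not match user agents exactly the way Semrush's crawler does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: allow SemrushBot-SI and throttle it.
ROBOTS_TXT = """\
User-agent: SemrushBot-SI
Allow: /
Crawl-delay: 5
"""

def check_policy(robots_txt, agent="SemrushBot-SI", url="https://example.com/"):
    """Return (allowed, crawl_delay) for `agent` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url), parser.crawl_delay(agent)

allowed, delay = check_policy(ROBOTS_TXT)
print(allowed, delay)
```

This only checks the file's contents. It does not confirm that the file is served from the host root with HTTP 200.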
Data Usage & Training
Semrush uses crawled content to power its SEO tools and generate reports. It is unclear from public documentation whether crawled content is also used for AI model training. If this concerns you, contact [email protected] for clarification or block the bot via robots.txt.
How Semrush On Page SEO Checker Accesses Content
Here's how Semrush On Page SEO Checker accesses your site and understands your content:
- Fetches pages via HTTP using the user-agent string Mozilla/5.0 (compatible; SemrushBot-SI/0.97; +http://www.semrush.com/bot.html)
- Supports full JavaScript rendering
- Respects robots.txt Disallow, Allow, and wildcard patterns
- Supports a non-standard Crawl-delay directive (maximum enforced interval of approximately 10 seconds)
- Adapts crawl frequency based on server load
- Requires robots.txt at the host root returning HTTP 200 to be recognized
Crawl frequency: adaptive and scheduled. Semrush revisits URLs based on an adaptive crawl frontier and adjusts frequency according to server load. Changes to your robots.txt can take up to one hour or approximately 100 requests to propagate.
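To spot this crawler in access logs, one option is to pull the token and version out of the user-agent string. The following is a minimal sketch whose pattern is modeled only on the single user-agent string quoted above. The `parse_semrush_agent` helper is hypothetical, and other Semrush tokens may use a different format.

```python
import re

# Pattern modeled on the one user-agent string quoted in this section;
# tokens such as SiteAuditBot are assumed to be out of scope here.
SEMRUSH_UA = re.compile(r"\(compatible; (SemrushBot(?:-[A-Za-z]+)?)/([0-9.]+);")

def parse_semrush_agent(user_agent):
    """Return (token, version) if the string looks like a SemrushBot UA, else None."""
    match = SEMRUSH_UA.search(user_agent)
    return (match.group(1), match.group(2)) if match else None

ua = "Mozilla/5.0 (compatible; SemrushBot-SI/0.97; +http://www.semrush.com/bot.html)"
print(parse_semrush_agent(ua))
```

User-agent strings can be spoofed, so a match suggests SemrushBot-SI traffic but does not prove it.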
How to Block or Control Semrush On Page SEO Checker
To block SemrushBot-SI specifically, add the following to your robots.txt:

    User-agent: SemrushBot-SI
    Disallow: /

To block the main SemrushBot crawler, use:

    User-agent: SemrushBot
    Disallow: /

Semrush operates multiple bot tokens for different tools: SemrushBot, SemrushBot-SI, SiteAuditBot, SemrushBot-BA, SemrushBot-SWA, SemrushBot-OCOB, SemrushBot-FT, SemrushBot-ESI, SplitSignalBot, and RyteBot. A top-level SemrushBot rule does not necessarily cover the tool-specific tokens, so add a separate group for each token you want to block. This also lets you block only certain tools.
IP-based blocking is unreliable because Semrush does not publish consecutive IP blocks. For persistent issues, contact [email protected].
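Because Semrush uses a separate token per tool, it can be convenient to generate one robots.txt group per token you want to block. The sketch below assumes a hypothetical helper, `build_block_rules`, and uses two tokens from the list above as example input.

```python
def build_block_rules(tokens):
    """Return robots.txt text with one Disallow-all group per token."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    # A blank line separates groups; the file ends with a newline.
    return "\n\n".join(groups) + "\n"

rules = build_block_rules(["SemrushBot-SI", "SiteAuditBot"])
print(rules)
```

Append the generated text to your existing robots.txt rather than replacing the file, so rules for other crawlers are preserved.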
Common Issues & Troubleshooting
Watch out for these common problems when working with Semrush On Page SEO Checker:
- Robots.txt returning 4xx is treated as no robots.txt, meaning the bot will crawl freely
- Robots.txt returning 5xx causes Semrush to stop crawling entirely, which may not be your intent
- Changes to robots.txt can take up to one hour or approximately 100 requests to take effect
- IP-based blocking is unreliable because Semrush does not publish consecutive IP ranges
- GET forms may expose query parameters in crawled URLs, potentially surfacing unintended pages
- Blocking SemrushBot at the top level does not necessarily block tool-specific tokens like SemrushBot-SI or SiteAuditBot
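The status-code behavior in the list above can be summarized as a small lookup, which may help when auditing what a robots.txt response implies. This is a sketch of the behavior as described in this section only. The `robots_status_effect` function is hypothetical, and status codes the section does not mention are reported as such rather than guessed.

```python
def robots_status_effect(status):
    """Map a robots.txt HTTP status to the crawler behavior described above."""
    if status == 200:
        return "rules applied"
    if 400 <= status < 500:
        return "treated as no robots.txt; bot may crawl freely"
    if 500 <= status < 600:
        return "crawling stops"
    # Redirects and other codes are not covered by this section.
    return "not described in this section"

for code in (200, 404, 503):
    print(code, robots_status_effect(code))
```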
Quick Reference
User-agent token: semrushbot-si
User-agent: semrushbot-si
Disallow: /


