Robots.txt Generator
Set crawler rules visually and download a ready-to-use robots.txt file instantly.
Crawler Rules
User-agent: *
Disallow: /admin/
Upload this file to your website root: https://yourdomain.com/robots.txt
Frequently Asked Questions
What is a robots.txt file?
What is the difference between Allow and Disallow?
What is Crawl-delay?
Where should I upload my robots.txt?
Control Crawlers with robots.txt
A misconfigured robots.txt can devastate your SEO. Learn to configure it correctly with this complete guide.
What Is robots.txt?
robots.txt is a plain-text file that tells web crawlers (bots) which paths on your site they may or may not crawl. It is placed at the site root (example.com/robots.txt) and follows the Robots Exclusion Protocol. Major crawlers such as Googlebot and Bingbot respect this file.
Basic robots.txt Syntax
- User-agent: Target crawler to address (* = all crawlers)
- Disallow: URL path to block from crawling
- Allow: Permit specific paths inside an otherwise disallowed directory
- Sitemap: Point crawlers to your sitemap URL (recommended)
- Crawl-delay: Delay between requests in seconds (honored by some crawlers such as Bingbot; Googlebot ignores it)
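The directives above can also be read programmatically. The following is a minimal sketch using Python's standard urllib.robotparser module; the sample rules and the example.com URLs are illustrative assumptions, not taken from a real site. Note that this parser applies the first matching rule in a group, whereas Google uses the most specific (longest) match, so the Allow line is placed before the broader Disallow here.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only (assumed for this sketch).
SAMPLE = """\
User-agent: *
Allow: /admin/help/
Disallow: /admin/
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# can_fetch() returns True when the given user agent may crawl the URL.
settings_ok = parser.can_fetch("*", "https://example.com/admin/settings")
help_ok = parser.can_fetch("*", "https://example.com/admin/help/faq")

delay = parser.crawl_delay("*")   # seconds, or None if not set
sitemaps = parser.site_maps()     # list of URLs, or None (Python 3.8+)

print(settings_ok, help_ok, delay, sitemaps)
```

Because matching behavior differs between parsers, treat the result as a sanity check rather than a prediction of what any particular crawler will do.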
Critical Configuration Mistakes
The most dangerous mistake is Disallow: / under User-agent: *, which blocks your entire site from being crawled. Blocking CSS or JavaScript files prevents Google from rendering your pages the way visitors see them, which can hurt how they are indexed and ranked. Always test your robots.txt before deploying.
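A simple automated check can catch the blanket Disallow: / mistake before a file goes live. The sketch below is a rough heuristic, assuming a hypothetical helper name blocks_entire_site; it only looks for Disallow: / in a group addressed to all crawlers and does not evaluate Allow overrides or wildcards.

```python
def blocks_entire_site(robots_txt: str) -> bool:
    """Return True if a group addressed to all crawlers (*) contains 'Disallow: /'."""
    agents: list[str] = []
    in_rules = False  # True once the current group has started listing rules
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line that follows rules starts a new group.
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                return True
    return False
```

A deploy script could call this on the draft file and stop when it returns True.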
robots.txt Examples
Basic (allow all crawlers)
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
Block admin pages
User-agent: *
Disallow: /admin/
Disallow: /wp-admin/
Allow: /
Block specific bot
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
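To see how per-crawler groups behave, the "Block specific bot" example can be fed to Python's standard urllib.robotparser. This is a small sketch; the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# The GPTBot group applies only to GPTBot; every other crawler
# falls through to the "*" group.
gptbot_allowed = parser.can_fetch("GPTBot", "https://example.com/articles/")
googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/articles/")
print(gptbot_allowed, googlebot_allowed)  # False True
```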
robots.txt & Sitemap Integration
Adding a Sitemap directive to robots.txt lets Googlebot and other crawlers discover your sitemap on their own, without it having to be linked or submitted elsewhere. Consider also submitting your sitemap directly in Google Search Console so Google picks it up sooner.
Handling AI Crawlers
Since 2023, AI training crawlers and user-agent tokens such as OpenAI's GPTBot, Google's Google-Extended, and Meta's crawler have proliferated. To opt out of AI training on your content, disallow each one by its specific User-agent value in robots.txt. Compliance is voluntary, so this only stops crawlers that honor the file.
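When several AI crawlers should be blocked, their User-agent lines can share a single group, which the Robots Exclusion Protocol permits. The sketch below assumes a hypothetical helper named ai_block_group and uses only the two tokens named above; add other tokens after checking each vendor's documentation for the exact value.

```python
def ai_block_group(tokens: list[str]) -> str:
    """Build one robots.txt group that disallows everything for the given tokens."""
    lines = [f"User-agent: {token}" for token in tokens]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"

# Tokens taken from the text above; the list is an example, not exhaustive.
print(ai_block_group(["GPTBot", "Google-Extended"]))
```

The output can be appended to an existing robots.txt, separated from other groups by a blank line.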
Verify with Google Search Console
Google Search Console includes a robots.txt report that shows the file Google fetched and any parsing problems, and its URL Inspection tool shows whether a specific URL is blocked by robots.txt. Always test before deploying to production so that a typo does not accidentally block important pages.
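A local check can complement Search Console by running before the file is ever uploaded. The following sketch assumes a hypothetical helper blocked_urls and a hand-picked list of pages that must stay crawlable; it relies on Python's urllib.robotparser, whose rule matching is simpler than Google's, so it is a first line of defense rather than a replacement for testing in Search Console.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the URLs from `urls` that `agent` may not fetch under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]

# Hypothetical draft with a typo-style mistake: "/" where "/admin/" was intended.
draft = "User-agent: *\nDisallow: /\n"
must_crawl = ["https://example.com/", "https://example.com/products/"]

# A non-empty result means the draft would block pages that should be crawlable.
print(blocked_urls(draft, must_crawl))
```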
- /admin/: Exclude admin from index
- /wp-admin/: Keep WordPress admin private
- /search?: Prevent duplicate search query indexing
- /cdn-cgi/: Exclude Cloudflare CDN paths
- /print/: Exclude print pages (prevent duplicate content)
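Preset paths like these can be assembled into a file with a few lines of code. The sketch below is one possible way to do it, not the generator's actual implementation; the function name build_robots_txt and its parameters are assumptions made for illustration.

```python
from typing import Optional

def build_robots_txt(disallow: list[str], sitemap: Optional[str] = None,
                     user_agent: str = "*") -> str:
    """Assemble a robots.txt body from a list of paths to disallow."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallow]
    if sitemap:
        # Sitemap is independent of any group, so set it apart with a blank line.
        lines += ["", f"Sitemap: {sitemap}"]
    return "\n".join(lines) + "\n"

print(build_robots_txt(["/admin/", "/search?"], "https://example.com/sitemap.xml"))
```

Called with only ["/admin/"], the function returns the same two-line file shown in the Crawler Rules preview.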