Stax

robots.txt Generator

Generate a robots.txt file with bot-blocking rules and a sitemap URL.

User-agent: *
Disallow: /admin
Disallow: /api
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://example.com/sitemap.xml

Place this file at the root of your domain: https://yourdomain.com/robots.txt

Control what search engines and AI bots can crawl

robots.txt is your first line of defence for managing crawler access. Block AI training bots, protect private pages, set crawl rate limits, and point crawlers to your sitemap — all with a few clicks.

Frequently asked questions

What is robots.txt?
robots.txt is a text file at the root of your website that tells search engine crawlers which pages or paths they are allowed to crawl and which they should stay out of. It is a voluntary standard: well-behaved bots respect it, but malicious bots may ignore it.
Can I block AI training bots with robots.txt?
Yes. The major AI crawlers respect robots.txt disallow rules: GPTBot (OpenAI), CCBot (Common Crawl, whose archive is widely used for AI training), anthropic-ai (Anthropic), and Google-Extended (Google's token for Gemini, formerly Bard) all honour Disallow: / under their own user agent.
Does Disallow: / block all bots?
Disallow: / under a specific User-agent blocks that bot from crawling your entire site. To block every bot, set User-agent: * followed by Disallow: /, as in the sketch below. Bear in mind that blocking Googlebot this way will stop your site from being indexed by Google.
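A minimal sketch of a robots.txt that turns away every compliant crawler from the whole site (swap the wildcard for a specific user agent to block only that bot):

User-agent: *
Disallow: /
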
What is Crawl-delay?
Crawl-delay asks a crawler to wait a specified number of seconds between requests, which stops aggressive bots from overloading your server (see the example below). It is not part of the original standard, and Googlebot ignores it; use Google Search Console's crawl rate settings for Google instead.
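As an illustration, the group below asks Bingbot to leave at least 10 seconds between requests; the bot name and delay value are placeholders, and crawlers that do not support Crawl-delay, including Googlebot, simply ignore the line:

User-agent: Bingbot
Crawl-delay: 10
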
Should I add my sitemap to robots.txt?
Yes. Adding a Sitemap: directive at the bottom of robots.txt tells all crawlers (not just Google) where your sitemap lives. This is in addition to submitting it in Google Search Console.
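For reference, the Sitemap directive takes an absolute URL and may be repeated if you publish more than one sitemap; the example.com URLs here are placeholders:

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/news-sitemap.xml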
