Robots.txt AI Bot Checker: see which AI crawlers you allow

Paste your domain and we read your robots.txt, then show — bot by bot — whether you allow or block GPTBot, ClaudeBot, Google-Extended, PerplexityBot and every other major AI crawler. Find out if you're visible to AI search before your competitors do. No sign-up.

Why AI bots in your robots.txt matter for visibility

Your robots.txt is the first file a well-behaved crawler requests, and AI companies now run their own crawlers with their own user-agent tokens. OpenAI alone uses GPTBot for training, OAI-SearchBot for ChatGPT Search, and ChatGPT-User for on-demand fetches. Anthropic, Google, Perplexity, Common Crawl and others each have their own. A single Disallow rule can decide whether your content feeds these systems and whether they can cite it.

Get it wrong in either direction and it costs you: block the search crawlers and your brand vanishes from AI answers; leave training crawlers open when you meant to opt out and your content trains models for free. A quick check tells you exactly where you stand across every major AI bot.

How to read your result

Allowed

The crawler can access your site root. For AI search bots like OAI-SearchBot, Claude-SearchBot and PerplexityBot, this is what keeps you eligible to be cited in AI answers.

Partial

The crawler can reach your site, but your robots.txt disallows some paths for it. Usually fine — just confirm you're not hiding pages you want surfaced in AI search.

Blocked

A Disallow: / rule stops this crawler at the door. Intentional for training opt-out, but a problem if it's a search crawler you wanted to stay visible to.
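The three labels can be approximated in a few lines of Python. This is a minimal sketch, not the checker's actual logic: `parse_groups` and `classify` are hypothetical helpers, path wildcards (`*`, `$`) are ignored, and treating `Disallow: /` combined with any Allow rule as Partial is an assumption.

```python
def parse_groups(robots_txt):
    """Return {lowercased user-agent token: [(is_allow, path), ...]}."""
    groups = {}
    current_agents = []
    reading_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if reading_rules:  # a rule line closed the previous group
                current_agents, reading_rules = [], False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            reading_rules = True
            for agent in current_agents:
                groups[agent].append((field == "allow", value))
    return groups


def classify(robots_txt, token):
    """Label a crawler token as Allowed, Partial or Blocked."""
    groups = parse_groups(robots_txt)
    # A group naming the token wins; otherwise fall back to the wildcard group.
    rules = groups.get(token.lower(), groups.get("*", []))
    disallowed = [path for is_allow, path in rules if not is_allow and path]
    allowed = [path for is_allow, path in rules if is_allow and path]
    if not disallowed:
        return "Allowed"
    if "/" in disallowed and not allowed:
        return "Blocked"
    return "Partial"


# Made-up robots.txt for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

for bot in ("GPTBot", "OAI-SearchBot", "PerplexityBot"):
    print(bot, classify(SAMPLE, bot))
```

In this sample PerplexityBot has no group of its own, so it inherits the wildcard rules and comes out as Partial.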

Common robots.txt mistakes — and how to fix them

Blocking AI search by accident.

A blanket Disallow that catches OAI-SearchBot or PerplexityBot quietly removes you from AI answers. Allow the search crawlers; block only training bots if you must.

Relying on User-agent: * for AI.

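The wildcard group is only a fallback. Under RFC 9309, a crawler that finds a group naming its own token follows that group alone and ignores the wildcard rules, and a wildcard Disallow also hits every search engine you did not name. Target each AI bot by its specific user-agent token.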

Confusing Google-Extended with Googlebot.

Block Googlebot when you meant Google-Extended and you stop Google Search from crawling your pages; do the reverse and AI training stays on. Use Google-Extended for AI, Googlebot for Search.

Trusting robots.txt as a firewall.

Robots.txt is advisory — it won't stop crawlers that choose to ignore it or scrape via third parties. Use server-side blocking for bots you must hard-stop.
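For the last point, server-side blocking can be as simple as refusing requests whose User-Agent header names a bot you want to stop. The sketch below is one possible approach, assuming a Python WSGI application: `block_ai_bots` and `BLOCKED_TOKENS` are hypothetical names, and the token list is only an example. Matching on the header stops bots that identify themselves honestly; a crawler that spoofs its User-Agent needs other measures.

```python
# Example tokens only; adjust to your own policy.
BLOCKED_TOKENS = ("GPTBot", "CCBot", "Bytespider")


def block_ai_bots(app, blocked_tokens=BLOCKED_TOKENS):
    """Wrap a WSGI app and answer 403 when the User-Agent names a blocked token."""
    lowered = tuple(token.lower() for token in blocked_tokens)

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in lowered):
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else reaches the wrapped application unchanged.
        return app(environ, start_response)

    return middleware
```

Wrap your existing application object once at startup, for example `application = block_ai_bots(application)`.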

Track your brand across AI answers

Allowing AI crawlers is step one. SEOcrawl's AI Tracker shows what happens next: it monitors how often ChatGPT, Claude, Gemini and Perplexity actually mention and cite your brand, which prompts trigger you, and how you stack up against competitors — all alongside your Google Search Console data in one place.

Try SEOcrawl free →

AI Tracker → What is llms.txt → See pricing →

FAQs

What is an AI bot checker?

An AI bot checker reads a site's robots.txt file and tells you which AI crawlers it currently allows or blocks. It checks the user-agent tokens of the major AI companies — OpenAI (GPTBot, OAI-SearchBot, ChatGPT-User), Anthropic (ClaudeBot, Claude-SearchBot), Google (Google-Extended), Perplexity (PerplexityBot), Common Crawl (CCBot) and others — against the Allow and Disallow rules in your robots.txt.

How do I block AI crawlers in robots.txt?

Add a group per crawler with a Disallow rule, e.g. "User-agent: GPTBot" followed by "Disallow: /". To block several, give each user-agent its own group, or stack several User-agent lines above one shared Disallow rule. Remember that robots.txt is advisory: well-behaved crawlers like GPTBot and ClaudeBot honour it, but it isn't an enforcement mechanism, so it won't stop bots that choose to ignore it.
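Before you publish a rule set like this, you can check how a standards-following parser reads it. The sketch below uses Python's built-in `urllib.robotparser`; the robots.txt text, the `example.com` URLs and the `can_fetch` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: one group per blocked crawler, plus a wildcard group
# for every crawler that is not named.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_fetch(robots_txt, token, url):
    """Return True if the parser lets `token` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


for token in ("GPTBot", "CCBot", "OAI-SearchBot"):
    root = can_fetch(ROBOTS_TXT, token, "https://example.com/")
    private = can_fetch(ROBOTS_TXT, token, "https://example.com/private/page")
    print(f"{token}: root={root} private={private}")
```

GPTBot and CCBot are refused everywhere, while OAI-SearchBot, which has no group of its own here, falls back to the wildcard group and is refused only under /private/.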

Should I block AI bots or allow them?

It depends on your goal. Blocking training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) opts your content out of model training. But blocking AI search crawlers (OAI-SearchBot, Claude-SearchBot, PerplexityBot) can keep your brand out of ChatGPT, Claude and Perplexity answers, costing you visibility and referral traffic. Many sites allow search crawlers while blocking training-only ones.

Does blocking Google-Extended hurt my Google rankings?

No. Google-Extended only controls whether your content is used to train and ground Gemini and Vertex AI. It is separate from Googlebot, so blocking Google-Extended has no effect on how you rank in Google Search. It's the clean way to opt out of AI training without touching organic search.

What's the difference between training, search and on-demand AI bots?

Training bots (GPTBot, CCBot, Google-Extended, Bytespider) scrape content to train models. Search bots (OAI-SearchBot, Claude-SearchBot, PerplexityBot) index your site so it can be cited in AI search answers. On-demand fetch bots (ChatGPT-User, Claude-User, Perplexity-User) retrieve a single page in real time when a user asks the assistant about it. Blocking each has very different consequences for AI visibility.
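If you want your robots.txt to follow these categories, one option is to generate the groups from a small table. A minimal sketch, assuming the token grouping listed in this answer: `AI_BOTS` and `build_robots_txt` are hypothetical names, and the `# category` comment lines are just a labelling choice.

```python
# Tokens grouped as in the answer above.
AI_BOTS = {
    "training": ["GPTBot", "CCBot", "Google-Extended", "Bytespider"],
    "search": ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"],
    "on_demand": ["ChatGPT-User", "Claude-User", "Perplexity-User"],
}


def build_robots_txt(blocked_categories):
    """Return robots.txt text with one group per token."""
    groups = []
    for category, tokens in AI_BOTS.items():
        # Blocked categories get a site-wide Disallow, the rest an explicit Allow.
        rule = "Disallow: /" if category in blocked_categories else "Allow: /"
        for token in tokens:
            groups.append(f"# {category}\nUser-agent: {token}\n{rule}")
    return "\n\n".join(groups) + "\n"


# Block training crawlers, keep search and on-demand crawlers allowed.
print(build_robots_txt({"training"}))
```

Keep in mind that a bot with its own group no longer follows your wildcard rules, so repeat any path restrictions you still want inside the named groups.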

More Free SEO Tools

SERP Simulator

Sitemap Finder & Checker

Schema Validator

llms.txt Generator

Title Tag Checker

Canonical Tag Checker