AI Crawlers: The Complete Guide

GPTBot, ClaudeBot, PerplexityBot and how to handle them

AI companies are crawling the web to train and power their models. Understanding these crawlers helps you control how your content is used.

Known AI Crawlers

Crawler          Company        Purpose
GPTBot           OpenAI         Model training
ChatGPT-User     OpenAI         ChatGPT browsing feature
ClaudeBot        Anthropic      Claude training
Anthropic-AI     Anthropic      Claude training
PerplexityBot    Perplexity     Search answers
Cohere-AI        Cohere         Model training
Google-Extended  Google         Gemini training
CCBot            Common Crawl   Open dataset (used by many)

Should You Allow or Block?

🤔 The Trade-off

Allow: Your content gets included in AI responses → more visibility

Block: Your content stays out of AI training and answers → less AI visibility

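Allow If You Want:

  1. More visibility in AI-powered search and AI-generated answers
  2. A higher chance of being recommended by AI assistants

Block If You Want:

  1. To keep your content out of AI training data
  2. To protect content that has privacy or licensing constraints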

robots.txt Configuration

Allow All AI Crawlers (Recommended for AEO)

User-agent: *
Allow: /

# AI Crawlers Welcome
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Anthropic-AI
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

Sitemap: https://yoursite.com/sitemap.xml

Block All AI Crawlers

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Anthropic-AI
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Cohere-AI
Disallow: /
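
If you maintain the block list in more than one place, it can be easier to generate these groups than to hand-edit them. The sketch below is one possible approach, not a required tool: build_robots_txt and AI_CRAWLERS are hypothetical names introduced for illustration, and the list simply mirrors the crawler table earlier in this guide.

```python
# Hypothetical helper: names and structure are illustrative assumptions.
AI_CRAWLERS = [
    "GPTBot",
    "ChatGPT-User",
    "ClaudeBot",
    "Anthropic-AI",
    "PerplexityBot",
    "Cohere-AI",
    "Google-Extended",
    "CCBot",
]


def build_robots_txt(blocked, sitemap_url=None):
    """Return robots.txt text with one 'Disallow: /' group per blocked agent."""
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in blocked]
    if sitemap_url:
        # The Sitemap line is independent of any user-agent group.
        blocks.append(f"Sitemap: {sitemap_url}")
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(AI_CRAWLERS, "https://yoursite.com/sitemap.xml"))
```

Passing a shorter list (for example, only the training crawlers) produces a selective policy from the same function.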

Selective Access

# Allow browsing, block training
User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Disallow: /

# Allow Perplexity (good for visibility)
User-agent: PerplexityBot
Allow: /

# Block others (add one group per crawler you want to exclude)
User-agent: ClaudeBot
Disallow: /
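
Before deploying a selective policy, it helps to confirm that each crawler is treated the way you intend. Here is a minimal sketch using Python's standard urllib.robotparser module, with the rules above pasted in as a string. check_access is a hypothetical helper name, and the standard-library parser only approximates how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The selective-access rules from this section, inlined for testing.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Disallow: /
"""


def check_access(robots_txt, agents, url="https://yoursite.com/"):
    """Map each user-agent name to True (may fetch url) or False (may not)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["ChatGPT-User", "GPTBot", "PerplexityBot", "ClaudeBot"]
    for agent, allowed in check_access(ROBOTS_TXT, agents).items():
        print(f"{agent}: {'allowed' in str(allowed) or ('allowed' if allowed else 'blocked')}")
```

With these rules, ChatGPT-User and PerplexityBot are reported as allowed, and GPTBot and ClaudeBot as blocked. Crawlers that are not listed have no matching group and no wildcard group, so they are allowed by default.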

Beyond robots.txt

robots.txt is advisory, so crawlers can ignore it. Stronger control has to be enforced on the server itself.
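
One common option is to reject requests at the application or web server based on the User-Agent header. The following is a minimal sketch written as WSGI middleware, assuming a Python web application. block_ai_crawlers, BLOCKED_AGENT_TOKENS, and demo_app are hypothetical names, and the token list is illustrative rather than complete. User-Agent headers can be spoofed, so this is still best-effort. Google-Extended is left out because it is a robots.txt control token and not a separate HTTP user agent.

```python
# Illustrative, lowercase substrings to look for in the User-Agent header.
BLOCKED_AGENT_TOKENS = ("gptbot", "claudebot", "perplexitybot", "ccbot")


def block_ai_crawlers(app, blocked_tokens=BLOCKED_AGENT_TOKENS):
    """Wrap a WSGI app so requests from listed crawlers get a 403 response."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in blocked_tokens):
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain; charset=utf-8")],
            )
            return [b"Automated AI crawling is not permitted.\n"]
        # Everything else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Stand-in for a real application (hypothetical)."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Hello, visitor.\n"]


# Usage: wrap the real WSGI application once at startup.
application = block_ai_crawlers(demo_app)
```

The same check can be expressed in a reverse proxy or CDN rule; the middleware form is used here only because it is easy to test in isolation.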

Our Recommendation

For most businesses, allowing AI crawlers is beneficial:

  1. More visibility in AI-powered search
  2. Higher chance of being recommended
  3. Inclusion in the data that AI models learn from
  4. Better GEO (generative engine optimization) and AEO (answer engine optimization) scores

Unless you have specific privacy or licensing concerns, welcome the bots.

