Robots.txt
Use this page to compare robots.txt evidence across captured sites.
Robots.txt is a crawler policy surface: it tells crawlers which paths are allowed for which user agents.
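To make the path and user-agent rules concrete, here is a minimal sketch that evaluates a robots.txt body with Python's standard `urllib.robotparser`. The sample body, the `ExampleBot` agent, the `example.com` URLs, and the `is_allowed` helper are all hypothetical and are not taken from any site listed on this page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def is_allowed(robots_body: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_body let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("OtherBot", "https://example.com/docs/"),
        ("OtherBot", "https://example.com/private/report"),
        ("ExampleBot", "https://example.com/docs/"),
    ]:
        print(agent, url, is_allowed(SAMPLE, agent, url))
```

Parsing a captured body offline like this keeps the comparison reproducible. Note that the standard-library parser matches rules by simple path prefix, so it may not reflect how every crawler interprets wildcard patterns.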
A matching status is a lead, not proof; content type, body shape, redirects, and truncation still matter.
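The checks named above can be sketched as a small function that lists the reasons a capture should not yet be trusted. This is an illustrative sketch, not the method used to build this page: the `Capture` record, its field names, and the directive list are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Capture:
    """Hypothetical record of one captured response; field names are assumptions."""
    status: int
    content_type: str
    body: str
    requested_url: str
    final_url: str
    truncated: bool = False


# Line prefixes that suggest the body really is a robots.txt file.
DIRECTIVES = ("user-agent:", "allow:", "disallow:", "sitemap:", "crawl-delay:")


def caveats(capture: Capture) -> list[str]:
    """Return reasons the capture should not yet count as a real robots.txt."""
    problems = []
    if capture.status != 200:
        problems.append(f"status {capture.status}")
    # Content type: robots.txt is normally served as plain text.
    if not capture.content_type.lower().startswith("text/plain"):
        problems.append(f"content type {capture.content_type!r}")
    # Body shape: at least one line should start with a known directive.
    lines = [line.strip().lower() for line in capture.body.splitlines()]
    if not any(line.startswith(DIRECTIVES) for line in lines):
        problems.append("no robots.txt directives in body")
    # Redirects: the response may have come from a different URL.
    if capture.final_url != capture.requested_url:
        problems.append(f"redirected to {capture.final_url}")
    # Truncation: rules may be missing from the end of the body.
    if capture.truncated:
        problems.append("body truncated")
    return problems
```

An empty list means none of these checks objected. A `200` that serves an HTML fallback page after a redirect would report three caveats: content type, body shape, and the redirect.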
| Site | Host | Matching evidence |
|---|---|---|
| Cloudflare Developers | developers.cloudflare.com | robots.txt 200 |
| Perplexity Docs | docs.perplexity.ai | robots.txt 200 |
| Model Context Protocol | modelcontextprotocol.io | robots.txt 200 |
| Claude Platform | platform.claude.com | robots.txt 200 |
| Vercel | vercel.com | docs robots 200 |
| OpenAI API Docs | developers.openai.com | docs robots.txt 404 |
| GitHub Docs | docs.github.com | robots.txt 200 |
| Stripe Docs | docs.stripe.com | robots.txt 200 |
| LangChain Docs | docs.langchain.com | robots.txt 200 |
| Cloudflare Root | www.cloudflare.com | robots.txt 200 |
| Google Developers | developers.google.com | robots.txt 200 |
| Google AI | ai.google.dev | robots.txt 200 |
| OpenAI Root | openai.com | robots.txt 200 |
| Anthropic Root | www.anthropic.com | robots.txt 200 |
| Perplexity Root | www.perplexity.ai | robots.txt 200 |
| Supabase Docs | supabase.com | docs robots.txt 200 |
| LlamaIndex Docs | docs.llamaindex.ai | robots.txt 200 |
| Cursor Docs | docs.cursor.com | robots.txt 200 |