BOT INFO

CabooBot identifies business facts for requested AI visibility scans.

CabooBot is the crawler used by Caboo when a business owner asks us to scan their website and generate a visibility report. It reads public pages to understand the business name, services, location, structured data, sitemap, robots policy, and crawler readiness.

User agent

CabooBot/0.1 (+https://getcaboo.com/bot)

What it fetches

For the free scan, the bot fetches the homepage, robots.txt, sitemap.xml, and favicon.ico. It also checks for llms.txt for completeness, but llms.txt is informational only and does not contribute to the Caboo Score. Each request uses a short timeout and a small byte limit, and the free scan does not crawl the rest of the site.
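
As a rough illustration of what a bounded fetch like this could look like, here is a minimal Python sketch. It is not CabooBot's actual code. The timeout and byte-limit values, and the names scan_targets and fetch_capped, are assumptions made up for this example; only the user agent string and the list of files come from this page.

```python
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# User agent string as published on this page.
USER_AGENT = "CabooBot/0.1 (+https://getcaboo.com/bot)"

# Hypothetical limits: the page only says "short" and "small".
TIMEOUT_SECONDS = 5
MAX_BYTES = 256 * 1024

# Paths named on this page for the free scan; "/" stands for the homepage.
FREE_SCAN_PATHS = ["/", "/robots.txt", "/sitemap.xml", "/favicon.ico", "/llms.txt"]


def scan_targets(site_url: str) -> list[str]:
    """Build the fixed list of URLs for one site. No links are followed."""
    return [urljoin(site_url, path) for path in FREE_SCAN_PATHS]


def fetch_capped(url: str, timeout: float = TIMEOUT_SECONDS,
                 max_bytes: int = MAX_BYTES) -> bytes:
    """Fetch one URL with a socket timeout, reading at most max_bytes."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=timeout) as response:
        # read(n) returns at most n bytes, so large pages are truncated.
        return response.read(max_bytes)


if __name__ == "__main__":
    for target in scan_targets("https://example.com"):
        print(target)
```

Because the URL list is fixed and every read is capped, the cost of a scan in this sketch stays roughly constant regardless of how large the site is.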

Provider crawlers Caboo monitors

Beyond fetching the site itself, Caboo evaluates whether the major AI providers' own crawlers can reach the site. Caboo does not impersonate these bots. Instead, it reads the site's robots policy and reports which of them are allowed.

OAI-SearchBot       OpenAI · ChatGPT Search index
ChatGPT-User        OpenAI · live user-triggered fetches
GPTBot              OpenAI · model training
Claude-SearchBot    Anthropic · Claude search index
Claude-User         Anthropic · live user-triggered fetches
ClaudeBot           Anthropic · model training
Google-Extended     Google · Gemini API grounding (separate from Googlebot)
PerplexityBot       Perplexity · search index

Blocking OAI-SearchBot, Claude-SearchBot, Google-Extended, or PerplexityBot typically removes a site from that provider's AI-search surface. Blocking the training-only bots (GPTBot, ClaudeBot) is a separate decision, since it concerns use of the site's content for model training rather than its appearance in AI search.
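
The kind of check described above, reading a robots policy and reporting which provider crawlers are allowed, can be sketched with Python's standard urllib.robotparser module. This is an illustrative sketch rather than Caboo's implementation: the function name provider_access and the sample robots.txt are assumptions, and Python's parser may resolve edge cases differently from each provider's own crawler.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens listed in the table above.
PROVIDER_BOTS = [
    "OAI-SearchBot",
    "ChatGPT-User",
    "GPTBot",
    "Claude-SearchBot",
    "Claude-User",
    "ClaudeBot",
    "Google-Extended",
    "PerplexityBot",
]


def provider_access(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Map each provider token to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in PROVIDER_BOTS}


if __name__ == "__main__":
    # Hypothetical policy: opt out of OpenAI training, allow everything else.
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    for bot, allowed in provider_access(sample).items():
        print(f"{bot:18} {'allowed' if allowed else 'blocked'}")
```

With the sample policy, only GPTBot is reported as blocked, which matches the case of opting out of training while staying reachable by the search crawlers.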

Robots policy

CabooBot respects robots.txt for unverified domains. A site owner who wants a deeper scan can either allow CabooBot explicitly, using the rule below, or verify domain ownership and then request an override.

User-agent: CabooBot
Allow: /

How to block it

User-agent: CabooBot
Disallow: /
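
Before deploying either rule, a site owner may want to confirm how it will be read. The Python sketch below tests a robots.txt body against CabooBot's published user agent string using the standard urllib.robotparser module. The helper name caboobot_allowed and the sample policies are assumptions for illustration, and the result reflects Python's parser, which is only an approximation of how CabooBot itself interprets the file.

```python
from urllib.robotparser import RobotFileParser

# User agent string as published on this page.
CABOOBOT_UA = "CabooBot/0.1 (+https://getcaboo.com/bot)"


def caboobot_allowed(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if robots_txt permits CabooBot to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # The parser matches on the product token before the "/" ("CabooBot").
    return parser.can_fetch(CABOOBOT_UA, url)


# The block rule from this page.
BLOCK_RULE = "User-agent: CabooBot\nDisallow: /\n"

# Hypothetical policy: every crawler is blocked except CabooBot.
# A group that names CabooBot is used in preference to the "*" group.
ALLOW_ONLY_CABOOBOT = "User-agent: *\nDisallow: /\n\nUser-agent: CabooBot\nAllow: /\n"

if __name__ == "__main__":
    print("block rule:", caboobot_allowed(BLOCK_RULE))
    print("allow-only rule:", caboobot_allowed(ALLOW_ONLY_CABOOBOT))
```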

Contact

Questions or crawl issues: [email protected].