CabooBot identifies business facts for requested AI visibility scans.
CabooBot is the crawler used by Caboo when a business owner asks us to scan their website and generate a visibility report. It reads public pages to understand the business name, services, location, structured data, sitemap, robots policy, and crawler readiness.
User agent
CabooBot/0.1 (+https://getcaboo.com/bot)
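To recognise CabooBot in server logs, the user-agent string above can be split into a version and an info URL. The following is a minimal sketch only: the function name `parse_caboobot_ua` and the regular expression are assumptions made for illustration, not part of any Caboo tooling, and the version number may change. A user-agent header can be spoofed, so a match is a hint rather than proof.

```python
import re
from typing import Optional

# Hypothetical pattern: "CabooBot/<version> (+<info URL>)".
_CABOOBOT_UA = re.compile(
    r"^CabooBot/(?P<version>[\d.]+)\s+\(\+(?P<url>https?://[^)\s]+)\)$"
)


def parse_caboobot_ua(user_agent: str) -> Optional[dict]:
    """Return the version and info URL if the header looks like CabooBot, else None."""
    match = _CABOOBOT_UA.match(user_agent.strip())
    if match is None:
        return None
    return {"version": match.group("version"), "url": match.group("url")}
```

Calling `parse_caboobot_ua("CabooBot/0.1 (+https://getcaboo.com/bot)")` returns the version `0.1` and the info URL, while an unrelated user agent returns `None`.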
What it fetches
For the free scan, the bot fetches the homepage, robots.txt, sitemap.xml, and favicon.ico. It also looks for llms.txt for completeness, but llms.txt is informational only — it does not contribute to the Caboo Score. It uses a short timeout and a small byte limit. It does not crawl the whole site in the free scan.
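The free-scan behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not CabooBot's actual implementation: the timeout and byte-limit values are placeholders (the real values are not given here), the file locations are assumed to be at the site root, and the names `free_scan_urls`, `read_capped`, and `fetch_capped` are hypothetical.

```python
import urllib.request
from typing import List
from urllib.parse import urljoin

USER_AGENT = "CabooBot/0.1 (+https://getcaboo.com/bot)"

# Placeholder values for illustration; the real limits are not documented here.
ASSUMED_TIMEOUT_SECONDS = 5
ASSUMED_MAX_BYTES = 256 * 1024

# Homepage plus the well-known files named above (root locations assumed).
FREE_SCAN_PATHS = ["/", "/robots.txt", "/sitemap.xml", "/favicon.ico", "/llms.txt"]


def free_scan_urls(site: str) -> List[str]:
    """Build the short, fixed list of URLs a free scan would request."""
    return [urljoin(site, path) for path in FREE_SCAN_PATHS]


def read_capped(stream, max_bytes: int = ASSUMED_MAX_BYTES) -> bytes:
    """Read at most max_bytes from a file-like object and ignore the rest."""
    return stream.read(max_bytes)


def fetch_capped(url: str) -> bytes:
    """Fetch one URL with a short timeout and a capped body size."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=ASSUMED_TIMEOUT_SECONDS) as response:
        return read_capped(response)
```

Because the URL list is fixed, the number of requests per free scan stays constant no matter how large the site is.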
Provider crawlers Caboo monitors
Beyond fetching the site itself, Caboo evaluates whether the major AI providers' own crawlers can reach the site. Caboo does not impersonate these bots — it reads the site's robots policy and reports which of them are allowed.
OAI-SearchBot: OpenAI · ChatGPT Search index
ChatGPT-User: OpenAI · live user-triggered fetches
GPTBot: OpenAI · model training
Claude-SearchBot: Anthropic · Claude search index
Claude-User: Anthropic · live user-triggered fetches
ClaudeBot: Anthropic · model training
Google-Extended: Google · Gemini API grounding (separate from Googlebot)
PerplexityBot: Perplexity · search index
Blocking OAI-SearchBot, Claude-SearchBot, Google-Extended, or PerplexityBot typically removes a site from that provider's AI-search surface. Blocking the training-only bots (GPTBot, ClaudeBot) is a different decision.
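A site owner can approximate this check locally with Python's standard-library `urllib.robotparser`. The sketch below is not Caboo's evaluation logic: the function name `crawler_access` and the sample robots.txt are assumptions for illustration, and the standard-library parser uses simple first-match rules, so results on complex policies may differ from how each provider interprets them.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens listed above.
PROVIDER_CRAWLERS = [
    "OAI-SearchBot", "ChatGPT-User", "GPTBot",
    "Claude-SearchBot", "Claude-User", "ClaudeBot",
    "Google-Extended", "PerplexityBot",
]


def crawler_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Map each provider crawler to True or False for the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in PROVIDER_CRAWLERS}


# Hypothetical policy: block the training-only bots, allow everything else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for bot, allowed in crawler_access(SAMPLE_ROBOTS_TXT).items():
        print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

With the sample policy, GPTBot and ClaudeBot are reported as blocked while the search and user-triggered crawlers remain allowed, which matches the distinction between training bots and AI-search bots drawn above.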
Robots policy
CabooBot respects robots.txt for unverified domains. A site owner who wants a deeper scan can either allow CabooBot explicitly in robots.txt, or verify domain ownership and then request an override.
User-agent: CabooBot
Allow: /
How to block it
User-agent: CabooBot
Disallow: /
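To confirm that a robots.txt rule has the intended effect on CabooBot before deploying it, the rule can be tested with Python's standard-library parser. This is a minimal sketch: the helper name `caboobot_allowed` is hypothetical, and the standard-library parser is only an approximation of how any given crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

CABOOBOT_UA = "CabooBot/0.1 (+https://getcaboo.com/bot)"


def caboobot_allowed(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text lets CabooBot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(CABOOBOT_UA, url)


# The two rules from this page, written one directive per line.
ALLOW_RULE = "User-agent: CabooBot\nAllow: /\n"
BLOCK_RULE = "User-agent: CabooBot\nDisallow: /\n"

if __name__ == "__main__":
    print("allow rule:", caboobot_allowed(ALLOW_RULE))
    print("block rule:", caboobot_allowed(BLOCK_RULE))
```

A robots.txt that names other crawlers but not CabooBot, and has no wildcard group, leaves CabooBot allowed by default under this parser.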
Contact
Questions or crawl issues: [email protected].