Technical · AI Crawler Infrastructure

Is your site readable by AI?

ChatGPT Search, Perplexity and Gemini crawl the web continuously. Many sites either block these crawlers or serve content they cannot parse. We configure robots.txt per crawler, deploy llms.txt, write citation-ready JSON-LD, and restructure content for RAG retrieval.

9 · AI crawler types
llms.txt · new robots.txt
48 h · technical scope
JSON-LD · schema.org
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Brand Name",
  "areaServed": "İstanbul",
  "hasOfferCatalog": {…}
}
llms.txt
> Service: Main service
> URL: /hizmet-sayfasi
robots.txt · llms.txt · schema
94/100
(01) System

Work is split into operational layers.

4 LAYERS · RANKMATE
01 · Layer

Per-Crawler Access Control

GPTBot, OAI-SearchBot, anthropic-ai, PerplexityBot, YouBot: each has a different purpose. GPTBot, for example, collects training data, while OAI-SearchBot indexes pages for live search. Managing a training scraper and a live search crawler under the same rule wastes crawl budget and can create legal exposure. We write a separate directive set for each.

robots.txt — AI CRAWLERS
User-agent: GPTBot
Allow: /blog/
Allow: /hizmetler/
Disallow: /admin/

User-agent: OAI-SearchBot
Allow: /
GPTBot · OAI-SearchBot · anthropic-ai · PerplexityBot
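One way to sanity-check per-crawler rules before deploying them is to parse the file locally. The sketch below runs Python's standard `urllib.robotparser` against the rules shown above; the sample paths (`/blog/post`, `/admin/login`) are made up for illustration. This parser applies the first matching rule in file order, which may differ from how an individual crawler resolves conflicts, so treat the result as a rough check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the sample above.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /blog/
Allow: /hizmetler/
Disallow: /admin/

User-agent: OAI-SearchBot
Allow: /
"""


def can_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise each rule.
    checks = [
        ("GPTBot", "/blog/post"),
        ("GPTBot", "/admin/login"),
        ("OAI-SearchBot", "/admin/login"),
    ]
    for agent, path in checks:
        verdict = "allowed" if can_fetch(ROBOTS_TXT, agent, path) else "blocked"
        print(f"{agent:15} {path:15} {verdict}")
```

Running the same checks against the staging and production files makes it easy to spot a rule that changed behaviour for one crawler but not another.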
02 · Layer

llms.txt — Your Site's AI Map

llms.txt gives AI agents a machine-readable map of your site: which pages exist, what they mean, and which sources can be cited. Think of it as robots.txt rebuilt for the LLM era, except that it describes content rather than controlling access.

llms.txt · site map for AI · semantic context
03 · Layer

Citation-Ready JSON-LD Schema

Many RAG pipelines run entity extraction. Without correct Organization, Service, FAQPage and Review markup, an AI system has little to anchor you to as a named entity, while a competitor with clean markup is easier to recognize. We write intent-matched schema for every page type.

schema.org · entity extraction · E-E-A-T
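As a rough illustration of how page-level schema can be emitted, the sketch below serializes a dictionary to JSON-LD and wraps it in the `<script type="application/ld+json">` tag that pages normally embed. The helper name `json_ld_script` is hypothetical, and the field values are the sample values used elsewhere on this page, not a complete schema.

```python
import json


def json_ld_script(data: dict) -> str:
    """Serialize `data` as JSON-LD wrapped in an HTML script tag."""
    body = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a string value cannot close the script tag early;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Sample values only; a real page would carry more properties.
SERVICE = {
    "@context": "https://schema.org",
    "@type": "ProfessionalService",
    "name": "RankMate",
    "knowsAbout": ["GEO", "AI SEO", "LLM Visibility"],
    "areaServed": "TR",
}

if __name__ == "__main__":
    print(json_ld_script(SERVICE))
```

Generating the markup from one data source per page type helps avoid the duplicate or conflicting blocks that hand-edited templates tend to accumulate.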
04 · Layer

RAG-Ready Content Architecture

Retrieval systems split pages into chunks before an LLM reads them. Long paragraphs get lost; short, citable, semantically dense blocks get retrieved. We restructure service descriptions, FAQs, and social proof into retrievable units.

chunking · answer blocks · retrieval
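To make the idea of chunking concrete, here is a minimal sketch of a paragraph-based chunker. It is one simple strategy among many, not the method any particular AI system uses: it assumes paragraphs are separated by blank lines and that a word budget (`max_words`, an arbitrary default) is a reasonable stand-in for a token limit.

```python
def chunk_paragraphs(text: str, max_words: int = 80) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at most
    `max_words` words. A single paragraph longer than the budget is
    kept whole rather than split mid-sentence."""
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        words = len(para.split())
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


if __name__ == "__main__":
    sample = "a b c\n\nd e\n\nf g h i"
    for i, chunk in enumerate(chunk_paragraphs(sample, max_words=5), start=1):
        print(f"--- chunk {i} ---\n{chunk}")
```

A paragraph that answers one question on its own survives this kind of splitting intact, which is the practical reason for writing short, self-contained answer blocks.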
(02) Technical files

Three files that open the door to AI systems.

robots.txt
# Live search: allow
User-agent: OAI-SearchBot
Allow: /

# Training data: restricted
User-agent: GPTBot
Allow: /blog/
Disallow: /

# Anthropic
User-agent: anthropic-ai
Allow: /
llms.txt
# RankMate
> GEO and AI visibility agency
> URL: https://rankmate.agency/

## Services
> GEO + SEO Growth package
> URL: /hizmetler/geo-seo/

> WhatsApp AI Sales Assistant
> URL: /hizmetler/whatsapp-ai/
JSON-LD
{
  "@context": "https://schema.org",
  "@type": "ProfessionalService",
  "name": "RankMate",
  "knowsAbout": [
    "GEO", "AI SEO",
    "LLM Visibility"
  ],
  "areaServed": "TR"
}
94 / 100 · Trust score
(03) Technical score breakdown

Crawlable, readable, citable.

Technical score is a weighted average of bot access rules, llms.txt quality, schema coverage, page speed, and RAG content density.

Bot access rules · 98
llms.txt quality · 94
Schema coverage · 91
RAG content density · 89
Page speed · 93
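The weighted average can be sketched as follows. The five component scores are the ones listed above; the weights are purely hypothetical, since the page does not publish them, and were picked only to show the arithmetic. The dictionary keys are likewise illustrative names.

```python
# Component scores as listed above.
SCORES = {
    "bot_access": 98,
    "llms_txt": 94,
    "schema": 91,
    "rag_density": 89,
    "page_speed": 93,
}

# Hypothetical weights (assumption): the real weighting is not published.
WEIGHTS = {
    "bot_access": 0.30,
    "llms_txt": 0.25,
    "schema": 0.20,
    "rag_density": 0.15,
    "page_speed": 0.10,
}


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of `scores`, normalized by the sum of the weights."""
    total_weight = sum(weights.values())
    return sum(scores[key] * weights[key] for key in weights) / total_weight


if __name__ == "__main__":
    print(f"weighted score: {weighted_score(SCORES, WEIGHTS):.2f}")
```

With equal weights the same five scores average 93, so a headline figure of 94 implies that the stronger components carry more weight than the weaker ones.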
(04) Common questions

Questions before you start.

How do you know a site is blocked from AI?

Check your server logs for GPTBot and OAI-SearchBot. If their requests return 403, the server or firewall is rejecting them; if they fetch robots.txt and nothing else, your rules are blocking them. We detect this in 5 minutes with a free audit.
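A quick version of this check can be scripted. The sketch below assumes the access log uses the common "combined" format (request, status, size, referrer, user agent) and counts status codes per AI crawler. The sample log lines are invented for illustration, and since user-agent strings can be spoofed, the counts are indicative only.

```python
import re
from collections import Counter

# Crawler tokens to look for in the user-agent field.
AI_AGENTS = ("GPTBot", "OAI-SearchBot", "PerplexityBot")

# Matches: "METHOD path protocol" status size "referrer" "user agent"
LINE_RE = re.compile(
    r'"[A-Z]+ \S+ [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Invented sample lines (assumption), using documentation IP ranges.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/May/2025:10:00:00 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '203.0.113.6 - - [10/May/2025:10:00:05 +0000] "GET / HTTP/1.1" 403 162 "-" '
    '"Mozilla/5.0 (compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot)"',
    '198.51.100.7 - - [10/May/2025:10:00:09 +0000] "GET / HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (Windows NT 10.0)"',
]


def crawler_status_counts(lines) -> Counter:
    """Count (crawler, HTTP status) pairs found in access-log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        user_agent = match.group("ua").lower()
        for agent in AI_AGENTS:
            if agent.lower() in user_agent:
                counts[(agent, int(match.group("status")))] += 1
    return counts


if __name__ == "__main__":
    for (agent, status), n in sorted(crawler_status_counts(SAMPLE_LOG).items()):
        print(f"{agent:15} {status} x{n}")
```

A crawler that appears only with 403 responses points to a server or firewall rule; a crawler that never appears beyond robots.txt points to the robots.txt rules themselves.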

Is llms.txt a standard?

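Not yet. It is a community proposal rather than a W3C or IETF standard, and how consistently each AI provider reads it is not publicly documented. It costs little to publish, though, and a site without one is like a city without a map: AI agents have less guidance on where to look.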

Can you add this without breaking existing robots.txt?

Yes. We append only AI crawler directives and deploy llms.txt without touching your existing SEO rules. We test on staging first.

Is schema enough on its own?

No. Schema is necessary but not sufficient. Incorrect or duplicate markup actively hurts you. Without llms.txt and RAG-ready content on top, AI still won't recognize you as an entity.

(06) Start · 5 minutes · free · RM/ANALYSE-26

Make your brand visible inside AI answers.

~5 min · analysis time
48 h · scope delivery
$0 · upfront cost
Message us on WhatsApp · Fast reply