# The Problem: 66/100 on Our Own Scanner
When we ran our own AI visibility scanner against citability.ai, the result was awkward: 66/100. We were preaching the Citation Economy gospel while our own site only had 6 of the 20 ADP endpoints in place. So we sat down and shipped the missing 14 in an afternoon. This post walks through what we built, why each endpoint matters, and the open-source tooling we extracted from the work.
## What is the AI Discovery Protocol?
The AI Discovery Protocol (ADP) is a collection of well-known endpoints that make a website parseable by AI agents. Think of it as the AI-era equivalent of robots.txt + sitemap.xml + RSS — but for ChatGPT, Claude, Perplexity, and Gemini instead of Googlebot.
ADP groups 20 endpoints into four categories:
| Category | Endpoints | Purpose |
|---|---|---|
| Core (8) | llms.txt, llms-full.txt, llms-lite.txt, robots.txt, .well-known/ai.json, ai-discovery.json, ai-discovery.md, knowledge-graph.json | Tell AI what your site is and how to navigate it |
| Feeds (4) | feed.json, rss.xml, updates.json, ai-sitemap.xml | Surface fresh content for retrieval |
| News (4) | news/llms.txt, news/speakable.json, news/changelog.json, news/archive.jsonl | Time-aware updates AI can cite |
| API (4) | openapi.json, api/v1/adp/stats, api/v1/webhooks/discovery, opensearch.xml | Machine-callable interfaces |
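As a back-of-the-envelope check, the endpoints in the table above can be probed with nothing but the Python standard library. This is a sketch, not the scanner itself: the paths come straight from the table, the User-Agent string is our own choice, and a real checker would also follow redirects and validate content types.

```python
import urllib.request

# The 20 ADP endpoint paths from the table above, grouped by category.
ADP_ENDPOINTS = {
    "core": ["llms.txt", "llms-full.txt", "llms-lite.txt", "robots.txt",
             ".well-known/ai.json", "ai-discovery.json", "ai-discovery.md",
             "knowledge-graph.json"],
    "feeds": ["feed.json", "rss.xml", "updates.json", "ai-sitemap.xml"],
    "news": ["news/llms.txt", "news/speakable.json",
             "news/changelog.json", "news/archive.jsonl"],
    "api": ["openapi.json", "api/v1/adp/stats",
            "api/v1/webhooks/discovery", "opensearch.xml"],
}

def probe(domain: str, path: str, timeout: float = 5.0) -> bool:
    """Return True if https://{domain}/{path} answers with HTTP 200."""
    try:
        req = urllib.request.Request(
            f"https://{domain}/{path}",
            headers={"User-Agent": "adp-probe/0.1"},  # illustrative UA
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```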
## What We Already Had (66/100)
Our starting state:
- /llms.txt — basic version, 11 sections
- /robots.txt — AI crawlers allowed
- /.well-known/ai.json
- /ai-discovery.json
- /knowledge-graph.json
- /ai-sitemap.xml
## What We Added (the missing 14)
### Core endpoints (3 new)
llms-full.txt — A comprehensive 33-line context document covering features, technology stack, API surface, supported standards, and contact info. This is the version we want AI agents to read when they have token budget. We include the full architecture, the 5-pillar Citation Readiness V2 scoring, and the 51 protocol endpoints we scan.
llms-lite.txt — A single paragraph, two sentences: the version we want AI to read when token budget is tight. This file is the elevator pitch; it should answer "what is this site?" in under 50 words.
ai-discovery.md — The same content as ai-discovery.json but in markdown form. Some agents prefer markdown over JSON for context injection because it preserves prose structure.
### Feed endpoints (3 new)
feed.json — JSON Feed 1.1 format. Empty items array initially, but the structure is ready to populate when blog posts publish. This is what content aggregators consume.
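Generating that skeleton takes a few lines. In this sketch the title and URLs are our own values; per the JSON Feed 1.1 spec, `version` and `title` are required and an empty `items` array is valid.

```python
import json

def build_feed(items: list[dict]) -> str:
    """Serialize a minimal JSON Feed 1.1 document (empty items is valid)."""
    feed = {
        "version": "https://jsonfeed.org/version/1.1",
        "title": "Citability Blog",                      # assumed title
        "home_page_url": "https://citability.ai",
        "feed_url": "https://citability.ai/feed.json",
        "items": items,  # each item needs at least "id" plus some content field
    }
    return json.dumps(feed, indent=2)
```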
rss.xml — RSS 2.0 with the same structure. Belt and braces — many AI crawlers still prefer RSS over JSON Feed.
updates.json — A custom format we use for platform changelog. Not yet a standard, but we expect convergence with the changelog.json spec by end of 2026.
### News endpoints (4 new)
news/llms.txt — News-specific LLM context. Differs from the root llms.txt by focusing on time-sensitive content. AI agents that ask "what's new at X?" can retrieve this directly.
news/speakable.json — Schema.org SpeakableSpecification JSON. Tells voice assistants which CSS selectors contain content suitable for text-to-speech. We mark our hero descriptions and executive summaries as speakable.
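The document itself is small. A sketch of the shape, where the CSS selector names are illustrative stand-ins for our hero-description and executive-summary classes:

```python
import json

# Schema.org SpeakableSpecification: points voice assistants at the CSS
# selectors whose text is suitable for text-to-speech. Selector names here
# are hypothetical examples, not our actual class names.
speakable = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "speakable": {
        "@type": "SpeakableSpecification",
        "cssSelector": [".hero-description", ".executive-summary"],
    },
}

print(json.dumps(speakable, indent=2))
```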
news/changelog.json — Structured version history. We populate this from our git tags now.
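One way to populate it from tags, sketched with `subprocess` over `git tag`. The output schema (`{"versions": [{"version", "date"}]}`) is our own guess at a reasonable shape, not a published changelog.json spec.

```python
import json
import subprocess

def parse_tag_lines(raw: str) -> list[dict]:
    """Turn tab-separated "tag<TAB>date" lines into changelog entries."""
    entries = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        tag, _, date = line.partition("\t")
        entries.append({"version": tag, "date": date})
    return entries

def changelog_json() -> str:
    """Build news/changelog.json content from git tags, newest first."""
    raw = subprocess.run(
        ["git", "tag", "--sort=-creatordate",
         "--format=%(refname:short)\t%(creatordate:short)"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.dumps({"versions": parse_tag_lines(raw)}, indent=2)
```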
news/archive.jsonl — JSONL format historical archive. Agents can stream this as a context source.
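The point of JSONL is exactly this streaming property: one record per line, so a consumer never has to hold the whole archive in memory. A minimal reader sketch:

```python
import json
from typing import Iterator

def stream_archive(path: str) -> Iterator[dict]:
    """Yield one record per line from a JSONL archive, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)
```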
### API endpoints (4 new)
openapi.json — Our FastAPI backend auto-generates this. The catch was that our own config disabled the schema and docs outside of debug mode, so we had to flip three lines:
```python
from fastapi import FastAPI

app = FastAPI(
    docs_url="/docs",             # was: "/docs" if DEBUG else None
    redoc_url="/redoc",           # was: "/redoc" if DEBUG else None
    openapi_url="/openapi.json",  # was: conditional on DEBUG
    contact={"name": "Citability", "email": "[email protected]"},
    license_info={"name": "Proprietary", "url": "https://citability.ai/terms"},
    servers=[{"url": "https://citability.ai"}],
)
```
Then a Next.js proxy route exposes it through the public ingress (our backend is internal-only on DigitalOcean App Platform). Now https://citability.ai/api/openapi.json returns 155 documented endpoints.
api/v1/adp/stats — A live endpoint that returns ADP version, supported standards, and our own ADP compliance level. Self-referential, but valuable for monitoring tools.
api/v1/webhooks/discovery — Returns a catalogue of webhook events Citability emits. Lets AI agents discover async event streams without reading docs.
opensearch.xml — OpenSearch description document. Lets browsers and AI tools register Citability as a search provider that searches /scan/{searchTerms}.
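For reference, a minimal OpenSearch 1.1 description document can be produced with the standard library. The ShortName and Description strings below are assumptions; the template URL is the `/scan/{searchTerms}` route mentioned above.

```python
from xml.etree.ElementTree import Element, SubElement, tostring

def opensearch_xml() -> str:
    """Emit a minimal OpenSearch 1.1 description document."""
    root = Element("OpenSearchDescription",
                   xmlns="http://a9.com/-/spec/opensearch/1.1/")
    SubElement(root, "ShortName").text = "Citability"          # assumed
    SubElement(root, "Description").text = "Scan a domain's AI visibility"
    SubElement(root, "Url", type="text/html",
               template="https://citability.ai/scan/{searchTerms}")
    return tostring(root, encoding="unicode")
```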
## The Result
After 14 file additions and one config change: 100/100 ADP. All 20 endpoints found, all four categories at 100%. The work shipped in a single commit, took an afternoon to implement, and has compounding benefits as more AI agents start consuming these endpoints.
## What We Open-Sourced Along the Way
While building this, we extracted two of our internal tools into standalone open-source projects:
### llms-txt-validator
A Python package that validates llms.txt files against the llmstxt.org spec. It scores files 0–100, flags missing sections, validates URLs, and ships as a CLI:
```shell
pip install llms-txt-validator
llms-txt-validate citability.ai
```
### adp-audit-cli
A CLI that checks all 20 ADP endpoints on any domain in parallel and gives a category breakdown:
```shell
pip install adp-audit-cli
adp-audit example.com
```
Both are MIT licensed. Use them, fork them, contribute back.
## The Bigger Picture
ADP is still a young standard. Most of the endpoints we implemented don't have industry-wide convergence yet. But the cost of implementation is low (14 static files plus one config flag), the downside risk is zero, and the upside is being parseable by AI agents that haven't shipped yet.
If you're building for the AI-first web, treat ADP like you treated mobile-first design in 2012: get it in early, iterate as the standards mature, and you'll have a structural advantage when the rest of the web catches up.
Want to know your ADP score? Run a free scan at citability.ai/scan — we check all 20 endpoints, score the content quality, and tell you exactly which files to add. It takes about 15 seconds.