July 2, 2026 · 6 min read

The agent-ready docs checklist: 7 things to do before your next deploy

Developers are increasingly asking AI assistants about your product before they ask Google or your support team. If your docs aren't structured for agent consumption, you're invisible to that channel. Here's the checklist — everything you should have in place to make your documentation work for AI agents as well as humans.

01 Have a sitemap

A well-structured sitemap.xml at your root tells crawlers (and AI indexers) which pages exist and which are important. Most doc platforms generate this automatically — Docusaurus, Mintlify, ReadMe all do. Check that yours is accessible at /sitemap.xml. If you're on a custom setup, generate it.

Why: Without a sitemap, indexers can only find pages they reach by following links, so your most important content may be orphaned and never indexed.
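
If you're on a custom setup, generating the file can be as small as the sketch below. It assumes you already have a list of page URLs (the two shown are placeholders) and writes only the required `<loc>` element for each entry; `build_sitemap` is an illustrative name, not part of any doc platform.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs: replace with your real doc pages.
pages = [
    "https://docs.yoursite.com/quickstart",
    "https://docs.yoursite.com/api",
]
sitemap_xml = build_sitemap(pages)
```

Write `sitemap_xml` to `sitemap.xml` at your site root as part of the build, so the file never drifts from the pages you actually ship.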

02 Publish an llms.txt

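Add a plain-text file at /llms.txt that describes your product and links to key pages. The format is a proposed convention rather than a formal standard: it gives LLM crawlers and agents a curated map of your docs that they may pick up during training or when building context. It's a 20-minute job, and it can improve how AI models understand your product in the next training cycle.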

# YourProduct
> One-sentence description.

## Documentation
- [Getting Started](https://docs.yoursite.com/quickstart)
- [API Reference](https://docs.yoursite.com/api)
- [Authentication](https://docs.yoursite.com/auth)
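
If you'd rather generate that file at build time than maintain it by hand, a minimal sketch could look like this. It assumes your page titles and URLs are available as a list of pairs; `build_llms_txt` is a hypothetical helper, and the output mirrors the example above.

```python
def build_llms_txt(name, summary, pages):
    """Render an llms.txt body: title, one-line summary, then a link list."""
    lines = [f"# {name}", f"> {summary}", "", "## Documentation"]
    for title, url in pages:
        lines.append(f"- [{title}]({url})")
    return "\n".join(lines) + "\n"


llms_txt = build_llms_txt(
    "YourProduct",
    "One-sentence description.",
    [
        ("Getting Started", "https://docs.yoursite.com/quickstart"),
        ("API Reference", "https://docs.yoursite.com/api"),
        ("Authentication", "https://docs.yoursite.com/auth"),
    ],
)
```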

03 Index your docs via MCP

Submit your docs URL to AgentReady. This indexes your content for real-time querying by AI agents via MCP — the protocol used by Claude Desktop, Cursor, and other AI dev tools. Indexed content is queryable within 60 seconds.

Why: llms.txt helps at training time. MCP indexing helps at runtime — when a user is actively coding and asks Claude about your API.

04 Serve real HTML, not just JavaScript

AI crawlers generally can't execute JavaScript. If your docs are a single-page app that renders entirely in the browser, crawlers see an empty shell. Use server-side rendering (SSR) or static generation. Docusaurus and most other doc platforms do this by default.

To test: disable JavaScript in your browser and load a docs page. If it's blank or shows a spinner forever, you have a problem.
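
The same check can be scripted. The sketch below counts the words a client would see without running JavaScript, by ignoring everything inside `<script>` and `<style>`. The two HTML strings are made-up samples; in practice you would pass in the raw response body of a docs page. What counts as "too few words" is your call.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_word_count(html):
    """Return the number of words visible without executing JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.parts).split())


# Made-up samples: a client-rendered shell and a server-rendered page.
spa_shell = '<html><body><div id="root"></div><script>render()</script></body></html>'
ssr_page = "<html><body><h1>Authentication</h1><p>Send your API key in a header.</p></body></html>"

shell_words = visible_word_count(spa_shell)
page_words = visible_word_count(ssr_page)
```

A page that comes back with a word count near zero is the empty shell described above.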

05 Use clean headings and structure

Content chunked for AI retrieval is split by token count, but headings provide natural break points that preserve semantic context. A page with clear H2/H3 structure will chunk better and retrieve more accurately than a wall of text.

Concretely: don't write "the following configuration options are..." — write a heading for each option. Don't bury key terms in prose — put them in headings or code blocks where they stand out to both humans and embedding models.
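
To see why headings matter, here is a rough sketch of heading-aware chunking, assuming Markdown source. It is not how any particular indexer works; real pipelines also cap chunks by token count and skip `##` lines inside code blocks. The option names `timeout` and `retries` are invented for the example.

```python
import re

# Matches H2 and H3 lines such as "## timeout" or "### retries".
HEADING = re.compile(r"^(#{2,3})\s+(.*)$")


def chunk_by_heading(markdown):
    """Split Markdown into (heading, body) chunks at each H2/H3."""
    sections = [["(intro)", []]]
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append([match.group(2).strip(), []])
        else:
            sections[-1][1].append(line)
    chunks = [(title, "\n".join(lines).strip()) for title, lines in sections]
    # Drop the intro chunk when nothing precedes the first heading.
    return [chunk for chunk in chunks if chunk != ("(intro)", "")]


doc = """## timeout
Seconds to wait before giving up.

## retries
How many times to retry a failed call.
"""
chunks = chunk_by_heading(doc)
```

Each option ends up in its own chunk, labeled with the term a developer is likely to search for. Written as one paragraph, the same content would land in a single chunk with no label at all.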

06 Set up robots.txt to allow crawling

Check that /robots.txt doesn't accidentally block crawlers from your docs. Some setups block all bots or block specific user-agents. If you've ever switched on a "block AI crawlers" setting, check what it wrote to robots.txt: it may also be blocking the indexers you want.

You can selectively block specific bots while allowing indexers you want. The default is usually fine — just make sure you haven't set Disallow: / globally.
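
You can check a robots.txt file with Python's standard library before you deploy it. In this sketch the rules are a made-up example that blocks one bot and allows everyone else. `GPTBot` is a real crawler user-agent used here only as an example, and `ExampleIndexer` is a placeholder for whichever indexer you care about.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: block one specific bot, allow all others.
selective = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# The mistake to avoid: a global Disallow.
global_block = """\
User-agent: *
Disallow: /
"""

page = "https://docs.yoursite.com/api"
blocked_bot = is_allowed(selective, "GPTBot", page)
wanted_indexer = is_allowed(selective, "ExampleIndexer", page)
anyone_when_global = is_allowed(global_block, "ExampleIndexer", page)
```

Here `blocked_bot` and `anyone_when_global` come back `False`, while `wanted_indexer` is `True`. Running a check like this in CI against your real robots.txt catches an accidental global block before it ships.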

07 Keep the index fresh

Indexed content goes stale when your docs change. Set up automatic re-indexing on publish. If you're on WordPress, the AgentReady WordPress plugin re-indexes on post publish automatically. For other setups, call the API on deploy: POST /api/crawl with your URL.
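
As a rough sketch of that deploy hook: only the `POST /api/crawl` path comes from the step above. The host (`agentready.example`), the JSON body, and the `url` field name are assumptions made for illustration, so check the actual API reference for the real host, payload shape, and authentication before wiring this in.

```python
import json
import urllib.request


def build_crawl_request(api_base, docs_url):
    """Build (but do not send) a re-index request for `docs_url`.

    Assumption: the endpoint accepts a JSON body with a "url" field.
    """
    payload = json.dumps({"url": docs_url}).encode("utf-8")
    return urllib.request.Request(
        f"{api_base}/api/crawl",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and placeholder docs URL.
request = build_crawl_request("https://agentready.example", "https://docs.yoursite.com")

# In a deploy script you would then send it, for example:
# urllib.request.urlopen(request, timeout=30)
```

Run it as the last step of your deploy pipeline, after the new docs are live, so the crawler sees the updated pages.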

None of these are big projects. The whole checklist is about a day's work, and most items take around 20 minutes each. The payoff is fewer wrong answers from AI agents about your product, a problem that exists today whether or not you've done anything about it.