Complementary Standards

ARW + llms.txt

Agent-Ready Web doesn't replace existing web standards—it extends and enhances them. ARW builds on top of robots.txt, sitemap.xml, llms.txt, and RFC 8615 to create a complete infrastructure layer for AI agents.

How ARW Fits Into the Web Stack

ARW sits at the top of the web infrastructure stack, building on established standards.

Human Web

What humans see and interact with

HTML · OpenGraph · Schema.org

Crawler Web

How search engines discover content

robots.txt · sitemap.xml

Metadata Web

Where site-wide metadata lives (RFC 8615)

.well-known/* · llms.txt

Agentic Web (ARW)

How agents understand, attribute, and act

.llm.md views · AI-* headers · OAuth actions · Policies

Key Insight: ARW is the top layer that defines what's inside and how agents act. The layers below define where content lives and who can access it.

What Each Standard Provides

ARW complements existing standards by adding semantic understanding and operational capabilities

| Feature | robots.txt | sitemap.xml | llms.txt | ARW |
| --- | --- | --- | --- | --- |
| Primary Purpose | Access control for crawlers | URL discovery & indexing | AI-readable content manifest | End-to-end agent interoperability |
| Scope | Crawler permissions | Site structure | Basic content discovery | Discovery → Semantics → Actions → Protocols |
| Format | Plain text | XML | Markdown or YAML | YAML + JSON schemas |
| Content Awareness | None | URLs only | Human-readable summaries | Rich content graph with chunks |
| Actionability | None | None | None | ✅ OAuth actions (add_to_cart, etc.) |
| Policy Control | Disallow/Allow crawlers | None | Implicit | ✅ Explicit training/inference policies |
| Observability | None | None | None | ✅ AI-* header namespace |
| Interoperability | None | None | None | ✅ Bridges MCP / ACP / A2A |
| Conformance Levels | None | None | None | ✅ ARW-1 → ARW-4 progressive |

Progressive Enhancement Path

Start with llms.txt and progressively enhance to full ARW conformance. No need to rewrite everything.

Migration Path

0. Start with llms.txt: a basic content manifest
1. ARW-1 (Discovery): add YAML structure + .well-known files (sketched below)
2. ARW-2 (Semantic): add .llm.md views, chunks, and AI-* headers
3. ARW-3 (Actions): add OAuth-enforced actions
4. ARW-4 (Protocols): add MCP / ACP / A2A support
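
As a concrete sketch of step 1, here is how an agent might consume the added YAML structure. The manifest filename (.well-known/arw.yaml) and field names are illustrative assumptions, not spec citations; only the policy keys mirror the examples later on this page.

    # Illustrative sketch: the manifest filename and field names are
    # assumptions for demonstration, not taken verbatim from the ARW spec.
    import yaml  # pip install pyyaml

    ARW_MANIFEST = """
    version: 1
    content:
      - url: /docs/quickstart
        view: /docs/quickstart.llm.md
        summary: Getting started guide
    policies:
      training: disallowed
      inference: allowed
    """

    manifest = yaml.safe_load(ARW_MANIFEST)
    for entry in manifest["content"]:
        print(entry["url"], "->", entry["view"])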

What You Keep

All existing HTML pages

ARW adds .llm.md views alongside HTML rather than replacing it (see the sketch after this list)

SEO & social previews

OpenGraph, Schema.org, meta tags all remain

robots.txt & sitemap.xml

ARW reads these for freshness data

Existing authentication

OAuth actions use your current auth system

Your API endpoints

ARW adds an OAuth wrapper and keeps your existing logic
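
For the first point, a minimal server-side sketch (assuming a framework like Flask; the route, file paths, and negotiation rule are illustrative, not mandated by ARW) shows how a .llm.md view can sit alongside the existing HTML page:

    # Minimal sketch, assuming Flask; the paths and the Accept-header
    # negotiation rule here are illustrative assumptions.
    from flask import Flask, Response, request

    app = Flask(__name__)

    @app.route("/docs/quickstart")
    def quickstart():
        # Agents that ask for the machine view get compact markdown;
        # everyone else keeps receiving the unchanged HTML page.
        if "text/x-llm+markdown" in request.headers.get("Accept", ""):
            with open("views/quickstart.llm.md") as f:
                return Response(f.read(), mimetype="text/x-llm+markdown")
        with open("templates/quickstart.html") as f:
            return f.read()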

Key Principle: Both/And, Not Either/Or

ARW is designed for progressive enhancement. You don't replace llms.txt—you enhance it with YAML structure. You don't replace HTML—you add .llm.md views. You don't replace your API—you add OAuth enforcement. Everything is additive.

What ARW Adds Beyond llms.txt

Six novel capabilities that transform basic discovery into full agent interoperability

Machine Views (.llm.md)

First-class content type (text/x-llm+markdown) with chunk addressability and roughly 85% fewer tokens than the equivalent HTML

Content-Type: text/x-llm+markdown
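
On the agent side, fetching a machine view is one request with the right Accept header; the URL below is a placeholder, and the .llm.md naming follows the convention shown above:

    # Agent-side sketch; example.com and the exact URL are placeholders.
    import requests

    resp = requests.get(
        "https://example.com/docs/quickstart.llm.md",
        headers={"Accept": "text/x-llm+markdown"},
    )
    print(resp.headers.get("Content-Type"))  # expect text/x-llm+markdown
    print(resp.text[:200])                   # compact markdown, not full HTML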

AI-* Headers

Track which agents access what content and how they use it

AI-Agent: ChatGPT/1.0
AI-Inference: allowed
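
A sketch of the exchange, assuming the agent identifies itself on the request and the server echoes policy back on the response (only AI-Agent and AI-Inference appear in the examples above; any other header should be treated as hypothetical):

    # Sketch: AI-Agent on the request, AI-Inference on the response,
    # mirroring the header examples above; exact semantics are the spec's.
    import requests

    resp = requests.get(
        "https://example.com/docs/quickstart.llm.md",
        headers={"AI-Agent": "ExampleAgent/0.1"},  # who is calling
    )
    print(resp.headers.get("AI-Inference", "unspecified"))  # what is permitted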

OAuth-Enforced Actions

Let agents complete transactions with user consent

actions.add_to_cart
actions.create_order
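
A hypothetical invocation of add_to_cart (only the action name comes from this page; the endpoint path, payload shape, and token plumbing are assumptions):

    # Hypothetical endpoint and payload; real paths would come from the
    # site's ARW manifest, and the token from a user-consented OAuth flow.
    import requests

    resp = requests.post(
        "https://shop.example.com/arw/actions/add_to_cart",
        headers={"Authorization": "Bearer USER_CONSENTED_TOKEN"},
        json={"sku": "ABC-123", "quantity": 1},
    )
    resp.raise_for_status()  # a 401/403 here means no valid user consent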

Chunk Addressability

Reference specific sections of content with stable IDs

<!-- chunk:shipping-policy -->
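
A minimal extraction sketch, assuming chunks are delimited by the comment markers shown above and that each chunk runs until the next marker (that closing convention is an assumption, not a spec citation):

    # Sketch: pull one chunk out of a .llm.md document by its stable ID.
    # Assumes a chunk runs from its marker to the next marker or EOF.
    import re

    def extract_chunk(llm_md: str, chunk_id: str) -> str:
        pattern = rf"<!-- chunk:{re.escape(chunk_id)} -->(.*?)(?=<!-- chunk:|\Z)"
        match = re.search(pattern, llm_md, re.DOTALL)
        return match.group(1).strip() if match else ""

    doc = "<!-- chunk:shipping-policy -->\nShips in 2 days.\n<!-- chunk:returns -->\n30-day returns."
    print(extract_chunk(doc, "shipping-policy"))  # -> Ships in 2 days.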

Machine-Readable Policies

Explicit training/inference rules with attribution requirements

training: disallowed
inference: allowed
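
An agent honoring these keys might check the policy before deciding what it may do with the content (the policy file location used here is an assumption; the keys mirror the example above):

    # Sketch: the .well-known/arw-policy.yaml location is an assumption.
    import requests
    import yaml

    raw = requests.get("https://example.com/.well-known/arw-policy.yaml").text
    policy = yaml.safe_load(raw)

    may_train = policy.get("training") == "allowed"   # False in the example
    may_infer = policy.get("inference") == "allowed"  # True in the example
    print(f"train: {may_train}, inference: {may_infer}")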

Protocol Interoperability

Bridge to MCP, ACP, and A2A for agent-to-agent communication

protocols.mcp
protocols.acp
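
One way an agent might use these keys, assuming the manifest lists per-protocol endpoints (the arw.yaml filename, the nesting, and the endpoint field are all illustrative):

    # Sketch: assumes protocols.mcp etc. resolve to endpoint URLs in the
    # manifest; none of these field names are confirmed by the spec text.
    import requests
    import yaml

    manifest = yaml.safe_load(
        requests.get("https://example.com/.well-known/arw.yaml").text
    )
    protocols = manifest.get("protocols", {})

    if "mcp" in protocols:
        print("speak MCP at:", protocols["mcp"]["endpoint"])
    else:
        print("fall back to plain HTTP + .llm.md views")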

The Yellow Pages vs. API Contract

robots.txt and sitemap.xml are like the yellow pages of the web—they tell you where things are and whether you can access them.

llms.txt adds a table of contents—a human-readable guide to what's important.

ARW is the API contract for the agentic web—describing not just where to look, but what you can do, how to do it safely, and under what terms.

Complete Stack Working Together

robots.txt → "You can crawl /docs but not /admin"
sitemap.xml → "Here are all my pages with last-modified dates"
llms.txt → "These pages are most relevant for AI agents"
.well-known/arw-* → "Here's the structured data (policies, actions, content index)"
.llm.md → "Here's the content optimized for your context window"
AI-* headers → "I can track which agents are accessing what"
OAuth actions → "Agents can add to cart with user consent"
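
Put together, an agent's first visit might look like this (every URL beyond robots.txt, sitemap.xml, and llms.txt is an assumed ARW file location, not one taken from the spec):

    # End-to-end sketch of the chain above; ARW file locations are assumed.
    import requests

    BASE = "https://example.com"

    robots = requests.get(f"{BASE}/robots.txt").text          # may I crawl?
    sitemap = requests.get(f"{BASE}/sitemap.xml").text        # what exists?
    llms = requests.get(f"{BASE}/llms.txt").text              # what matters?
    manifest = requests.get(f"{BASE}/.well-known/arw.yaml")   # structured data
    view = requests.get(
        f"{BASE}/docs/quickstart.llm.md",                     # compact content
        headers={"AI-Agent": "ExampleAgent/0.1"},             # observable access
    )
    # OAuth-enforced actions (add_to_cart, etc.) would follow, as sketched earlier.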

When to Use What

Use llms.txt alone if:

  • You have a simple content site (blog, docs, marketing)
  • You don't need agent-initiated actions
  • You want a quick 30-minute implementation
  • You're just getting started with AI agent support

Add ARW if you need:

  • Transactional capabilities (e-commerce, bookings, forms)
  • Observability into agent traffic and usage
  • Token efficiency (85% reduction vs HTML scraping)
  • Explicit policies for training/inference control
  • Agent-to-agent protocols (MCP, ACP, A2A)
  • Progressive conformance with ARW-1 → ARW-4 badges

Start with llms.txt, Enhance with ARW

Both are valid. Both are useful. ARW just takes you further when you need it.