For running validators and tools
Any static or dynamic site
Depending on site complexity
This YAML file is the single source of truth for AI agents. It declares your content, actions, and policies.
# Agent-Ready Web Discovery Manifest
# https://yoursite.com/llms.txt
version: 1.0
profile: ARW-1
# Site Information
site:
  name: "Your Site Name"
  description: "Brief description of your site"
  homepage: "https://yoursite.com"
  contact: "ai@yoursite.com"
# Machine-Readable Content
# Note: Last-modified dates are in sitemap.xml per web standards
content:
  # Homepage
  - url: /
    machine_view: /index.llm.md
    purpose: homepage
    priority: high
  # About page
  - url: /about
    machine_view: /about.llm.md
    purpose: documentation
    priority: medium
  # Product or content pages
  - url: /products/example
    machine_view: /products/example.llm.md
    purpose: product_information
    priority: high
    chunks:
      - id: product-summary
        heading: "Product Overview"
      - id: product-specs
        heading: "Specifications"
# Usage Policies
policies:
  training:
    allowed: false
    note: "Content not licensed for model training"
  inference:
    allowed: true
    restrictions: ["attribution_required"]
  attribution:
    required: true
    format: "link"
    template: "Source: Your Site Name (https://yoursite.com)"

Machine views are clean Markdown versions of your pages, optimized for AI agents. 85% smaller than HTML, no navigation or ads.
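Machine views mark their sections with `<!-- chunk: id -->` comments, which is what lets an agent address one part of a page. As a rough illustration of how a consumer might read them, here is a minimal sketch in Python. The function name `split_chunks` is hypothetical, and the rule that a chunk runs until the next marker is an assumption rather than something this guide specifies.

```python
import re

# Marker syntax as used in machine views: <!-- chunk: some-id -->
CHUNK_MARKER = re.compile(r"<!--\s*chunk:\s*([\w-]+)\s*-->")


def split_chunks(markdown: str) -> dict[str, str]:
    """Map each chunk id to the text between its marker and the next one.

    Assumption: a chunk ends at the next marker or at the end of the file.
    Text before the first marker (such as the page title) is ignored.
    """
    parts = CHUNK_MARKER.split(markdown)
    # With one capture group, re.split returns
    # [preamble, id1, body1, id2, body2, ...]
    ids, bodies = parts[1::2], parts[2::2]
    return {chunk_id: body.strip() for chunk_id, body in zip(ids, bodies)}


sample = """# Your Site Name
<!-- chunk: overview -->
## Overview
We provide [product/service] for [target audience].
<!-- chunk: features -->
## Features
- **Feature 1**: Description of feature 1
"""

chunks = split_chunks(sample)
print(list(chunks))
```

The ids recovered this way are the ones a manifest entry would list under `chunks`, so a mismatch between the two is an easy thing to check for.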
# Your Site Name
<!-- chunk: overview -->
## Overview
Clear, semantic content optimized for LLM parsing.
No navigation, ads, or clutter—just pure content.
We provide [product/service] for [target audience].
<!-- chunk: features -->
## Features
- **Feature 1**: Description of feature 1
- **Feature 2**: Description of feature 2
- **Feature 3**: Description of feature 3
<!-- chunk: contact -->
## Contact
- Email: contact@yoursite.com
- GitHub: https://github.com/yourusername
- Twitter: @yourhandle
Use the ARW CLI to generate machine views from your HTML:
npx arw@alpha generate ./pages --recursive
Add `<link>` tags to your HTML `<head>` and set HTTP headers for discovery and observability.
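Seen from the agent's side, discovery through the `<head>` amounts to reading the `rel="alternate"` links. The following is only a sketch of that idea using Python's standard `html.parser`; the class name `AlternateLinkParser` is hypothetical, and real agents may discover machine views differently.

```python
from html.parser import HTMLParser


class AlternateLinkParser(HTMLParser):
    """Collect <link rel="alternate"> tags as a {type: href} mapping."""

    def __init__(self) -> None:
        super().__init__()
        self.alternates: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here.
        if tag != "link":
            return
        attr = dict(attrs)
        if attr.get("rel") == "alternate" and attr.get("type") and attr.get("href"):
            self.alternates[attr["type"]] = attr["href"]


html = """<head>
  <link rel="alternate" type="text/x-llm+markdown" href="/index.llm.md" />
  <link rel="alternate" type="application/yaml" href="/llms.txt" />
</head>"""

parser = AlternateLinkParser()
parser.feed(html)
print(parser.alternates.get("text/x-llm+markdown"))  # prints /index.llm.md
```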
<head>
  <!-- ARW Discovery -->
  <link rel="alternate" type="text/x-llm+markdown" href="/index.llm.md" />
  <link rel="alternate" type="application/yaml" href="/llms.txt" />
</head>
# For .llm.md files
Content-Type: text/x-llm+markdown; charset=utf-8
AI-Attribution: required
AI-Inference: allowed
AI-Training: not-allowed
# For /llms.txt
Content-Type: application/yaml; charset=utf-8
location ~ \.llm\.md$ {
    # Set the MIME type with default_type: add_header would append a second
    # Content-Type header next to the one nginx already sends.
    types { }
    default_type "text/x-llm+markdown; charset=utf-8";
    add_header AI-Attribution "required";
    add_header AI-Inference "allowed";
    add_header AI-Training "not-allowed";
}
location = /llms.txt {
    types { }
    default_type "application/yaml; charset=utf-8";
}
Use the ARW validator to check compliance and identify issues.
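Alongside the validator, the response headers for a machine view can be spot-checked by hand. The sketch below is not part of the ARW tooling: `check_headers` and `EXPECTED_HEADERS` are hypothetical names, and the expected values are simply the ones this guide lists for `.llm.md` files, so a site with different policies would need to adjust them.

```python
# Expected values copied from the header listing for .llm.md responses.
EXPECTED_HEADERS = {
    "content-type": "text/x-llm+markdown; charset=utf-8",
    "ai-attribution": "required",
    "ai-inference": "allowed",
    "ai-training": "not-allowed",
}


def check_headers(headers: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every header matched."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    seen = {name.lower(): value.strip() for name, value in headers.items()}
    problems = []
    for name, expected in EXPECTED_HEADERS.items():
        actual = seen.get(name)
        if actual is None:
            problems.append(f"missing header: {name}")
        elif actual.lower() != expected:
            problems.append(f"{name}: expected {expected!r}, got {actual!r}")
    return problems


# Headers as they might be copied from a `curl -I` response.
response_headers = {
    "Content-Type": "text/x-llm+markdown; charset=utf-8",
    "AI-Attribution": "required",
    "AI-Inference": "allowed",
}
print(check_headers(response_headers))  # prints ['missing header: ai-training']
```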
npx arw-validator https://yoursite.com
# Python validator
python tools/validators/validate-arw.py ./public/llms.txt
# Node.js validator
node tools/validators/validate-arw.mjs ./public/llms.txt schemas/arw_model.json
Open the ARW Inspector web tool and enter your URL:
Launch ARW Inspector
55KB HTML → 8KB Markdown
2 seconds vs 5 minutes
Typical medium site
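The 55KB to 8KB comparison is where the roughly 85% size reduction quoted for machine views comes from. A short worked calculation, with `size_reduction` as a hypothetical helper name, makes the arithmetic explicit and can be reused with your own page sizes:

```python
def size_reduction(html_kb: float, markdown_kb: float) -> float:
    """Percentage by which the Markdown view is smaller than the HTML page."""
    return (html_kb - markdown_kb) / html_kb * 100


# (55 - 8) / 55 = 0.8545..., i.e. about 85%
print(f"{size_reduction(55, 8):.1f}%")  # prints 85.5%
```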
You've implemented ARW-1 (Discovery Ready). AI agents can now:
Add advanced features:
Visualize your implementation: