LLMs.txt

Also known as: AI Crawler Directives, LLM Directives File

A standardized file that provides instructions to AI crawlers about how to interpret and use website content.

What is LLMs.txt?

LLMs.txt is an emerging standard file format that provides explicit instructions to AI systems and large language models about how to crawl, interpret, and use a website's content. Similar to robots.txt for traditional search engines, LLMs.txt allows website owners to communicate preferences and boundaries directly to AI systems, including content usage permissions, context requirements, and attribution expectations.

Why It Matters

As AI systems increasingly scrape and learn from web content, LLMs.txt provides a mechanism for content owners to maintain some control over how their information is used. It helps ensure AI systems respect content boundaries, maintain proper context, and provide appropriate attribution. For businesses and content creators, implementing LLMs.txt is becoming an important part of managing their digital presence in an AI-first world.

Use Cases

Content Usage Permissions

Specify which content AI systems can use for training or generation

Context Preservation

Ensure AI systems maintain important context when referencing content

Attribution Requirements

Define how AI systems should attribute content when used in responses

Content Boundaries

Establish which sections of a site should not be used by AI systems

Freshness Signals

Indicate content update frequency and importance of recency

Optimization Techniques

To implement LLMs.txt effectively: place the file at the root of your domain (e.g., /llms.txt); use clear, consistent directive syntax; specify content sections with precise URL patterns; state attribution requirements explicitly; and update the file regularly as your content and AI usage preferences evolve.
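Before publishing, it can help to sanity-check the file for malformed lines. A minimal lint sketch, assuming the directive vocabulary shown in the Code Example below (these names are illustrative; there is no ratified LLMs.txt specification):

```python
# Known directive names mirror this glossary's example file;
# they are an assumption, not a formal standard.
KNOWN_DIRECTIVES = {
    "user-agent", "crawl-delay", "attribution-required", "attribution-format",
    "content-refresh", "allow", "disallow", "context-required",
    "preserve-sections", "freshness-critical", "allow-purpose", "disallow-purpose",
}

def lint_llms_txt(text: str) -> list[str]:
    """Return warnings for lines that are neither comments nor known directives."""
    warnings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if ":" not in line:
            warnings.append(f"line {lineno}: missing ':' separator")
            continue
        key = line.split(":", 1)[0].strip().lower()
        if key not in KNOWN_DIRECTIVES:
            warnings.append(f"line {lineno}: unknown directive '{key}'")
    return warnings
```

Running `lint_llms_txt` over a well-formed file returns an empty list; typos in directive names or missing colons surface as warnings with line numbers.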

Metrics

Effectiveness of LLMs.txt implementation can be measured through AI system compliance with directives, proper attribution in AI-generated content, context preservation in AI references, and reduction in misuse or misrepresentation of content.
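One of these metrics, attribution compliance, can be tallied from logged AI citations of your content. A hedged sketch, where the record shape (`agent`, `attributed`) is a hypothetical logging format for illustration:

```python
def attribution_compliance(records: list[dict]) -> float:
    """Share of logged AI citations that carried the requested attribution.

    records: [{'agent': str, 'attributed': bool}, ...] -- an assumed log format.
    Returns a value between 0.0 and 1.0; 0.0 when there are no records.
    """
    if not records:
        return 0.0
    attributed = sum(1 for r in records if r.get("attributed"))
    return attributed / len(records)
```

Tracked over time, a rising ratio suggests AI systems are honoring the Attribution-required directive.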

LLM Interpretation

LLMs interpret LLMs.txt as a set of instructions that guide their behavior when crawling, indexing, and referencing a website's content. They use these directives to determine what content they can access, how they should maintain context, and what attribution requirements they should follow when generating responses that reference the content.
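A minimal sketch of how a crawler might apply Allow/Disallow rules from such a file. Longest-matching-prefix precedence is an assumption here, borrowed from common robots.txt practice; LLMs.txt has no ratified precedence rule:

```python
def parse_rules(text: str) -> list[tuple[str, str]]:
    """Extract (directive, path) pairs for Allow/Disallow lines."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key.lower() in ("allow", "disallow"):
            rules.append((key.lower(), value))
    return rules

def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Longest-prefix match wins; default to allow when no rule applies."""
    best = None
    for directive, prefix in rules:
        if path.startswith(prefix):
            if best is None or len(prefix) > len(best[1]):
                best = (directive, prefix)
    return best is None or best[0] == "allow"
```

For example, with `Allow: /blog/` and `Disallow: /support/`, a path under /blog/ is crawlable while /support/faq is not.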

Code Example

# Example LLMs.txt file
# Version: 1.0
# Last Updated: 2023-10-15

# Global Settings
User-agent: *
Crawl-delay: 5
Attribution-required: true
Attribution-format: "Source: {site_name} ({url})"
Content-refresh: weekly

# Allow AI systems to use blog content with attribution
Allow: /blog/
Context-required: true
Preserve-sections: introduction, conclusion

# Disallow AI systems from using customer support pages
Disallow: /support/
Disallow: /help/

# Special instructions for product pages
Allow: /products/
Context-required: true
Preserve-sections: specifications, pricing
Freshness-critical: true

# Disallow training on user-generated content
Disallow: /reviews/
Disallow: /comments/
Disallow-purpose: training

# Allow summarization but not verbatim reproduction
Allow: /research/
Allow-purpose: summarization
Disallow-purpose: verbatim-reproduction

Structured Data

{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "LLMs.txt",
  "alternateName": [
    "AI Crawler Directives",
    "LLM Directives File"
  ],
  "description": "A standardized file that provides instructions to AI crawlers about how to interpret and use website content.",
  "inDefinedTermSet": {
    "@type": "DefinedTermSet",
    "name": "AI Optimization Glossary",
    "url": "https://geordy.ai/glossary"
  },
  "url": "https://geordy.ai/glossary/geo-fundamentals/llms-txt"
}

Term Details

Category: GEO Fundamentals
Type: technique
Expertise Level: developer
GEO Readiness: essential