# The Five Layers of AEO

Render, navigate, structure, content, measure. Miss one and the next can't help you.

By AgentSite · 6 min read · Updated 2026-05-23

Agent readability is the foundation of AEO. SEO was a ranking problem; AEO is a citation problem: either ChatGPT pulls a paragraph from your page, or it doesn't. Five layers stack in order: render, navigate, structure, content, measure.

The layers are a chain, not a checklist. Layer 1 is whether a bot can read the bytes you serve. Layer 5 is whether anyone is quoting you. Everything in between decides whether the trip from bytes to citation actually happens. Most sites have a real problem on at least three of the five and don't know it.

This is the model we use at AgentSite. There's a longer treatment in our [AEO essay](/aeo); this page is the map.

## Layer 1 — Render: can the bot read the bytes?

A live reader hits your URL. GPTBot, ClaudeBot, PerplexityBot, the ChatGPT user-initiated fetch. None of them execute JavaScript. Vercel measured 569 million GPTBot requests across their network and 370 million from Claude in a single month, and reported that "none of the major AI crawlers currently render JavaScript." ([Vercel, "The Rise of the AI Crawler," Dec 2024](https://vercel.com/blog/the-rise-of-the-ai-crawler).)

If your site is a Vue / React / Svelte / Angular single-page app — or anything built with Lovable, v0, or Bolt — every one of those agents sees `<div id="app"></div>` and leaves. No error. No retry. No log entry on your side that says "GPTBot tried and gave up." You just stop existing for that question.

This is the layer that gates everything else. Your page can lay out the cleanest framework in the world; if it renders client-side, the citation engine never sees it. Server-rendered HTML on every route is table stakes.
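One rough way to check this layer is to read the raw HTML the way a non-rendering client would: no JavaScript, just the bytes. The sketch below is a heuristic, not a description of how any particular crawler works. The function names and the 200-character threshold are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Text a client sees in the served HTML without running any JavaScript."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_client_rendered(html: str, min_chars: int = 200) -> bool:
    """Heuristic (assumed threshold): almost no text outside scripts suggests an empty shell."""
    return len(visible_text(html)) < min_chars
```

Feed it the body of a plain HTTP GET (for example, one made with `urllib.request.urlopen`). A page that comes back as `<div id="app"></div>` plus a script bundle yields an empty string and gets flagged.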

## Layer 2 — Navigate: can the bot find what's where?

Once a bot can read individual pages, the next problem is inventory. Which pages exist? Which one answers the question being asked? `sitemap.xml` has done this job for search crawlers for two decades. It doesn't quite fit for agents — sitemaps are link dumps; an agent ingesting your site wants a curated overview that fits in a context window.

Jeremy Howard proposed [`/llms.txt`](/llms-txt) in September 2024 as the standard for that overview: "A proposal to standardise on using an `/llms.txt` file to provide information to help LLMs use a website at inference time." ([llmstxt.org](https://llmstxt.org/).) Short markdown, links into detail at `/install.md` and `/docs.md`, no JavaScript in sight.

Per-page `.md` mirrors are the same idea at the route level. Every page exists at both `/path` and `/path.md`. Markdown reads better to language models than HTML does: fewer tokens, less chrome, same content. The two files compose: `llms.txt` is the index, `/path.md` is the chapter.

If layer 1 is "can the bot read?", layer 2 is "can the bot navigate?". A site with neither file is a maze; with both, it's a manual.
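As a minimal sketch of what the index can look like, the snippet below assembles an `llms.txt` body in the shape the llmstxt.org proposal describes: an H1 title, a blockquote summary, then H2 sections containing link lists. The `Page` class, the `build_llms_txt` function, and the sample values in the usage note are hypothetical, not part of the proposal or of AgentSite.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str      # ideally the markdown mirror, e.g. "/install.md"
    summary: str


def build_llms_txt(site_name: str, tagline: str, sections: dict[str, list[Page]]) -> str:
    """Render an llms.txt body: H1, blockquote summary, H2 sections of links."""
    lines = [f"# {site_name}", "", f"> {tagline}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            lines.append(f"- [{page.title}]({page.url}): {page.summary}")
        lines.append("")
    # Collapse trailing blank lines to a single newline at end of file.
    return "\n".join(lines).rstrip() + "\n"
```

Calling `build_llms_txt("Example", "A demo site.", {"Docs": [Page("Install", "/install.md", "How to install.")]})` produces a short markdown file whose single link points at the markdown mirror rather than the HTML route.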

## Layer 3 — Structure: does the page say what it is?

A page can be rendered, indexed, and still be ambiguous to an agent. Is it an article? A FAQ? A how-to? A product page? An organization homepage? The page itself doesn't say. Human visitors infer from layout; agents need it stated.

That's what JSON-LD does. [Schema.org](https://schema.org) defines a vocabulary of types (`Article`, `FAQPage`, `HowTo`, `Organization`, `BreadcrumbList`, `Product`), each with its own set of properties. A `<script type="application/ld+json">` block at the top of the page declares the type, the author, the publication date, the headline. Agents can lift those fields directly when deciding whether to quote the page and how to attribute it.

`FAQPage` is the type that pays off most for AEO because the citation surface is literal: question-as-heading, answer-as-paragraph, both wrapped in schema, both extractable by the agent verbatim. The agent doesn't need to guess what your H2 means. You told it.
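To make the shape concrete, here is a small sketch that builds a `FAQPage` object from question/answer pairs and wraps it in a JSON-LD script tag. The `FAQPage`, `Question`, `Answer`, `mainEntity`, `name`, `acceptedAnswer`, and `text` names come from Schema.org; the two helper functions are hypothetical and only illustrate the structure.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a Schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def jsonld_script_tag(data: dict) -> str:
    """Serialize to a <script> tag for server-rendered HTML."""
    # Escape "</" so a stray "</script>" inside an answer cannot close the tag early.
    # "\/" is a valid JSON escape, so parsers still recover the original text.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

The question text should match the visible heading on the page and the answer text should match the visible paragraph, so the markup and the rendered content say the same thing.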

Layers 1 and 2 say whether the bot can read you. Layer 3 says what kind of thing each page is. A page without it isn't broken; it's just less quotable than the same page with it.

## Layer 4 — Content: is the page worth quoting?

Now the bot can read, navigate, and classify your pages. Whether it actually quotes you is content quality. This is the only layer no tool can ship for you.

The peer-reviewed work here is the GEO paper out of Princeton. Aggarwal et al., KDD 2024, tested content-optimization tactics in a controlled experiment against generative engines and found "GEO can boost visibility by up to 40% in generative engine responses." ([Aggarwal et al., 2024, arXiv:2311.09735](https://arxiv.org/abs/2311.09735).) The strongest individual tactics were adding statistics, citing named sources inline, and including direct quotes. Keyword stuffing was the only tactic with a measured negative effect.

Translated to a writing rule: cite named sources with named statistics. Don't pad with keywords. Lead each page with the answer in the first 40-60 words so an agent can lift it verbatim. The page you're reading does the same thing.
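The 40-60 word lead is the one part of this rule that is easy to check mechanically. The sketch below is a rough proxy only: it counts the words in the first non-heading paragraph of a markdown page and assumes paragraphs are separated by blank lines. It says nothing about whether that paragraph actually answers the question. Both function names are hypothetical.

```python
def lead_paragraph(markdown: str) -> str:
    """Return the first paragraph that is not a heading, or '' if there is none."""
    for block in markdown.split("\n\n"):
        text = " ".join(line.strip() for line in block.strip().splitlines())
        if text and not text.startswith("#"):
            return text
    return ""


def lead_in_range(markdown: str, low: int = 40, high: int = 60) -> bool:
    """True when the lead paragraph's word count falls inside [low, high]."""
    return low <= len(lead_paragraph(markdown).split()) <= high
```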

This layer is human work. AgentSite renders, generates, and validates the technical layers below it. Layer 4 is whatever your writers and engineers actually put on the page.

## Layer 5 — Measure: who quotes you when?

The last layer is external. You don't ship it; you observe it. Are you being mentioned in AI answers? Which engines? For which prompts? Which competitors come up instead?

The traffic itself is real. Cloudflare reported in July 2024 that AI bots had accessed roughly 39% of the top one million Internet properties in a single month, with GPTBot alone reaching 35.46% of them. ([Cloudflare, "Declaring Your AIndependence," July 2024](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click/).) Only 2.98% of the top million were actively blocking AI bots when that report ran — the other 97% were available for citation, in principle. Whether each one was actually getting cited is layer 5.

This is what mention-tracking tools measure: send a panel of category prompts to the major engines on a schedule, count how often you come up, who else does. The number that matters is inclusion rate, per engine, sampled often enough to average out the non-determinism of model output.
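The arithmetic behind inclusion rate is simple enough to sketch. The function below assumes you have already collected answers as `(engine, answer_text)` pairs; how you collect them, and the naive case-insensitive substring match used to detect a mention, are assumptions for illustration rather than how any particular tracking tool works.

```python
from collections import defaultdict


def inclusion_rates(samples, brand: str) -> dict:
    """Per-engine share of sampled answers that mention `brand`.

    samples: iterable of (engine, answer_text) pairs, ideally many per engine
    so that run-to-run variation in model output averages out.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    needle = brand.casefold()
    for engine, answer in samples:
        totals[engine] += 1
        if needle in answer.casefold():  # naive substring match (assumption)
            hits[engine] += 1
    return {engine: hits[engine] / totals[engine] for engine in totals}
```

A substring match will miss paraphrased mentions and can over-count brands whose names are common words, so treat the output as a trend line rather than an exact figure.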

Layer 5 isn't an end state. It's the feedback signal for everything below it. If your inclusion rate is zero, the question is which earlier layer is broken — not which marketing line to A/B test.

## Why the order matters

The layers don't just stack; they gate. A perfect `FAQPage` schema does nothing if the bot couldn't render the page. A great `llms.txt` does nothing if the linked pages return empty HTML. A 40% visibility lift from the GEO playbook does nothing if no agent is reaching the content in the first place.

The expensive mistake is to start in the middle. Plenty of teams discover AEO through layer 5 — they buy a monitoring tool, see a low inclusion rate, and start tuning content. The content was probably fine. The page rendered empty.

Start at layer 1. Then layer 2. Then layer 3. The first three are deterministic technical work. Layer 4 is where your writers earn their keep, with a scorecard pointing them at the highest-impact changes. Layer 5 is the receipt at the end.
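The gating logic can be written down in a few lines. This is an illustrative sketch, not AgentSite's scoring code: it takes a pass/fail result per layer (however you obtained it) and returns the earliest layer that fails, which is where the work should start.

```python
from typing import Optional

LAYERS = ["render", "navigate", "structure", "content", "measure"]


def first_broken_layer(passed: dict) -> Optional[str]:
    """Return the earliest failing layer, or None if every layer passes.

    A layer missing from `passed` is treated as failing, since an
    unchecked layer cannot be assumed healthy.
    """
    for layer in LAYERS:
        if not passed.get(layer, False):
            return layer
    return None
```

With render failing and measure failing, the function returns `"render"`: the low inclusion rate is a symptom, and the empty page is the cause.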

## What this means for you

If your pages render client-side, Claude on the web can't read your website. Most sites are losing on at least three of these layers right now, silently, because failed bot reads don't surface in analytics.

The diagnostic is 90 seconds. [Run your AEO score](/score) — eight dimensions across the five layers, plus a live citation probe. You get a punch list ranked by impact. You decide what to do about it.

For more depth on any individual layer, the [AEO essay](/aeo) covers each one in detail. [AEO in Pictures](/aeo-in-pictures) is the same story told as diagrams and before/after code. The [docs](/docs) describe what the AgentSite middleware emits at each layer when you install it.