# AEO in Pictures

The snippet, the cache, the response — and why every visitor gets the same HTML.

By AgentSite · 8 min read · Updated 2026-05-23

Read [the long-form essay](/aeo) first if you want the _why_. This page is the _what_: diagrams and code samples grounded in the real snippet code at `packages/express/agentsite.cjs`.

The single architectural fact that holds everything else together: **the snippet does not branch on who is asking.** Every visitor — human browser, GPTBot, ClaudeBot, Perplexity, your team's `curl` — receives byte-identical HTML. The branch is on the _kind of response_ (HTML page vs. site-level agent asset vs. static file), not on the user agent. Bot identity is something we _log_, not something we route on.

* * *

## The model in one diagram

Incoming request (any UA, any visitor) → AgentSite snippet (in your stack) → What's being asked for?

| What's being asked for | Snippet path | Response |
| --- | --- | --- |
| **HTML page** (SPA route, e.g. `/pricing`) | Fetch RenderBundle (snippet cache → backend), then `applyBundle`: title, meta, OG, JSON-LD, schema graph, markdown body | Enriched HTML |
| **Site-level agent asset** (`llms.txt`, sitemap, `robots.txt`, `.well-known/ai-agent.json`, etc.) | Snippet-owned route (asset cache → backend) | Agent-readable doc |
| **Static file** (JS, CSS, images, fonts) | Pass-through (`express.static` / upstream) | Origin bytes, untouched |

Three response classes. One enrichment path (HTML pages), one fast cached-asset path (site-level agent docs), one pass-through path (everything else). The branch is on what's being served — not on who's asking.
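
As a rough illustration of that branch, here is a minimal sketch of a path-only classifier. The function name, the type, and the decision to group `.md` mirrors with the agent assets are assumptions made for this example; the route names are the ones listed on this page, and the list is not exhaustive.

```ts
// Hypothetical sketch: the route names come from this page, the function does not.
type ResponseClass = 'html-page' | 'agent-asset' | 'static-file'

// Site-level assets the snippet owns (partial list, for illustration only).
const SNIPPET_OWNED_ROUTES = new Set<string>([
  '/robots.txt',
  '/sitemap.xml',
  '/llms.txt',
  '/llms-full.txt',
  '/ai.txt',
  '/.well-known/ai-agent.json',
  '/.well-known/agents.json',
])

// Decide what kind of response a path calls for. Note what is NOT a
// parameter: the user agent never enters the decision.
function classifyResponse(path: string): ResponseClass {
  if (SNIPPET_OWNED_ROUTES.has(path) || path.endsWith('.md')) return 'agent-asset'
  // Any other path whose last segment has an extension is treated as a static file.
  const lastSegment = path.slice(path.lastIndexOf('/') + 1)
  if (lastSegment.includes('.')) return 'static-file'
  return 'html-page'
}
```

Because the only input is the path, two callers asking for the same URL cannot land in different classes.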

* * *

## The four lines

```ts
// server.ts (Express; Next/Nuxt/Astro/etc. have the same one-import shape)
const express = require('express')
const path = require('path')
const agentsite = require('./agentsite')

const app = express()
const DIST = path.join(__dirname, 'dist')
app.use(express.static(DIST, { index: false }))
app.get('*', agentsite({             // Express 4 wildcard; Express 5 needs a named one, e.g. '/{*splat}'
  distDir: DIST,                     // ← read index.html from disk
  site: 'https://yourdomain.com',
  token: process.env.AGENTSITE_TOKEN,
}))
app.listen(3000)
```

That's the install. No browser to operate. No render queue to manage. No cache to run. No standards to track.

* * *

## What every visitor sees

Same URL. Same fetch. Same response bytes regardless of who's calling. The difference is what each consumer reads from those bytes.

### Raw shell — what your SPA ships by default

```bash
$ curl -H "X-Agentsite: none" https://yoursite.com/pricing
```

The `X-Agentsite: none` header tells the snippet to skip enrichment — that's the only way to see the un-enriched shell. (We use this flag internally so re-rendering an already-AgentSite-running site doesn't double-inject; the landing-page "before injection" demo uses it too.)

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>YourApp</title>
    <link rel="stylesheet" href="/assets/index-d8f3a.css" />
    <script type="module" src="/assets/index-9e21b.js"></script>
  </head>
  <body>
    <div id="app"></div>
  </body>
</html>
```

Word count: zero. Structured data: zero. Citation likelihood: zero. The crawler leaves.

### Enriched shell — what AgentSite serves to every visitor

Same `curl`, no header flip. The snippet fetches a RenderBundle from the backend (or pulls it from its in-memory cache), then patches `<head>`, injects JSON-LD, and splices the markdown body into a preboot `<div>` just before `</body>`. A `<style>` rule in `<head>` hides that div the moment JavaScript boots, so humans never see it, while bots and extractors (including those that strip `<noscript>`) read it cleanly. A per-site dashboard override switches to legacy `<noscript>` mode for tools that honor it, such as Claude Code WebFetch.
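
To make the patch-and-splice step concrete, here is a minimal string-based sketch. It is not the real `applyBundle`: the `SketchBundle` shape, the function names, and the escaping choices are assumptions for illustration, and the real bundle carries more fields (OG tags, canonical link, cache metadata).

```ts
// Hypothetical bundle shape; the real RenderBundle has more fields.
interface SketchBundle {
  title: string
  description: string
  jsonLd: object[]
  markdown: string
}

// Escape text before placing it in HTML attributes or element content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

function applyBundleSketch(shell: string, bundle: SketchBundle): string {
  const headTags = [
    `<meta name="description" content="${escapeHtml(bundle.description)}" />`,
    ...bundle.jsonLd.map(
      (node) =>
        // Escape "<" so a "</script>" inside the data cannot close the block early.
        `<script type="application/ld+json" data-agentsite="schema">${JSON.stringify(node).replace(/</g, '\\u003c')}</script>`
    ),
  ].join('\n')

  const preboot =
    '<div id="agentsite-preboot" aria-hidden="true">' +
    `<pre data-content-type="text/markdown" data-source="agentsite">${escapeHtml(bundle.markdown)}</pre>` +
    '</div>'

  // Function replacers keep "$" in the content literal (e.g. "$49/mo").
  return shell
    .replace(/<title>[^<]*<\/title>/, () => `<title>${escapeHtml(bundle.title)}</title>`)
    .replace('</head>', () => `${headTags}\n</head>`)
    .replace('</body>', () => `${preboot}\n</body>`)
}
```

The sketch leaves `<div id="app">` and the asset tags untouched, so the SPA boots exactly as it would from the raw shell.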

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Pricing — YourApp</title>
    <meta name="description" content="YourApp pricing: Free, Pro at $49/mo, Studio at $149/mo. Includes per-page render limits, FAQ schema injection, and dashboard access." />
    <meta property="og:title" content="Pricing — YourApp" />
    <meta property="og:description" content="..." />
    <meta property="og:image" content="https://yoursite.com/og/pricing.png" />
    <link rel="canonical" href="https://yoursite.com/pricing" />

    <script type="application/ld+json" data-agentsite="bundle">
    { "source": "agentsite", "engine": "agentsite-renderer", "cacheStatus": "hit", "url": "/pricing", "renderedAt": "...", "expiresAt": "..." }
    </script>

    <script type="application/ld+json" data-agentsite="schema" data-type="FAQPage">
    { "@context": "https://schema.org", "@type": "FAQPage", "mainEntity": [ ... ] }
    </script>

    <link rel="stylesheet" href="/assets/index-d8f3a.css" />
    <script type="module" src="/assets/index-9e21b.js"></script>
  </head>
  <body>
    <div id="app"></div>
    <div id="agentsite-preboot" aria-hidden="true">
      <pre data-content-type="text/markdown" data-source="agentsite">
# YourApp Pricing

> Free for hobby projects. Pro for active sites. Studio for content-heavy or UGC products. See /pricing for current rates.

## What's included in the Free tier?
Free includes a one-time kickoff credit, 7-day cache, and pay-as-you-go top-up.

## How does Pro pricing work?
Pro tops up your weekly plan units; see /pricing for current allotments and rates.

...
      </pre>
    </div>
  </body>
</html>
```

Same bytes whether a human browser or GPTBot just fetched it. The next diagram shows why that works.

* * *

## Same response, two readers

The enriched HTML above, byte-identical for every visitor, is read two ways:

| Reader | What it does with the response | What it ends up with |
| --- | --- | --- |
| **Human browser** (JS enabled) | Boots the SPA from `div#app`; preboot div hidden by inline CSS | The designed UI |
| **Bot / crawler / agent** (no JS execution) | Reads head meta, JSON-LD blocks, preboot markdown body | Title, description, OG, schema graph, full page markdown |

This is the diagram the trust argument rests on. There is no cloaking. There is no per-bot tailoring. There is one response. The parts of the HTML that a browser never displays are exactly the parts agents are wired to read: the preboot `<div>`, hidden by an inline `<style>` rule once `data-agentsite-js="1"` is set on `<html>`, and the `<script type="application/ld+json">` blocks, which the browser does not render as page content. The HTML standard works in our favor.
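
One way to picture the hide rule is as a style/script pair in `<head>`. The sketch below is an assumption about the markup, not a copy of the snippet's output; only the attribute name `data-agentsite-js` and the `agentsite-preboot` id come from this page.

```ts
// Attribute name and div id are from this page; the exact markup is an assumption.
const PREBOOT_HIDE_STYLE =
  '<style>html[data-agentsite-js="1"] #agentsite-preboot{display:none}</style>'

// Runs only where JavaScript executes, so a non-JS reader never sets the attribute.
const PREBOOT_FLAG_SCRIPT =
  '<script>document.documentElement.setAttribute("data-agentsite-js","1")</script>'

// Returns the head fragment that hides the preboot div for JS-capable browsers.
function prebootHeadFragment(): string {
  return PREBOOT_HIDE_STYLE + PREBOOT_FLAG_SCRIPT
}
```

Both strings are static, so the fragment is the same for every request and adds no per-visitor variation to the response.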

* * *

## The markdown mirror

Append `.md` to any page path and the snippet returns the body in raw `text/markdown`:

```bash
$ curl https://yoursite.com/pricing.md
```

```markdown
# YourApp Pricing

> Free for hobby projects up to 50 pages. Pro at $49/month for active sites. Studio at $149/month for content-heavy or UGC products.

## What's included in the Free tier?

Free includes 50 pages, 1,000 renders per month, and 7-day cache.

## How does Pro pricing work?

Pro is $49 per month. You get 500 pages per site, 25,000 renders per month, daily cache freshness, and full dashboard access.

…
```

Same content as the preboot body, denser format, no HTML noise. Language models prefer reading markdown — lower token cost, higher comprehension. `Accept: text/markdown` also works on every route.
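
The two triggers (suffix and `Accept` header) can be sketched as a small predicate plus a path mapper. The names and the `MarkdownProbe` shape are hypothetical, and the real `wantsMarkdown` may handle more cases, such as quality values that exclude markdown.

```ts
// Hypothetical request shape: just the two inputs the decision needs.
interface MarkdownProbe {
  path: string
  accept?: string
}

// True when the caller asked for markdown by suffix or by Accept header.
function wantsMarkdownSketch(probe: MarkdownProbe): boolean {
  if (probe.path.endsWith('.md')) return true
  const accept = (probe.accept || '').toLowerCase()
  // Match text/markdown as a whole media type, with or without parameters.
  return /(^|,)\s*text\/markdown\s*(;|,|$)/.test(accept)
}

// Map a mirror path back to the page route whose bundle holds the markdown.
function pageRouteFor(path: string): string {
  return path.endsWith('.md') ? path.slice(0, -'.md'.length) || '/' : path
}
```

With this shape, `curl -H "Accept: text/markdown" https://yoursite.com/pricing` and `curl https://yoursite.com/pricing.md` resolve to the same page route.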

* * *

## Site-level agent assets

The snippet also owns a small set of canonical routes. They aren't rendered per request; they're served from the snippet's own asset cache, refreshed against the backend every 1–24 hours depending on volatility. Customer-shipped files in `dist/` always win (`express.static` runs first), so a hand-rolled `dist/llms.txt` overrides ours automatically.

| Layer | Role | Artifacts |
| --- | --- | --- |
| **Layer 4** | Agentic frontier | `/.well-known/ai-agent.json`, `/.well-known/agents.json`, `/.well-known/agent-skills/index.json`, `/.well-known/api-catalog`, `/.well-known/mcp/server-card.json`, `/ai.txt` |
| **Layer 3** | Structured data | Article JSON-LD, FAQPage JSON-LD, BreadcrumbList, Organization `sameAs` |
| **Layer 2** | Content map | `/llms.txt`, `/llms-full.txt`, `/path.md` mirrors |
| **Layer 1** | Crawler access | `robots.txt` (granular AI policy), `sitemap.xml` |

Layer 3 (JSON-LD) lands inside the enriched HTML response — it's per-page schema generated from each render. Layers 1, 2, and 4 are stand-alone files served from snippet-owned routes. One install lights up all four.

* * *

## The cache, in the middle

Why most requests never reach AgentSite at all:

1.  A request reaches the snippet, which checks its in-memory bundle cache (5 min TTL, per route).
    -   **Hit (the common case):** `applyBundle` → respond, at zero backend cost.
    -   **Miss:** `POST /render` to the backend.
2.  The backend checks for a RenderBundle row.
    -   **Hit:** return the cached bundle.
    -   **Miss:** Playwright render + artifact generation, then persist the bundle row + return.

Three tiers. The hot path is tier 1: the snippet keeps a 5-minute in-memory bundle per route inside your own process. Once a route is warm, applying the enrichment takes single-digit milliseconds and the backend sees nothing. Tier 2 is the backend's persistent RenderBundle row, which is durable across snippet restarts and shared across all your instances. Tier 3 is a fresh headless render, and only a tier-2 miss triggers it.
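
The in-process tier can be sketched as a per-route TTL map in front of a backend call. Everything named here (`RouteTtlCache`, `fetchBundleSketch`, `askBackend`) is hypothetical; only the 5-minute TTL and the lookup order come from this page. The clock is passed in explicitly so the expiry logic is easy to check.

```ts
// Hypothetical names throughout; only the 5-minute TTL and the lookup order come from this page.
const BUNDLE_TTL_MS = 5 * 60 * 1000

interface CachedEntry<V> {
  value: V
  expiresAt: number
}

class RouteTtlCache<V> {
  private entries = new Map<string, CachedEntry<V>>()

  constructor(private ttlMs: number) {}

  // Returns the cached value, or undefined when absent or expired.
  get(route: string, now: number): V | undefined {
    const entry = this.entries.get(route)
    if (!entry) return undefined
    if (now >= entry.expiresAt) {
      this.entries.delete(route)
      return undefined
    }
    return entry.value
  }

  set(route: string, value: V, now: number): void {
    this.entries.set(route, { value, expiresAt: now + this.ttlMs })
  }
}

// In-memory lookup first; on a miss, ask the backend, which owns the
// persistent row and the headless render.
async function fetchBundleSketch<V>(
  route: string,
  cache: RouteTtlCache<V>,
  askBackend: (route: string) => Promise<V>,
  now: number = Date.now()
): Promise<V> {
  const cached = cache.get(route, now)
  if (cached !== undefined) return cached
  const fresh = await askBackend(route)
  cache.set(route, fresh, now)
  return fresh
}
```

A production version would likely also coalesce concurrent misses for the same route so a burst of requests triggers one backend call; this sketch does not.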

**Implication:** for a typical SPA serving heavy bot traffic against a handful of routes, the backend sees a small fraction of the fetches. That's the unit-pricing model — you pay for fresh renders, not for serves. Cached delivery is free.
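
A back-of-the-envelope bound shows why the fraction is small. The figures below are illustrative, not measurements, and the formula assumes concurrent misses for the same route are rare enough to ignore. With a TTL of $T$ minutes, each route can miss the in-memory cache at most $\lceil W/T \rceil$ times in a window of $W$ minutes per process, whatever the traffic.

```ts
// Upper bound on snippet-to-backend POSTs in a window. Hypothetical helper:
// it counts in-memory cache misses, not fresh renders (the backend may still
// answer from its persistent row).
function maxBackendPosts(
  routes: number,
  windowMinutes: number,
  ttlMinutes: number,
  instances: number
): number {
  return routes * Math.ceil(windowMinutes / ttlMinutes) * instances
}

// 10 routes, one hour, 5-minute TTL, one process: at most 120 POSTs,
// whether the hour carried 1,000 fetches or 1,000,000.
const hourlyCeiling = maxBackendPosts(10, 60, 5, 1)
```

Against 10,000 fetches in that hour, 120 backend calls is 1.2% of traffic, and fresh renders are a subset of those calls.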

* * *

## What actually runs per request

The real call path for an HTML route, paraphrased from `packages/express/agentsite.cjs`:

```ts
// agentsiteHandler: the function Express calls for every GET *
async function agentsiteHandler(req, res) {
  // 1. Snippet-owned routes branch first. These never reach the bundle path:
  //    /env.js, /robots.txt, /sitemap.xml, /llms.txt, /llms-full.txt,
  //    /.well-known/{api-catalog, openid-configuration, ai-agent.json,
  //    agents.json, agent-skills/index.json, mcp/server-card.json}, /ai.txt
  //    Each served from its own asset cache (1h–24h TTL).
  if (req.path === '/llms.txt') return serveLlmsIndex(res)
  // ... (other site-level asset branches)

  // 2. Markdown mirror: '.md' suffix OR Accept: text/markdown.
  //    Strips the suffix, fetches the bundle, returns bundle.markdown verbatim.
  if (wantsMarkdown(req)) return serveMarkdown(req, res)

  // 3. Static file with an extension (.css, .js, .png, ...): 404 here,
  //    because express.static would have served it before we ran if it existed.
  if (req.path.includes('.')) return res.status(404).send('Not found')

  // 4. HTML route: fetch the bundle (snippet cache → backend), then enrich.
  const bundle = await fetchBundle(req.path)        // hits 5-min in-memory cache
  if (!bundle) return res.send(indexHtml)           // backend down: ship raw shell
  if (bundle.cacheStatus === 'miss-generating') {   // first-ever fetch for this route
    res.set('Retry-After', '5')
    return res.send(injectGeneratingSignifier(indexHtml))
  }
  const finalBody = applyBundle(indexHtml, bundle)  // patch meta + JSON-LD + preboot div
  res.send(finalBody)

  // 5. Telemetry: fire-and-forget, batched, 60s flush.
  //    Classifies the UA into a coarse bucket (gptbot / claude-user /
  //    perplexity / human / unknown-bot ...) and pushes to a ring buffer.
  //    Humans are dropped. The response above does not vary by classification.
  //    (cacheHit, renderMs, responseBytes are measured in the real handler; elided here.)
  recordTelemetry(req, { cacheHit, renderMs, responseBytes })
}

Five steps for a cache hit. Six for a cache miss. The renderer and bundle generators run in our cloud — your stack pays no CPU for them.

* * *

## Telemetry — observed, not switching

The snippet classifies incoming user agents. It uses the classification in exactly one place: counting them.

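1.  A request arrives (UA = GPTBot, Claude-User, Chrome, etc.) and the snippet returns the identical enriched response regardless of UA.
2.  `classifyUserAgent` maps the UA to a coarse bucket: gptbot, claude-user, perplexity, …, human.
3.  Human-bucket records are dropped; the rest go into a ring buffer with a 500-record cap.
4.  Every 60s the buffer is flushed via `POST /telemetry/ingest`.
5.  The result is aggregated counts by crawler class + route.
6.  The dashboard reports, for example, "Perplexity fetched 12 pages this week, ChatGPT 3".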

Three rules:

1.  **The response does not vary by bot.** Every visitor receives the same enriched HTML. Telemetry observes; it doesn't switch.
2.  **No request body, no headers beyond a UA bucket and route path.** We don't capture query strings on URLs (PII risk), don't log auth headers, never see cookies. Human-classified requests are dropped before they hit the ring buffer.
3.  **Aggregated, not surveilled.** The dashboard shows _Perplexity fetched 12 pages this week, ChatGPT fetched 3_ — counts and patterns, not individual request traces.
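
The three rules can be sketched as a classifier feeding a capped buffer. This is an illustration under stated assumptions, not the snippet's code: the substring matching, the class and method names, and the count key format are all hypothetical. The bucket names and the 500-record cap are from this page.

```ts
type UaBucket = 'gptbot' | 'claude-user' | 'perplexity' | 'unknown-bot' | 'human'

// Substring matching on well-known crawler tokens; the real classifier is likely richer.
function classifyUserAgentSketch(ua: string): UaBucket {
  const lower = ua.toLowerCase()
  if (lower.includes('gptbot')) return 'gptbot'
  if (lower.includes('claude-user')) return 'claude-user'
  if (lower.includes('perplexity')) return 'perplexity'
  if (/bot|crawler|spider/.test(lower)) return 'unknown-bot'
  return 'human'
}

interface TelemetryRecord {
  bucket: UaBucket
  route: string
}

class TelemetryRing {
  private records: TelemetryRecord[] = []

  constructor(private cap: number = 500) {}

  record(ua: string, url: string): void {
    const bucket = classifyUserAgentSketch(ua)
    if (bucket === 'human') return // humans are dropped before buffering
    const queryAt = url.indexOf('?')
    const route = queryAt === -1 ? url : url.slice(0, queryAt) // discard the query string
    this.records.push({ bucket, route })
    if (this.records.length > this.cap) this.records.shift() // evict the oldest record
  }

  // Drain the buffer into counts keyed by "bucket route".
  flush(): Map<string, number> {
    const counts = new Map<string, number>()
    for (const r of this.records) {
      const key = `${r.bucket} ${r.route}`
      counts.set(key, (counts.get(key) || 0) + 1)
    }
    this.records = []
    return counts
  }
}
```

Nothing in `record` returns a value to the request handler, which is the point: classification feeds the counter and has no path back into the response.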

The result is a useful per-site read on which AI engines are actually visiting, which pages they prefer, and how often they return. No competitor has this data — we're the only party in the request path. And we have it without ever tailoring a response.

* * *

## Privacy — what AgentSite's renderer can and cannot see

A common technical-buyer question: _your renderer is hitting my site — what's the exposure?_

The AgentSite renderer is a headless browser with no credentials, no session, and no cookies.

| Surface | Examples | Renderer access |
| --- | --- | --- |
| **Public surface:** what AgentSite reads | Marketing pages, pricing, blog posts, docs, pre-rendered SPA routes | Reads |
| **Private surface:** invisible to AgentSite | Authenticated dashboards, user profile data, customer records / DB, admin panel, API keys / secrets, cookies / sessions | Cannot reach |

Concretely:

-   AgentSite's renderer fetches over the **public, unauthenticated front door** — the same surface GPTBot, Googlebot, or a fresh incognito tab can reach.
-   The renderer does **not** hold credentials, API keys, OAuth tokens, or session cookies. It can't log in, can't impersonate a user, can't reach `/admin`, `/dashboard`, or `/api/private/*` if those routes require authentication.
-   If your routes are properly authenticated, AgentSite cannot see them. **The boundary is your auth check, not our promise.** The same boundary that protects you from any random crawler protects you from us.

In other words: AgentSite can see exactly what GPTBot can see. And nothing more.
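
The boundary can be pictured from your side of the wire. The guard below is a hypothetical sketch of a customer's own auth check, using the route prefixes named above as examples; it is not part of AgentSite.

```ts
// Hypothetical guard on the customer's side; prefixes are the examples from this page.
const PROTECTED_PREFIXES = ['/admin', '/dashboard', '/api/private/']

// Returns the HTTP status a caller gets, given whether it holds a valid session.
function statusFor(path: string, hasValidSession: boolean): number {
  const isProtected = PROTECTED_PREFIXES.some((prefix) => path.startsWith(prefix))
  return isProtected && !hasValidSession ? 401 : 200
}
```

A renderer that never holds a session always calls with `hasValidSession = false`, so `statusFor('/admin', false)` is 401 while `statusFor('/pricing', false)` is 200.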

* * *

## Side-by-side — homegrown vs. AgentSite

|  | Homegrown AEO | AgentSite |
| --- | --- | --- |
| **Initial setup time** | 3–5 days, technical engineer | Four lines, fifteen minutes |
| **Quarterly maintenance** | 1 day to track standards changes | Zero — snippet tracks the platform |
| **Headless browser to operate** | Yes, in your stack | No, runs on AgentSite Cloud |
| **Files to maintain** | 6+ in parallel (HTML, .md, llms.txt, JSON-LD, OG, agent-card) | Zero — all derived from your live site |
| **Schema-vs-content drift** | Inevitable, manual to detect | Structurally impossible — same render pass |
| **Per-bot tailoring / cloaking risk** | Tempting; flagged by Google | None — same bytes for every visitor |
| **Anti-pattern detection** | Manual, ad-hoc | Validated at publish, blocked if bad |
| **Per-page summaries** | Hand-written | Auto-generated from page content |
| **Agent-fetch telemetry** | None | First-party, every request (observation only) |
| **Lock-in to undo** | Custom code in your repo | Remove four lines, done |
| **Cost of doing it wrong** | Silent — your AEO score is zero | Caught at publish, surfaced in dashboard |

* * *

## The one-screen pitch

```
                       ┌─────────────────────────────────┐
                       │                                 │
  Any visitor ──────►  │   AgentSite install             │  ───► Enriched HTML
  (browser, GPTBot,    │   (Express / Sidecar / Nginx    │       (same bytes for everyone;
   ClaudeBot,          │    today; Edge + SSR SDK in     │        browser CSS-hides the
   Perplexity, curl,   │    pre-release — pick one)      │        preboot div, agents
   …all the same)      │   • Patches head + JSON-LD      │        read it directly)
                       │   • Splices preboot markdown    │
                       │   • Serves llms.txt, sitemap,   │  ───► /pricing.md, /llms.txt,
                       │     .well-known/* from cache    │       /.well-known/ai-agent.json,
                       │   • Logs who showed up          │       /sitemap.xml, /robots.txt, /ai.txt
                       │     (telemetry only — never     │
                       │     switches the response)      │
                       │                                 │
                       └─────────────────────────────────┘
```

One install. Every visitor served the same truthful version. Every artifact generated from one source of truth. Every standard tracked. No cloaking: same bytes for everyone, with the HTML standard working in our favor.

That's it. That's the product.

* * *

## Read next

-   **[The complete picture](/aeo)** — the long-form essay this page summarizes visually.
-   **[Agent readability](/agent-readability)** — the thesis: why the diagrams above describe a foundation, not a feature set.
-   **[The Five Layers of AEO](/five-layer-aeo)** — the same five-layer model in textual form.
-   **[Direct answer](/direct-answer)** — the 40-60 word paragraph that anchors every Layer-4 win.
-   **Score your site** — paste a URL, get the five-layer breakdown live.
-   **Dashboard** — once installed, see which AI engines are visiting, which pages they prefer, and how often they return.