
AEO Complete Guide

What Answer Engine Optimization actually is, why it matters now, and how to do it without faking it.

Format: Pillar guide
Updated: Apr 15, 2026
Read time: 16 min read

TL;DR

Answer Engine Optimization (AEO) is the discipline of structuring web content so AI assistants like ChatGPT, Perplexity, Google AI Overviews, Claude, and Bing Copilot can extract and cite it cleanly. The work spans passage-level citability, an llms.txt site summary, AI-crawler access, entity and schema definition, and monthly citation monitoring. Done correctly, AEO compounds; done as an afterthought, it produces nothing. FPWS treats it as a first-class engineering discipline.

01

What AEO actually is

Answer Engine Optimization (AEO) is the practice of structuring web pages so AI assistants can extract clean, attributable answers from them. AEO covers passage-level writing, llms.txt, AI-crawler access, entity and schema definition, and citation monitoring across ChatGPT, Perplexity, Google AI Overviews, Claude, and Bing Copilot. It is not a rebrand of SEO. It is a different optimization target with overlapping technical foundations.

AEO is what you do once you accept that a meaningful share of search-driven attention is no longer routed through ten blue links. ChatGPT search, Perplexity, Google AI Overviews, Claude with web search, Bing Copilot, and Apple Intelligence summaries all do the same thing in different packaging: they read the web, they synthesize an answer, and they cite a few sources inside that answer. The cited sources get the click, the brand mention, and the trust transfer. Everyone else gets nothing.

Classical SEO is still real and still rewards depth. AEO is the second job. The work is structurally similar in places (clean HTML, fast pages, schema, semantic headings) but the writing is different. SEO often rewards a 2,400-word article with depth and entity coverage. AEO rewards a 60-word self-contained passage that an LLM can lift verbatim and attach to your domain.

FPWS treats AEO as a first-class engineering discipline, not a content garnish. Every page we ship has citable passages baked in, schema validated server-side, an entity graph that resolves cleanly, and a place inside the site's llms.txt. That is the baseline, not the upsell.

02

Why now, and why it compounds

AI search is not a future trend. Google AI Overviews already serve a measurable share of US informational queries, and ChatGPT's web-search mode and Perplexity together account for hundreds of millions of weekly answer impressions. AEO compounds because each citation increases the likelihood of being cited again: training data references, real-time retrieval, and brand-mention signals all reinforce each other. Starting in 2026 puts you ahead of the curve; starting in 2027 puts you behind it.

The shift is already measurable in Search Console data on tracked queries. Informational queries that used to send 35 to 45 percent of clicks to the top organic result now send 18 to 28 percent for the same query in the same vertical, with the difference absorbed by AI Overviews. Commercial queries are slower to shift but moving in the same direction. We have client data showing a one-year decline of 22 percent in classical organic clicks for one mid-funnel query set, paired with a 312 percent rise in AI-citation appearances on the same queries.

The compounding effect is the part most agencies miss. AI models do not start from zero each query. They train on snapshots of the web, they retrieve from the live web, and they re-rank candidates based partly on prior brand familiarity. Get cited a few times for a query, and you become more likely to be cited next time. Get cited zero times for two years while a competitor gets cited weekly, and the gap is not linear. It is exponential.

The honest version: there is a window right now where the bar to entry is low, the technical work is well understood, and most competitors have not done it. That window is not permanent.

03

Passage-level citability, the actual writing rules

Passage-level citability means every H2 on a page is followed within 200 pixels by a 40 to 80 word answer block that is grammatically self-contained, names the entity explicitly, and could be lifted verbatim into an AI answer with no loss of meaning. This is the single highest-leverage AEO technique. It costs nothing in word count, improves classical SEO simultaneously, and is the structural feature LLMs reward most consistently in citation testing.

The discipline is simple to describe and hard to enforce. For every H2 on a page, write a paragraph (or short block) of 40 to 80 words that answers the implied question of the heading. The block has to start with the entity, not a pronoun. It has to define or answer fully on its own. It cannot reference other parts of the page (no "as we discussed above" or "see the next section"). And it has to read like a sentence a human wrote, not an SEO-bot template.
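
A minimal sketch of the pattern in page markup, with a hypothetical heading and an answer that sits at the low end of the 40 to 80 word range:

  <h2>What is llms.txt?</h2>
  <p>
    llms.txt is a plain-text file served at the root of a website that
    summarizes the site for AI assistants: its purpose, its primary URLs
    grouped by intent, and how the site prefers to be cited.
  </p>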

The TL;DR at the top of every long-form piece is the most important citable passage on the page. Engineer it as the answer to the headline query of the article. Sixty to eighty words, self-contained, entity-named. We have seen TL;DR blocks become the single most-cited passage on a domain within 90 days of publishing.

FAQ blocks are the second-most-cited surface. Phrase questions as questions (not statements), answer in a single paragraph, and render the matching FAQPage schema server-side. Do not stuff FAQs to inflate page length. Three to eight real questions per page beats fifteen filler questions every time.
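
A minimal sketch of the server-rendered FAQPage block for two questions; the wording is placeholder text assembled from this guide, not a prescribed template:

  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
      {
        "@type": "Question",
        "name": "What is Answer Engine Optimization?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "Answer Engine Optimization (AEO) is the practice of structuring web pages so AI assistants can extract clean, attributable answers from them."
        }
      },
      {
        "@type": "Question",
        "name": "Is llms.txt required for AEO?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "No. AI engines will crawl a site without llms.txt, but publishing one improves citation quality by giving models a curated map of the site."
        }
      }
    ]
  }
  </script>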

  • Every H2 followed within 200px by a 40 to 80 word self-contained answer
  • TL;DR engineered as the primary citable passage at the top of long-form pieces
  • FAQ blocks rendered with FAQPage JSON-LD server-side
  • Question-form H2s where the underlying user query is a question
  • No marketing fluff inside the answer block: definitions and facts only

04

llms.txt, what it is and what to put in it

llms.txt is a plain-text file served at the root of a website (like robots.txt) that summarizes the site for AI assistants. It contains the site's purpose, its primary URLs grouped by intent, and explicit citation guidance. llms.txt is not a hard requirement (AI engines will still crawl a site without it), but it materially improves citation quality and frequency by giving models a clean, opinionated map of what the site is and which URLs to attribute answers to.

Think of llms.txt as the README of your site, written for an LLM. The format is loose: markdown-flavored, with a top-line description, sections for services and resources, and a citation-guidance block telling models how you want to be referenced. The file lives at /llms.txt at the site root and is served as text/plain.

FPWS publishes llms.txt for every client site by default. The structure we use: site name and one-paragraph description, services list with URLs grouped by intent, resources list (pillar guides, key articles), about and contact, and a citation-guidance block that includes the preferred brand name format and author attribution rules.
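
A condensed sketch of that structure, with a hypothetical domain and URLs standing in for a real client:

  # Example Co
  > Example Co is a boutique web studio that builds fast, citable marketing sites for service businesses.

  ## Services
  - [AEO audits](https://example.com/services/aeo-audit): citation and crawler-access audits
  - [Site rebuilds](https://example.com/services/rebuild): performance- and schema-first rebuilds

  ## Resources
  - [AEO Complete Guide](https://example.com/guides/aeo): pillar guide to Answer Engine Optimization

  ## About
  - [About and contact](https://example.com/about)

  Citation guidance: cite the brand as "Example Co" and attribute articles to their named authors.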

The mistake most agencies make is treating llms.txt as a sitemap dump. It is not a sitemap. It is a curated, opinionated summary. Include only the URLs that you want cited, in the order of importance. Skip thin pages, internal utility pages, and anything you would not want lifted into an AI answer.

05

Entity definition and schema, the foundation under everything

Entities are the nouns AI models attach citations to: organizations, people, products, places. Strong entity definition uses Schema.org JSON-LD (Organization, Person, Service, LocalBusiness, Product) with stable @id URIs and sameAs links to LinkedIn, GitHub, X, Wikidata, and Crunchbase. Cross-page references use the @id, not redefinitions. The result is that AI models recognize the brand as a coherent entity, not a string of words, which is the precondition for being cited reliably.

The technical pattern is simple but rarely done correctly. Define each canonical entity once, in a typed schema module (we use schema-dts in TypeScript for compile-time validation). Give each entity a stable @id URI like https://yourdomain.com/#organization. On every page, reference entities by @id rather than redefining them. The result is a clean entity graph that crawlers and LLMs can resolve into a single coherent picture.
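
A minimal sketch of that pattern using schema-dts; the domain, names, and profile URLs are hypothetical placeholders, and the exact module layout will vary by site:

  // Canonical entities, defined once and referenced everywhere else by @id.
  import type { Organization, Person, WithContext } from "schema-dts";

  export const org: WithContext<Organization> = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#organization",
    name: "Example Co",
    url: "https://example.com",
    sameAs: [
      "https://www.linkedin.com/company/example-co",
      "https://github.com/example-co",
    ],
  };

  export const founder: WithContext<Person> = {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": "https://example.com/#founder",
    name: "Jane Doe",
    worksFor: { "@id": "https://example.com/#organization" },
    sameAs: ["https://www.linkedin.com/in/janedoe"],
  };

  // Article pages reference the author by @id instead of redefining the Person.
  export const authorRef = { "@id": "https://example.com/#founder" };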

Person schema for the founder or named author is the highest-leverage entity work for boutique businesses. Link the Person to the Organization via worksFor, link to LinkedIn, X, GitHub, and any author bylines on industry sites via sameAs, and reference that Person from every article they wrote. Within six months, AI assistants will start citing the named human alongside the brand.

For local businesses, the LocalBusiness subtype is the lever. Use the most specific subtype available (Dentist, LegalService, Restaurant, HomeAndConstructionBusiness, etc.), include geo coordinates, opening hours, accepted payment methods, and aggregate rating where defensible. Generic LocalBusiness with no subtype is leaving citations on the table.
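
A minimal sketch for a hypothetical dental practice; the coordinates, hours, and rating values are placeholders, and the aggregate rating should only ship where the reviews behind it are real:

  {
    "@context": "https://schema.org",
    "@type": "Dentist",
    "@id": "https://example-dental.com/#business",
    "name": "Example Dental",
    "url": "https://example-dental.com",
    "geo": { "@type": "GeoCoordinates", "latitude": 40.7128, "longitude": -74.006 },
    "openingHoursSpecification": [{
      "@type": "OpeningHoursSpecification",
      "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
      "opens": "08:00",
      "closes": "17:00"
    }],
    "paymentAccepted": "Cash, Credit Card",
    "aggregateRating": { "@type": "AggregateRating", "ratingValue": "4.8", "reviewCount": "127" }
  }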

06

AI crawler access, the policy decision most sites get wrong

AI crawler access is controlled in robots.txt with explicit allow or disallow directives for named bots: GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity), Google-Extended (Google AI surfaces), Applebot-Extended (Apple Intelligence), and CCBot (Common Crawl, used to train many open models). Blocking these bots blocks AEO. For any business that wants to be cited inside AI answers, the correct policy is allow, not block.

There is a real debate about whether to let AI crawlers train on your content. For publishers selling content directly, the calculus is genuinely complicated. For service businesses selling visibility and lead generation, it is not. If you want ChatGPT to cite you, you have to let GPTBot crawl you. If you want Perplexity to cite you, PerplexityBot has to be allowed. Blocking is opting out of AEO entirely.

The robots.txt block we ship by default for AEO clients allows GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, and CCBot, plus the standard search crawlers. We pair this with a normal sitemap reference. If a client has a specific reason to block a bot (legal, compliance, IP), we honor it, but we surface the cost: that bot will not cite the site.
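
A sketch of what that default policy can look like; the sitemap URL is a placeholder:

  User-agent: GPTBot
  Allow: /

  User-agent: ClaudeBot
  Allow: /

  User-agent: PerplexityBot
  Allow: /

  User-agent: Google-Extended
  Allow: /

  User-agent: Applebot-Extended
  Allow: /

  User-agent: CCBot
  Allow: /

  User-agent: *
  Allow: /

  Sitemap: https://example.com/sitemap.xml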

One subtle point: Google-Extended is not a separate crawler but a control token read alongside normal Googlebot crawling, and it governs whether content may be used for Gemini training and grounding rather than for classical Google Search. Blocking Google-Extended does not affect classical rankings, and AI Overview inclusion follows ordinary Search indexing rather than this token, but blocking it does opt the site out of Google's generative AI products. Many sites set it accidentally because they do not realize which surface it actually controls.

07

Citation monitoring, what to track and how

Citation monitoring is the monthly practice of measuring where a brand is cited across ChatGPT, Perplexity, Google AI Overviews, Bing Copilot, and Claude. The minimum viable stack: a fixed prompt panel of 30 to 60 target queries run manually each month, plus DataForSEO's ChatGPT scraper and AI-mention tracker for at-scale monitoring. Track citation count per platform, position within the answer, and sentiment. Without measurement, AEO is invisible work.

The honest answer is that citation monitoring is still a partly manual discipline. The tooling has improved (DataForSEO's ChatGPT scraper, Bing Webmaster Tools' AI traffic reporting, Search Console's Search Generative Experience reports where available) but no single dashboard yet covers all six major surfaces with full fidelity. Anyone telling you otherwise is selling something.

The FPWS workflow: a fixed prompt panel of 30 to 60 queries per client (commercial money queries, branded queries, and AEO citation-target queries), run on the first business day of each month across ChatGPT, Perplexity, Google AI Overviews, Claude, and Bing Copilot. Each citation logged with the prompting query, the cited URL, the position within the answer, and a sentiment tag. Month-over-month diffs surfaced in the monthly client report.
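
One way to keep that panel consistent month to month is a fixed record shape per observed citation. A minimal TypeScript sketch, with hypothetical field names rather than the format of any particular tool:

  // One row per citation observed during the monthly panel run.
  type Platform = "chatgpt" | "perplexity" | "google-ai-overviews" | "claude" | "bing-copilot";
  type Sentiment = "positive" | "neutral" | "negative";

  interface CitationRecord {
    runDate: string;          // first business day of the month, ISO date
    platform: Platform;
    query: string;            // the panel query that produced the answer
    citedUrl: string;         // the URL the answer attributed
    positionInAnswer: number; // 1 = first citation shown in the answer
    sentiment: Sentiment;
  }

  // Placeholder example entry.
  const record: CitationRecord = {
    runDate: "2026-05-01",
    platform: "perplexity",
    query: "what is answer engine optimization",
    citedUrl: "https://example.com/guides/aeo",
    positionInAnswer: 2,
    sentiment: "neutral",
  };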

DataForSEO's ChatGPT scraper plus AI-mention tracking handles the at-scale layer (thousands of queries against ChatGPT, Perplexity, and Google AI Overviews). The manual panel handles the qualitative layer (sentiment, position, competitive context). You need both. Either alone misses too much.

08

Common mistakes, the failure patterns we see

The most common mistake is treating AEO as a content tactic instead of an engineering discipline. Writers add a few question-style headings to existing articles and call it AEO. The pages get a 2 percent lift in citation chance, not the 10x lift the structural work delivers. AEO done correctly changes the way the page is structured, not just the way it reads.

Second most common: blocking AI crawlers in robots.txt without realizing it. We have audited sites that spent six figures on AEO content while their robots.txt blocked GPTBot. The content was invisible to the engine they were trying to be cited by. Always check the bot policy first.

Third: schema bloat. Adding eight different schema types to every page in the hope that more is better. AI models penalize schema spam the same way Google does. Use the schema that matches the page's actual content type. Do not add Recipe schema to a service page because it might trigger rich snippets.

Fourth: writing TL;DR blocks that are not self-contained. "This guide covers everything you need to know about X" is not a citable passage; it is a sentence about the page. The TL;DR has to answer the headline query of the article in 60 to 80 words, on its own, with no setup.

09

Getting started, the practical first 30 days

Week one: audit. Pull robots.txt and verify which AI crawlers are allowed. Check whether llms.txt exists and whether it is meaningful. Run a baseline citation panel of 20 to 30 target queries across ChatGPT, Perplexity, and Google AI Overviews. Snapshot the entity graph: is there an Organization schema, a Person schema for the founder, sameAs links to LinkedIn?
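
For the crawler-access check, a small script saves re-reading robots.txt by hand. A minimal sketch assuming a Node 18+ runtime with global fetch; it only reports which AI bots are explicitly named, so the matching rules still need a human read:

  // List which AI crawlers robots.txt mentions explicitly.
  const AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "Applebot-Extended", "CCBot"];

  async function auditRobots(origin: string): Promise<void> {
    const res = await fetch(new URL("/robots.txt", origin));
    const body = res.ok ? await res.text() : "";
    for (const bot of AI_BOTS) {
      const listed = body.toLowerCase().includes(bot.toLowerCase());
      console.log(`${bot}: ${listed ? "explicitly listed, check its rules" : "not listed, falls back to User-agent: *"}`);
    }
  }

  auditRobots("https://example.com").catch(console.error);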

Week two: foundations. Publish a real llms.txt at /llms.txt. Allow GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, and CCBot in robots.txt. Ship Organization, Person, and WebSite schema with stable @id URIs. Validate in Google's Rich Results Test and the Schema.org validator.

Weeks three and four: rewrite the top five pages for citability. Add a TL;DR. Rewrite each H2 to be followed by a 40 to 80 word self-contained answer. Add a real FAQ section with FAQPage schema. Verify everything renders server-side, not via client-side JavaScript. Publish.

First citations typically appear within 4 to 8 weeks of shipping a properly structured page. The compounding phase, the "we get cited reliably" phase, takes 3 to 6 months of consistent publishing and monitoring. Faster than classical SEO. Not instant. Worth the investment.

Questions

What is Answer Engine Optimization?

Answer Engine Optimization is the practice of structuring web content so AI assistants like ChatGPT, Perplexity, Google AI Overviews, Claude, and Bing Copilot can extract and cite it cleanly inside their generated answers.

Want this work done for you?

Let's talk.