Key takeaways
- AI search engines retrieve far more pages than they cite. Across 548,534 pages analyzed by AirOps, 85% of retrieved pages were never cited in the final answer. Getting indexed is not the same as getting cited.
- Citation selection follows a five-stage pipeline: retrieval trigger, retrieval rank, heading match, content focus, and fan-out coverage. Most B2B SaaS content fails at a specific, identifiable stage.
- The “ultimate guide” strategy that dominated SEO for a decade is now a citation liability. AirOps found that pages covering 26% to 50% of fan-out sub-queries outperform pages covering 100%, and pages over 5,000 words underperform shorter, more focused pages.
- For B2B SaaS companies watching competitors appear in ChatGPT and Perplexity responses, the fix is not “rewrite everything.” The fix is diagnosing where the page fails between retrieval and citation, then repairing that specific failure mode.
Most B2B SaaS content gets retrieved by AI search engines but never cited because it fails at one of five stages in the retrieval-to-citation pipeline.
Roughly two-thirds of prompts never trigger web search at all, so the first question is whether retrieval is even happening.
For prompts that do trigger search, retrieval rank is the strongest predictor of citation. AirOps found that pages at retrieval position 1 earn a 58.4% citation rate, compared with 14.2% at position 10. Heading-to-query match also matters: pages with high heading similarity are cited 41% of the time versus 30.2% for weak matches. Pages covering 26% to 50% of fan-out sub-queries outperform pages covering 100%. Domain authority shows no positive correlation with AI citation.
Is this even a citation problem?
Before you start auditing headings or rewriting pages, ask a more basic question: is ChatGPT even searching the web for your topic?
According to Semrush’s analysis of 17 months of clickstream data, ChatGPT enabled web search on only 34.5% of queries as of February 2026. The other 65.5% were answered without live web retrieval. No web search means no retrieval. No retrieval means no citation opportunity.
This matters for B2B SaaS content teams because a large share of the content most companies produce covers established topics already baked into the model’s training data. “What is account-based marketing?” “How does a CDP work?” “What is revenue attribution?” If ChatGPT can answer the question from memory, your page is not in the running.
The queries that do trigger web search tend to be time-sensitive, comparative, or product-specific. For example: “Best CDP for mid-market SaaS in 2026,” “How does Segment compare to mParticle for event tracking?” or “Pricing for HubSpot Operations Hub.” These queries force the model to search because the answer changes, involves multiple products, or requires current data the model does not have.
If your content library is built around evergreen definitional topics, citation optimization is the wrong diagnostic. The first step is to sort your target queries into two buckets: queries where the model is likely to search, and queries where it probably will not. Everything that follows applies only to the first bucket.
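A minimal sketch of that two-bucket sort, assuming a simple keyword heuristic. The cue patterns below are illustrative guesses, not a rule published by Semrush, AirOps, or OpenAI; the only reliable check is still to run each query in ChatGPT and watch whether search activates.

```python
import re

# Hypothetical cues for time-sensitive, comparative, or product-specific queries.
# Tune these against what you actually observe in ChatGPT.
SEARCH_CUES = [
    r"\b20\d{2}\b",       # an explicit year suggests time sensitivity
    r"\bbest\b",          # "best X for Y" implies a comparison
    r"\bvs\.?\b",
    r"\bcompared?\b",
    r"\bpricing\b",
    r"\balternatives?\b",
]


def likely_triggers_search(query: str) -> bool:
    """Return True if the query contains any of the assumed search cues."""
    q = query.lower()
    return any(re.search(pattern, q) for pattern in SEARCH_CUES)


def bucket_queries(queries):
    """Split queries into 'likely_search' and 'likely_memory' buckets."""
    buckets = {"likely_search": [], "likely_memory": []}
    for q in queries:
        key = "likely_search" if likely_triggers_search(q) else "likely_memory"
        buckets[key].append(q)
    return buckets


if __name__ == "__main__":
    sample = [
        "What is account-based marketing?",
        "Best CDP for mid-market SaaS in 2026",
        "How does Segment compare to mParticle for event tracking?",
        "Pricing for HubSpot Operations Hub",
    ]
    for bucket, items in bucket_queries(sample).items():
        print(bucket, items)
```

Treat the output as a first pass for prioritizing manual checks, not as a verdict on any single query.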
Are your pages making it into the retrieval pool?
For queries that trigger web search, retrieval rank is the single strongest predictor of whether your page gets cited. According to AirOps’ Fan-Out Effect research, pages at retrieval position 1 earn a 58.4% citation rate. By position 10, that drops to 14.2%.
The good news: retrieval rank maps closely to Google search rank. AirOps’ earlier retrieval work, summarized by Kevin Indig in Growth Memo, found that pages with 50% or greater title-to-query overlap had a 20.1% citation rate versus 9.3% for pages with less than 10% overlap. In other words, rank matters, but alignment determines whether rank turns into citation.
This diagnostic step is simple. Pull up your target queries. Check where you rank in Google for each one. If you are outside the top 10, fix that before touching anything else. Heading optimization, content restructuring, and fan-out coverage will not matter much if the page is not making it into the retrieval pool.
Do your headings answer the actual query?
This is the most accessible fix in the entire diagnostic, and most B2B SaaS content teams have never audited for it.
Heading-to-query semantic match is one of the strongest on-page levers for AI citation. AirOps found that pages with high heading similarity, defined as 0.90 or above, are cited 41% of the time versus 30.2% for pages with weak matches below 0.50. Even when controlling for rank, higher heading match adds 19 percentage points to citation rate.
The problem I see in most B2B SaaS content libraries is that headings describe the topic rather than answer the query. A page titled “The Complete Guide to Revenue Attribution” with H2s like “Understanding the Attribution Landscape” and “Key Considerations for Your Attribution Strategy” is describing what the page covers. It is not answering the question a buyer typed into ChatGPT: “How do I set up multi-touch attribution for a B2B SaaS sales cycle?”
Here is a practical heading audit you can run this week. List your 20 highest-priority queries, meaning the queries you want AI citation for. Pull up each target page. Compare the H1 and every H2 against the exact query. Ask: does this heading answer the query, or does it talk around it?
A heading like “Setting up multi-touch attribution for B2B SaaS” is a direct match. “Understanding Attribution Models” is not. The gap between those two headings is the gap between a page that looks citable and a page that only looks topically related.
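The audit above can be roughed out in code. This sketch assumes plain token overlap (Jaccard similarity) as a stand-in for semantic similarity; it is not the metric AirOps used, so the 0.90 and 0.50 thresholds from that research do not carry over to these scores. The stopword list is also an assumption. Use the output only to rank headings against each other.

```python
import re

# Small illustrative stopword list (an assumption; extend as needed).
STOPWORDS = {"a", "an", "the", "for", "to", "of", "in", "do", "i",
             "how", "and", "your", "up"}


def tokens(text):
    """Lowercase the text, split on non-alphanumerics, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def heading_match(query, heading):
    """Jaccard overlap between query tokens and heading tokens (0.0 to 1.0)."""
    q, h = tokens(query), tokens(heading)
    if not q or not h:
        return 0.0
    return len(q & h) / len(q | h)


def audit_headings(query, headings):
    """Return (score, heading) pairs, best match first."""
    scored = [(heading_match(query, h), h) for h in headings]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    query = "How do I set up multi-touch attribution for a B2B SaaS sales cycle?"
    page_headings = [
        "Understanding Attribution Models",
        "Setting up multi-touch attribution for B2B SaaS",
    ]
    for score, heading in audit_headings(query, page_headings):
        print(f"{score:.2f}  {heading}")
```

Token overlap misses synonyms and word forms ("set" versus "setting"), so a low score is a prompt to read the heading, not proof that it fails.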
Is your “ultimate guide” actually hurting you?
This finding should make every SaaS content team nervous.
According to AirOps’ Fan-Out Effect research, pages covering 26% to 50% of fan-out sub-queries earn a 38.2% citation rate. Pages covering 100% of those sub-queries earn 34.0%. The page that tries to cover everything gets cited less often than the page that covers part of the topic well.
The same research found a related pattern at the subheading level. Among pages with strong primary heading match, pages whose subheadings spread across 3 to 4 different fan-out sub-queries are cited less often than pages whose subheadings match only 0 to 1 fan-out queries. The sample for the 3 to 4 bucket is smaller, so treat that finding as directional rather than definitive. But the pattern is consistent with the broader signal: when one page tries to answer too many adjacent queries, the primary match gets diluted.
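To see which coverage band a page falls into, a rough calculation like the following can help. It assumes you have already listed the fan-out sub-queries for a prompt and marked by hand which ones the page addresses; the function names are hypothetical, and only the two bands quoted above (26% to 50%, and 100%) come from the cited research.

```python
def fanout_coverage(subqueries, covered):
    """Fraction of fan-out sub-queries that the page addresses."""
    if not subqueries:
        return 0.0
    hits = sum(1 for sq in subqueries if sq in covered)
    return hits / len(subqueries)


def coverage_band(coverage):
    """Label coverage using only the two bands quoted in the article."""
    if 0.26 <= coverage <= 0.50:
        return "26-50% band (38.2% citation rate reported)"
    if coverage == 1.0:
        return "100% band (34.0% citation rate reported)"
    return "outside the bands quoted here"


if __name__ == "__main__":
    subqueries = [
        "pricing comparisons",
        "feature sets",
        "integration capabilities",
        "implementation timelines",
        "use-case fit",
    ]
    covered = {"pricing comparisons", "feature sets"}
    share = fanout_coverage(subqueries, covered)
    print(f"{share:.0%} -> {coverage_band(share)}")
```

The reported rates describe averages across many pages, so the band is a signal for scoping decisions rather than a prediction for one page.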
On word count, focus beats length. Pages between 500 and 2,000 words perform well, while pages over 5,000 words underperform shorter pages. The old “ultimate guide” playbook made sense for traditional SEO. Longer pages accumulated more keyword matches, earned more backlinks, and generated more internal linking opportunities. Google often rewarded length. AI search engines do not reward length in the same way.
When ChatGPT retrieves your 8,000-word guide on revenue attribution, it has to scan the entire page to find the 200 words that answer the specific query. A focused 1,200-word page that answers that specific question with a matching heading wins because the signal is concentrated.
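A quick way to flag candidates for unbundling is to bucket pages by word count. This sketch assumes whitespace-separated words are a good enough count and uses only the two length ranges mentioned above; the labels are my own, and nothing is claimed about lengths between 2,000 and 5,000 words or below 500.

```python
def word_count(text):
    """Count whitespace-separated words in the page body."""
    return len(text.split())


def length_band(n_words):
    """Label a page using only the length ranges quoted in the article."""
    if 500 <= n_words <= 2000:
        return "focused range (500-2,000 words)"
    if n_words > 5000:
        return "over 5,000 words: unbundling candidate"
    return "outside the ranges quoted here"


if __name__ == "__main__":
    # Hypothetical page lengths for illustration.
    pages = {"attribution-guide": 8000, "multi-touch-setup": 1200}
    for slug, n in pages.items():
        print(slug, "->", length_band(n))
```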
Kevin Indig’s research in Growth Memo adds the strategic context: citation dominance often happens at the domain level through breadth across focused pages, not at the page level through one massive guide. For B2B SaaS, that is good news. You do not need one giant guide to win the category. You need a network of focused pages, each built to answer one high-intent query.
That 6,000-word “Complete Guide to Revenue Attribution” should become five focused pages: one on multi-touch attribution setup, one on attribution for long sales cycles, one on comparing attribution models, one on attribution reporting for boards, and one on attribution data quality. Each page owns one query. Together, they own the category.
Are you visible on the fan-out sub-queries?
When someone asks ChatGPT, “What is the best CDP for mid-market SaaS?”, ChatGPT does not necessarily run one search. It breaks the question into component sub-queries: pricing comparisons, feature sets, integration capabilities, implementation timelines, and use-case fit.
AirOps found that 89.6% of prompts generated two or more fan-out queries, and 32.9% of cited pages were found only through these fan-out queries, not through the original prompt.
Here’s the kicker: 95% of fan-out queries had zero traditional search volume. If your content strategy is built only on keyword research, which for most B2B SaaS companies it is, you are invisible on a third of citation opportunities because those queries do not show up in the tools your team uses to plan content.
Finding the fan-out queries that matter is still partly manual. Start with three methods:
- Run your target queries through ChatGPT and watch the search activity. The sub-queries it generates are often visible in real time.
- Review Google’s “People Also Ask” expansions for your target queries. They are not identical to fan-out queries, but they are a useful starting point.
- Ask your sales team. The component-level follow-up questions prospects ask on calls often map closely to what AI engines fan out into: pricing by tier, integrations with specific tools, implementation timelines for specific company sizes, and migration risks.
Each fan-out sub-query is a candidate for a focused page, not a subsection buried inside a long guide. The domain that covers “CDP pricing for mid-market SaaS” and “CDP implementation timeline for 50-person teams” as separate focused pages will capture fan-out citations that the domain with one “Ultimate CDP Buyer’s Guide” will miss.
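Once you have logged sub-queries by hand, a small gap check shows which ones still lack a dedicated page. This is a sketch under stated assumptions: `fanout_log` and `page_targets` are hypothetical structures you would maintain yourself, and matching is exact after normalizing case and whitespace, so near-duplicates in wording will show up as gaps.

```python
def uncovered_subqueries(fanout_log, page_targets):
    """Return sub-queries that have no dedicated page.

    fanout_log:   dict mapping a prompt to the sub-queries observed for it
    page_targets: iterable of queries that already have a focused page
    """
    normalized = {t.strip().lower() for t in page_targets}
    gaps = {}
    for prompt, subqueries in fanout_log.items():
        missing = [s for s in subqueries if s.strip().lower() not in normalized]
        if missing:
            gaps[prompt] = missing
    return gaps


if __name__ == "__main__":
    fanout_log = {
        "What is the best CDP for mid-market SaaS?": [
            "CDP pricing for mid-market SaaS",
            "CDP implementation timeline for 50-person teams",
        ],
    }
    page_targets = ["cdp pricing for mid-market saas"]
    for prompt, missing in uncovered_subqueries(fanout_log, page_targets).items():
        print(prompt)
        for sub in missing:
            print("  missing page for:", sub)
```

Each entry in the result is a candidate brief for a new focused page.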
What to do when multiple stages fail
If you ran through the diagnostic above and only one stage came back red, the fix is straightforward. Rewrite headings. Unbundle a long guide. Create focused content for fan-out queries. Page-level fixes work for page-level problems.
If two or more stages failed across your content library, the problem is different. A team that produces 50 pages with mismatched headings did not make 50 independent mistakes. They followed a briefing process that does not account for query-heading alignment. A content library full of 5,000-word guides did not happen by accident. It followed an editorial strategy built for a different era of search.
Same team, same briefs, same templates, same review criteria. You get the same failure mode repeated across every page.
This is an editorial function gap, not a budget gap. The content team is producing exactly what it has been asked to produce. The briefs, templates, review criteria, and publishing cadence were designed for Google’s ranking signals, not AI’s citation behavior. Fixing one page at a time is possible but slow. The higher-return move is to change the briefing process, editorial criteria, and content model so every new page ships citation-ready by default.
For B2B SaaS and technology companies, where the content function is often one or two people running without senior editorial oversight, this is a common pattern. The content gets produced. The quality is often good. But structural decisions like heading construction, page scope, and topic selection reflect habits formed before AI search existed. A fractional head of content can close that gap: someone senior enough to redesign the editorial system, not just execute within the existing one.
The diagnostic is not difficult. Changing the system that creates the content is the harder, more valuable work.
Get your company cited by AI search
I help B2B SaaS and technology companies build AI search-ready content programs. If you want me to build yours, book a free consultation.
Commonly asked questions about AI search retrieval and citations
Why does my long-form guide rank on Google but not get cited by ChatGPT?
AirOps research found that pages covering 26% to 50% of fan-out sub-queries outperform pages covering 100%, and pages over 5,000 words underperform shorter, more focused pages. Google can still reward comprehensive content for ranking. ChatGPT rewards focused content for citation. A long guide can rank on Google and still lose individual citation battles to focused pages that answer one specific query well.
Does domain authority matter for AI citation?
No. AirOps found that domain authority shows no positive correlation with citation rate. ChatGPT evaluates whether a page answers the specific query and supporting sub-queries. Domain-level authority may help a page get discovered, but it does not guarantee citation.
How do I check if ChatGPT is searching for my topics?
Run your target queries directly in ChatGPT and watch whether search activates. If it answers immediately without searching, your topic is likely covered by training data and citation optimization will not help. Focus on queries that trigger web search: comparative queries, product-specific questions, and time-sensitive topics where the model needs current information.
Should I delete my long-form guides?
No. Keep the existing pages if they still perform in traditional search, but unbundle their subtopics into focused pages. A 6,000-word attribution guide can become five focused pages, each targeting one specific query with a matching heading. The original guide can link to those focused pages, and together they create more citation surface area than the single guide could.
How quickly do structural changes affect AI citation?
It depends on the engine. Retrieval-based engines like Perplexity can reflect changes faster because they pull fresh search results on each query. ChatGPT is less predictable because it combines web search with training data and does not trigger web search for every prompt. Expect weeks to months, not days.
Can I track which fan-out queries matter for my content?
Yes, though manual tracking is still the starting point for many teams. Run your priority queries through ChatGPT or Claude and note the sub-queries they generate. AEO monitoring tools like AirOps can automate this by tracking the fan-out queries associated with your target prompts over time. The key insight is that most fan-out queries have zero monthly search volume, so traditional keyword tools will miss them.



