Keyword Research: query intent, clustering, gap analysis
Query Discovery, Intent Classification, SERP Analysis, Keyword Mapping, and Cannibalization Resolution
A comprehensive installation and audit reference for keyword research and search intent analysis — the discipline of identifying what users search for, what they want when they search, and how to map those queries to specific pages on a site. This is the planning layer that drives content strategy, internal linking, and topical clustering. Dual-purpose: installation manual and audit document.
1. Document Purpose
This is the canonical reference for keyword research and search intent. Every other framework in this library assumes the site is targeting the right queries with the right pages. This document specifies how to figure out what those queries should be, what users want when they search them, and how to organize site content around them without cannibalization.
In 2026, traditional keyword research has evolved. Search volume tools have always been approximations, and they are less reliable now that AI assistants siphon queries away from traditional search engines. Search intent analysis has become more important as Google's understanding of intent matures: ranking for a query your content does not satisfy is harder than ever, while ranking for queries that genuinely match your content has become more straightforward. Topic-based content (covering the full intent landscape around a topic) tends to outperform keyword-targeted content (single-keyword optimization).
1.1 Required Tools
- Google Keyword Planner: ads.google.com/keywordplanner, free with a Google Ads account
- Ahrefs Keywords Explorer: paid, comprehensive
- Semrush Keyword Magic Tool: paid, comprehensive
- Moz Keyword Explorer: paid, alternative
- AnswerThePublic: question-query discovery
- AlsoAsked: People Also Ask discovery
- Google Trends: trends.google.com, search interest over time
- Google Search Console: actual queries driving traffic to the existing site
- Keywords Everywhere: browser extension for inline volume data
- SerpAPI / DataForSEO: programmatic SERP data
- SE Ranking: comprehensive SEO platform
- SurferSEO Topical Maps: semantic clustering with AIO probability scoring
- Keyword Insights: bulk SERP-overlap clustering for 10K+ keyword inputs
- KeyClusters: low-cost SERP-overlap clustering for small sites
- NeuronWriter Topic Maps: clustering integrated with content optimization scoring
- Frase Topic Clustering: SERP-driven topic outline generation for content briefs
- BrightEdge AI Catalyst: enterprise AIO presence and citation tracking
- Wayback Machine: historical SERP snapshots for intent drift analysis
2. Client Variables Intake
business_type: ""
primary_audience: ""
geographic_market: "" # local, regional, national, international
service_or_product_categories: []
competitor_domains: []
existing_keyword_targets: [] # If any documented keyword strategy exists
existing_content_inventory_url: "" # Sitemap or content list
average_monthly_search_traffic: 0 # From GSC if available
top_landing_pages: [] # Already-ranking pages
known_cannibalization_issues: []
3. Search Intent Classification
The most important keyword research framework: understanding what users want when they search.
3.1 Four Primary Intent Types
Informational: User wants to learn something
- Examples: "what is schema markup", "how does PageRank work", "history of search engines"
- Content type: Articles, guides, explainers
- Conversion path: Long-term — build trust, capture as subscribers, retarget
Navigational: User is looking for a specific website or page
- Examples: "facebook login", "thatdeveloperguy contact", "anthropic claude"
- Content type: Brand pages, login pages, specific destinations
- Conversion path: Direct — they're trying to reach you
Commercial Investigation: User is researching before purchase
- Examples: "best web hosting for small business", "ahrefs vs semrush", "[product] reviews"
- Content type: Comparison articles, reviews, buying guides
- Conversion path: Medium-term — they're getting ready to decide
Transactional: User is ready to act
- Examples: "buy [product]", "hire web developer", "[service] pricing"
- Content type: Product pages, service pages, pricing pages
- Conversion path: Direct — convert now
3.2 Modern Intent Subtypes
Beyond the four primary types:
Local intent: User wants location-specific results (covered in framework-localseo.md)
- Triggered by "near me" or implicit local needs
- Examples: "plumber", "coffee shop"
Visual intent: User wants images or visual results
- Examples: "how does [thing] look", "[topic] examples", "[item] photos"
- Triggers Image Pack in SERP
Video intent: User wants video content
- Examples: "how to [task]" (often), "[topic] tutorial"
- Triggers Video carousel in SERP
News intent: User wants recent news
- Examples: "[entity] news", "[topic] update", time-sensitive queries
- Triggers Top Stories carousel
Question intent: User asking a specific question
- Examples: queries starting with who/what/when/where/why/how
- Often triggers People Also Ask, AI Overviews
3.3 Intent Verification via SERP Analysis
The fastest way to verify intent: search the query and observe what Google ranks.
SERP signals:
- Featured snippet present? → informational/question intent
- AI Overview present? → informational synthesis
- Local Pack present? → local intent
- Shopping carousel present? → transactional/product intent
- Video carousel present? → video intent
- Image Pack present? → visual intent
- News carousel present? → news intent
- Mostly forum results? → niche/conversational intent
- Mostly listicle articles? → comparison/research intent
- Mostly product/service pages? → transactional intent
Reading the SERP:
If you search "best web hosting" and the top 10 results are 8 listicle articles + 2 forum threads, the intent is research/comparison. Don't try to rank a sales page for this query — Google ranks listicles because that's what users want.
If you search "buy hosting" and the top 10 are mostly product pages, the intent is transactional. Don't try to rank a comparison article.
Match content type to ranking content type.
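The SERP reading described above can be sketched as a small lookup plus a majority count. This is a minimal illustration, not a classifier: the feature labels (`featured_snippet`, `mostly_listicles`, and so on) and both function names are hypothetical, and the page types are assumed to have been labeled by hand while inspecting the results.

```python
from collections import Counter

# Intent suggested by each SERP observation, taken from the "SERP signals"
# list. The dictionary keys are hypothetical labels chosen for this sketch.
SERP_SIGNALS = {
    "featured_snippet": "informational/question",
    "ai_overview": "informational synthesis",
    "local_pack": "local",
    "shopping_carousel": "transactional/product",
    "video_carousel": "video",
    "image_pack": "visual",
    "news_carousel": "news",
    "mostly_forums": "niche/conversational",
    "mostly_listicles": "comparison/research",
    "mostly_product_pages": "transactional",
}


def intents_from_serp(observed_features):
    """Return the intents suggested by the observed features, without repeats."""
    intents = []
    for feature in observed_features:
        intent = SERP_SIGNALS.get(feature)  # unknown labels are ignored
        if intent is not None and intent not in intents:
            intents.append(intent)
    return intents


def dominant_page_type(page_types):
    """Return (page_type, count) for the most common type in the top results."""
    return Counter(page_types).most_common(1)[0]


# The "best web hosting" example: 8 listicles and 2 forum threads.
top_10 = ["listicle"] * 8 + ["forum"] * 2
print(dominant_page_type(top_10))
print(intents_from_serp(["featured_snippet", "mostly_listicles"]))
```

The dominant page type is the one to match when choosing a content format for the query.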
3.4 Mixed Intent
Some queries have multiple legitimate intents. Google often shows mixed results:
"Web hosting" might show:
- 3 review sites (commercial investigation)
- 2 product pages (transactional)
- 1 informational guide (informational)
- 1 local result (if local intent suspected)
For mixed intent queries, decide which intent your content addresses and accept you'll only capture that share.
4. Keyword Discovery Methodology
4.1 Seed Keyword Generation
Start with seed keywords — the obvious terms for your business:
seed_keywords_for_thatdeveloperguy:
  service_seeds:
    - "web development"
    - "SEO services"
    - "AEO services"
    - "AI search optimization"
    - "computer repair"
    - "website hosting"
    - "WordPress development"
  audience_seeds:
    - "small business website"
    - "small business SEO"
  geographic_seeds:
    - "Cassville web developer"
    - "Missouri web design"
    - "SDVOSB web development"
  problem_seeds:
    - "website not ranking"
    - "computer slow"
    - "WordPress site hacked"
4.2 Keyword Expansion
For each seed, expand using:
Tool-based expansion:
- Google Keyword Planner suggestions
- Ahrefs/Semrush "Phrase match" reports
- AnswerThePublic for questions
- AlsoAsked for People Also Ask trees
SERP-based expansion:
- "People also ask" boxes
- "Related searches" at bottom of SERP
- Autocomplete suggestions
- Wikipedia article structure for the topic
Competitor-based expansion:
- Ahrefs "Top pages" report for competitors
- Keywords competitors rank for that you don't (gap analysis)
Customer-based expansion:
- Sales call transcripts — what language do prospects use?
- Support tickets — what problems are described?
- Customer surveys
- Internal team brainstorming
4.3 Long-Tail Strategy
Long-tail keywords (3+ words, lower volume each) collectively drive significant traffic with less competition.
Pattern: A site might rank for 1,000 long-tail keywords driving 5 visits each = 5,000 monthly visits. That's often more achievable than ranking for one head term driving 5,000 visits.
Long-tail discovery:
- Question variants ("how do I [task]")
- Modifier variants ("[topic] for [audience]", "best [topic] under $X")
- Comparison variants ("[A] vs [B]")
- Geographic modifiers ("[topic] in [city]")
- Time-based modifiers ("[topic] 2026")
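The modifier patterns above lend themselves to simple template expansion. The sketch below covers the audience, comparison, geographic, and time-based patterns; the function name and parameters are hypothetical, and the output is only a candidate list that still needs checking against real volume and SERP data.

```python
def long_tail_variants(topic, audiences=(), rivals=(), cities=(), years=()):
    """Expand one topic into long-tail candidates using modifier patterns."""
    variants = []
    variants += [f"{topic} for {audience}" for audience in audiences]  # "[topic] for [audience]"
    variants += [f"{topic} vs {rival}" for rival in rivals]            # "[A] vs [B]"
    variants += [f"{topic} in {city}" for city in cities]              # "[topic] in [city]"
    variants += [f"{topic} {year}" for year in years]                  # "[topic] 2026"
    return variants


for candidate in long_tail_variants(
    "web hosting",
    audiences=["small business"],
    cities=["Cassville"],
    years=[2026],
):
    print(candidate)
```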
4.4 Branded Keyword Research
Track searches for your brand and variants:
- "[brand name]"
- "[brand name] reviews"
- "[brand name] pricing"
- "[brand name] alternatives"
- "[brand name] vs [competitor]"
- Common misspellings of brand
Branded queries convert at a high rate and are usually easy to rank for. Make sure your brand pages capture them all.
4.5 Question Keyword Research
Questions drive featured snippets, AI Overviews, and PAA boxes. Specifically research:
- Who-questions
- What-questions
- When-questions
- Where-questions
- Why-questions
- How-questions
For each topic in your content strategy, generate the question variants and ensure content addresses them.
5. Search Volume Realities
5.1 Search Volume Tools Are Approximations
Different tools report different numbers for the same keyword. This is normal — they sample differently and process data differently. Don't fixate on exact numbers.
Use volume directionally:
- Very high volume (10,000+/month): high competition, potentially valuable
- High volume (1,000-10,000/month): meaningful opportunity
- Medium volume (100-1,000/month): often good balance of value and achievability
- Low volume (10-100/month): long-tail, easier to capture, individually low traffic
- Very low volume (<10/month): often not worth dedicated content unless high conversion
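The bands above can be applied to a keyword export with a small helper. This is a sketch: the function name is hypothetical, and because the listed ranges share their endpoints (for example 1,000 appears in two bands), it assumes each lower bound is inclusive.

```python
def volume_band(monthly_searches):
    """Bucket a monthly search volume into the directional bands listed above.

    Assumption: lower bounds are inclusive, so 1,000 counts as "high".
    """
    if monthly_searches >= 10_000:
        return "very high"
    if monthly_searches >= 1_000:
        return "high"
    if monthly_searches >= 100:
        return "medium"
    if monthly_searches >= 10:
        return "low"
    return "very low"


for volume in (25_000, 1_000, 450, 40, 3):
    print(volume, volume_band(volume))
```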
5.2 Volume Decline in AI Era
Search volume reported by traditional tools may overstate actual organic search volume in 2026:
- Some queries are increasingly answered by AI Overviews (zero clicks)
- Some queries are increasingly going to ChatGPT, Claude, Perplexity instead of Google
- Click-through rates for some queries have dropped substantially even for the #1 result
Don't optimize purely for search volume. Combine with:
- Conversion potential of the query
- AI engine citation potential (see framework-aicitations.md)
- Brand-building value
- Topical authority contribution
5.3 Keyword Difficulty
Tools provide difficulty scores (typically 0-100) estimating how hard ranking will be.
Reality check: Difficulty scores are largely based on the backlink profiles of current rankers. They don't account for:
- Topical authority specific to the query
- Content quality differential
- E-E-A-T differential
- Brand authority
A "difficulty 60" keyword might be easy if the current rankers all have weak content and your topical authority is established. A "difficulty 30" might be hard if it requires specialized expertise the existing rankers all have.
Use difficulty as one signal, not the deciding factor.
6. Topic-Based Content Strategy
Modern SEO favors topic clusters over individual keyword targeting.
6.1 The Topic Cluster Approach
Instead of: "Write article targeting keyword X"
Do: "Build a comprehensive resource on topic X, covering all related queries"
A single 3,000-word comprehensive article on "schema markup" can rank for hundreds of related queries:
- "what is schema markup"
- "how to add schema"
- "schema markup examples"
- "json-ld schema"
- "schema markup for [content type]"
- Long-tail variations
This is more efficient than writing dozens of individual keyword-targeted articles.
6.2 Topic Identification
For your business, identify topics where you have or can develop authority:
For ThatDeveloperGuy:
- Web development for small business
- SEO for small business
- AEO and AI search optimization
- Computer repair
- WordPress (specifically, since you host many WordPress clients)
- Self-managed Linux hosting
- SDVOSB / federal contracting for service businesses
Each topic becomes a topical cluster (see framework-internallinking.md Section 6).
6.3 Topic Coverage Planning
For each topic:
- Pillar content — comprehensive overview (3,000-10,000 words)
- Supporting articles — deep dives on specific subtopics (1,500-3,000 words each)
- Question articles — addressing specific questions (500-1,500 words)
- Comparison articles — comparing options within topic
- Buying guides — for purchase-related topics
- Case studies — real-world examples
- Original research — proprietary data or analysis
Each addresses different search intents and captures different query patterns.
7. Keyword Mapping
Keyword mapping assigns specific queries to specific pages — preventing cannibalization and ensuring intent matching.
7.1 The Mapping Process
For each significant page on the site:
page_keyword_map:
  url: "https://example.com/services/web-development/"
  primary_keyword: "web development services"
  secondary_keywords:
    - "custom website development"
    - "small business website development"
  long_tail_keywords:
    - "web development services for small business"
    - "custom WordPress development"
    - "responsive website design"
  intent: "transactional"
  page_type: "service page"
For new content:
planned_content:
  topic: "Difference between SEO, AEO, and GEO"
  primary_keyword: "SEO vs AEO vs GEO"
  secondary_keywords:
    - "what is AEO"
    - "answer engine optimization explained"
    - "AEO vs traditional SEO"
  intent: "informational"
  content_type: "explainer article"
  word_count_target: 2500
  cluster: "AI search optimization"
  hub_page: "/topics/ai-search-optimization/"
7.2 Mapping Discipline
One primary keyword per page. The page is "about" this query primarily.
Multiple secondary keywords per page. The same page can rank for related queries.
No duplicate primary keywords across pages. This is cannibalization (Section 8).
Map systematically across the site. Don't have keywords floating without page assignments.
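The mapping rules above can be checked mechanically once the map exists as data. The sketch below assumes each entry is a dict using the `url`, `primary_keyword`, and `secondary_keywords` keys from Section 7.1; the two function names are hypothetical, and matching is a simple case-insensitive comparison rather than anything semantic.

```python
from collections import defaultdict


def duplicate_primary_keywords(page_map):
    """Return {primary_keyword: [urls]} for keywords claimed by more than one page."""
    pages_by_keyword = defaultdict(list)
    for page in page_map:
        keyword = page["primary_keyword"].strip().lower()
        pages_by_keyword[keyword].append(page["url"])
    return {kw: urls for kw, urls in pages_by_keyword.items() if len(urls) > 1}


def unmapped_keywords(target_keywords, page_map):
    """Return target keywords that no page lists as primary or secondary."""
    mapped = set()
    for page in page_map:
        mapped.add(page["primary_keyword"].strip().lower())
        for keyword in page.get("secondary_keywords", []):
            mapped.add(keyword.strip().lower())
    return [kw for kw in target_keywords if kw.strip().lower() not in mapped]


page_map = [
    {"url": "/services/web-development/", "primary_keyword": "web development services",
     "secondary_keywords": ["custom website development"]},
    {"url": "/blog/web-dev-guide/", "primary_keyword": "Web Development Services"},
]
print(duplicate_primary_keywords(page_map))
print(unmapped_keywords(["custom website development", "what is AEO"], page_map))
```

A non-empty result from the first function is a cannibalization candidate (Section 8); a non-empty result from the second is a keyword floating without a page assignment.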
7.3 Mapping for Existing Sites
For sites with existing content:
- Inventory all pages
- For each page, identify the primary keyword it's currently ranking for (from GSC)
- Document this in mapping spreadsheet
- Identify gaps (queries you should rank for but no page targets)
- Identify duplicates (multiple pages targeting same query → cannibalization)
8. Keyword Cannibalization
Cannibalization occurs when multiple pages on the same site target the same query — splitting ranking signals and confusing Google about which page should rank.
8.1 Detection
GSC method:
- Performance report
- Filter by specific query
- View "Pages" tab
- If multiple URLs appear with significant impressions, cannibalization exists
Site search method:
- site:example.com [query] shows all pages Google considers relevant
- Multiple pages = potential cannibalization
Tool method:
- Ahrefs "Cannibalization" report
- Semrush "Position Tracking" with multiple URLs per keyword
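The GSC method can be automated over exported performance data. The sketch below assumes the export has already been loaded as `(query, page, impressions)` tuples; the function name is hypothetical, and the `min_impressions=100` cutoff for "significant" is an assumption to tune per site, not a figure from this framework.

```python
from collections import defaultdict


def cannibalized_queries(rows, min_impressions=100):
    """Flag queries where more than one URL has significant impressions.

    rows: iterable of (query, page, impressions) tuples.
    Returns {query: {page: impressions}} for flagged queries only.
    """
    pages_by_query = defaultdict(dict)
    for query, page, impressions in rows:
        totals = pages_by_query[query]
        totals[page] = totals.get(page, 0) + impressions  # sum repeated rows

    flagged = {}
    for query, pages in pages_by_query.items():
        significant = {p: n for p, n in pages.items() if n >= min_impressions}
        if len(significant) > 1:
            flagged[query] = significant
    return flagged


rows = [
    ("schema markup", "/blog/schema-guide/", 900),
    ("schema markup", "/services/schema/", 400),
    ("schema markup", "/about/", 5),
    ("web hosting", "/hosting/", 1200),
]
print(cannibalized_queries(rows))
```

Flagged queries still need a manual look: two pages serving different intents for the same string may be a case for Strategy 2 rather than consolidation.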
8.2 Resolution Strategies
Strategy 1: Consolidate
Merge multiple pages into one comprehensive page. 301 redirect the others to the merged page. Best when:
- Pages cover same topic from different angles
- One can be enriched to absorb the others
- Content is salvageable
Strategy 2: Differentiate
Restructure pages to target different intents or angles:
- Page A targets "schema markup" (informational)
- Page B targets "schema markup services" (commercial)
- Page C targets "schema markup tutorial" (instructional)
Each becomes the canonical for its specific intent.
Strategy 3: Internal link prioritization
Choose one page as primary. Internal link from other pages to the primary. Add canonicals from secondary pages to primary if they're truly the same.
Strategy 4: Noindex secondary
If secondary pages have minimal value, noindex them. Eliminates cannibalization without removal.
8.3 Prevention
- Pre-publication keyword check: is this query already targeted?
- Editorial workflow includes keyword mapping verification
- Content brief specifies primary keyword and confirms it's not duplicate
- Quarterly cannibalization audit
9. SERP Feature Targeting
Beyond standard organic ranking, SERP features capture additional visibility:
9.1 Featured Snippets (Position Zero)
The single result displayed at top of SERP with extracted content from a ranking page.
Snippet types:
- Paragraph — for definitional queries
- List — for ordered or unordered lists
- Table — for comparison data
- Video — increasingly common
Optimization:
- Answer the question directly in 40-50 words
- Use clear structural patterns (lists for list queries, tables for comparison)
- Include H2/H3 headers matching question variants
- Use schema markup for structured content
9.2 People Also Ask
Question boxes with expandable answers.
Optimization:
- Identify PAA questions for your topics
- Address each question explicitly with H2/H3
- Provide concise, complete answers (50-100 words)
- Anticipate follow-up PAAs
9.3 AI Overviews
AI-generated synthesis answers above traditional results.
Optimization: See framework-aicitations.md for comprehensive AI citation strategy. Key points:
- Authoritative content with clear citations
- Schema markup
- Distinctive insights AI cites verbatim
- E-E-A-T signals
9.4 Image Pack
Image carousel results.
Optimization: See framework-imageseo.md (when built). Quick points:
- Optimized images with alt text
- Image schema
- Image sitemaps
9.5 Video Carousel
Video results.
Optimization: See framework-videoseo.md (when built). Quick points:
- YouTube optimization
- Video schema
- Video sitemap
9.6 Local Pack
Local business results.
Optimization: Comprehensive coverage in framework-localseo.md.
10. Semantic Keyword Clustering Tools
Traditional keyword research treats every query as a separate target. Semantic clustering groups queries that share intent, so a single page can cover a cluster instead of a single keyword. In an AI Overview era where Google interprets queries as topics rather than strings, clustering is increasingly the primary research output.
10.1 Tool Inventory
SurferSEO Topical Maps
SurferSEO Topical Maps was redesigned in mid-2025 to ingest a seed topic and return a multi-level tree: pillar topic at the root, subtopics as branches, article-level targets as leaves. Each leaf has a recommended H1, suggested word count, and a list of cluster keywords. Output is sortable by search volume, difficulty, and the AI Overview probability score Surfer started tracking after their Dec 2025 study (Surfer SEO Topical Maps documentation, 2025, 11,256-topic sample).
When to reach for it: building a new content silo from scratch, refreshing a stalled blog, or planning programmatic clusters. Surfer's strength is sheer volume of suggestions and its built-in AIO probability column. Its weakness is that suggestions often skew toward generic informational pages even when the seed implies commercial intent. Always re-classify intent before assigning content type.
SE Ranking Cluster Tool
SE Ranking's clustering operates on the SERP overlap method: two queries belong to the same cluster if N or more of their top 10 results overlap. Threshold is configurable, typically 3 to 5. This is the most defensible clustering signal because it reflects Google's own grouping behavior rather than a model's semantic guess (SE Ranking platform documentation, 2025).
Best use: validating a clustering hypothesis from another tool. Run 200 candidate keywords through SE Ranking at threshold 4. The resulting clusters are how Google actually treats them.
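The SERP overlap rule is easy to reproduce for a small keyword set, which helps when sanity-checking a tool's output. The sketch below is an illustration of the general method, not SE Ranking's actual implementation: it assumes the top-10 URLs per query were already collected, the function name is hypothetical, and it links queries transitively with a union-find structure, so a chain of pairwise overlaps can pull loosely related queries into one cluster.

```python
from itertools import combinations


def cluster_by_serp_overlap(serps, threshold=4):
    """Group queries whose top results share at least `threshold` URLs.

    serps: dict mapping query -> list of top-10 result URLs.
    Returns a list of clusters, each a list of queries.
    """
    parent = {query: query for query in serps}

    def find(query):
        # Walk to the cluster root, shortening the path along the way.
        while parent[query] != query:
            parent[query] = parent[parent[query]]
            query = parent[query]
        return query

    for a, b in combinations(serps, 2):
        shared = len(set(serps[a]) & set(serps[b]))
        if shared >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for query in serps:
        clusters.setdefault(find(query), []).append(query)
    return list(clusters.values())


serps = {
    "what is schema markup": ["u1", "u2", "u3", "u4", "u5"],
    "schema markup explained": ["u1", "u2", "u3", "u4", "u9"],
    "buy web hosting": ["h1", "h2", "h3", "h4", "h5"],
}
print(cluster_by_serp_overlap(serps, threshold=4))
```

A stricter variant would require every pair inside a cluster to meet the threshold, which produces smaller, tighter clusters.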
Frase Topic Clustering
Frase ingests a primary keyword and pulls SERP competitor headings, entity mentions, and PAA questions, then clusters them into subtopics. Output is a topic outline rather than a keyword list. Frase is the best tool when the deliverable is a content brief rather than a strategy document (Frase product documentation, 2025).
Keyword Insights
Keyword Insights runs a large input list (often 10K+ keywords from a Semrush or Ahrefs export) through SERP overlap clustering with a configurable similarity threshold. Output is a CSV mapping each keyword to a cluster ID and recommended pillar. Best use: post-export consolidation when a competitor analysis produced thousands of candidate keywords (Keyword Insights platform documentation, 2025).
KeyClusters
KeyClusters is a low-cost alternative for SERP-overlap clustering. Functionally similar to SE Ranking and Keyword Insights, with a simpler interface and lower volume limits. Best use: small-site projects where the keyword universe is under 5K.
NeuronWriter Topic Maps
NeuronWriter's topic maps are tightly integrated with their NLP optimization scoring. The map identifies subtopics, but each subtopic feeds directly into a content score for the article you write next. Best use: a single writer producing one article at a time, where the topic map and the optimization rubric should live in the same tool (NeuronWriter platform documentation, 2025).
10.2 When Semantic Clustering Beats Traditional Keyword Research
Traditional keyword research, the kind that produces a flat list of queries with volume and difficulty, is still appropriate when:
- The site is small and adding pages one at a time
- Each new page has a clear standalone purpose
- Intent matching is the bottleneck, not topical coverage
Semantic clustering wins when:
- The goal is topical authority, not individual rankings. Topical authority requires saturating a topic, and saturation is defined by clusters of related queries. See framework-topicalauthority.md for the authority side of this equation.
- Intent overlap is high. When 60% of candidate keywords have overlapping intent, treating them as separate targets produces cannibalization. Clustering forces consolidation.
- AI Overviews dominate the SERP. AIO synthesis pulls from clusters of related content, not single-keyword pages.
- The site is programmatic. Programmatic pages succeed when each cluster has a clear template and a known set of variants.
- The competitor benchmark is a topical cluster, not a page. If your top competitor has 47 pages on a topic and you have 3, no single-keyword target will close the gap.
10.3 The Clustering Workflow
- Pull a wide keyword universe from Ahrefs Keywords Explorer or Semrush Keyword Magic Tool. Aim for 2K to 10K candidate queries seeded from 5 to 10 primary terms.
- Filter for intent relevance. Drop queries that clearly target a different audience.
- Run through SE Ranking or Keyword Insights at SERP-overlap threshold 3 or 4.
- Review clusters manually. Merge near-duplicates. Split clusters that mix intents.
- For each cluster, designate a pillar query (highest volume + most central) and supporting queries.
- Map clusters to existing pages or planned pages. See Section 7.
- Validate against framework-topicalauthority.md cluster completeness scoring before publishing.
10.4 Cluster Cannibalization
Clusters can cannibalize each other when two clusters overlap in intent. Detection method: if 30% or more of the supporting queries appear in both clusters, the clusters should be merged or one should be reassigned. Run this check before assigning pages.
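The 30% check can be run pairwise over cluster outputs. The text does not say which cluster the percentage is measured against, so this sketch assumes the smaller of the two, which is the stricter reading; the function names are hypothetical.

```python
def cluster_overlap(cluster_a, cluster_b):
    """Share of the smaller cluster's supporting queries found in both clusters.

    Assumption: overlap is measured against the smaller cluster.
    """
    a, b = set(cluster_a), set(cluster_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def should_merge(cluster_a, cluster_b, threshold=0.30):
    """True when the overlap meets the 30% merge-or-reassign threshold."""
    return cluster_overlap(cluster_a, cluster_b) >= threshold


schema_basics = ["what is schema markup", "schema markup examples", "json-ld schema", "how to add schema"]
schema_howto = ["how to add schema", "json-ld schema", "schema generator"]
print(cluster_overlap(schema_basics, schema_howto))
print(should_merge(schema_basics, schema_howto))
```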
11. Question Mining
Questions are now a primary keyword research output, not a secondary one. AI Overviews, People Also Ask boxes, and AI Mode sub-queries all surface question patterns more aggressively than head-term patterns. Cross-ref framework-featuredsnippets.md for the on-page optimization that follows question discovery.
11.1 Tools
AlsoAsked.com
AlsoAsked pulls the People Also Ask tree for a seed query. Each answered PAA generates new PAAs, and AlsoAsked traverses the tree to 4 or 5 levels. Output is a visual tree or a CSV. Best use: building the question coverage spec for a pillar page.
The Mind Map view (released Q3 2025) lets you click any node to expand its descendants on demand. Practical for live planning sessions with a content team.
Limit: PAA results vary by location and device. Run AlsoAsked from a clean profile in the target country to avoid personalization skew (AlsoAsked product documentation, 2025).
AnswerThePublic
AnswerThePublic groups suggestions by question word (who, what, when, where, why, how, which, can, are, will) and by preposition. Output is a visual wheel or a CSV.
Where AlsoAsked surfaces actual PAA questions, AnswerThePublic surfaces autocomplete and related search patterns. The two are complementary, not redundant.
Quora Intent Extraction
Quora is an underused keyword research source. Method:
- Identify the topic's top Quora threads via a site:quora.com [topic] query.
- Extract the questions verbatim. These are how users phrase the topic in their own language, often more specific than autocomplete suggests.
- Cross-reference against AlsoAsked PAA results. Questions that appear in both Quora and PAA are validated as real, high-intent queries.
- Pay attention to follow-up questions in Quora threads. These often surface sub-queries that don't appear in any keyword tool.
Quora intent extraction is especially valuable for technical specialty topics where keyword tools have thin data. The query "how do I migrate from go-sqlite3 to modernc.org/sqlite for CGO_ENABLED=0 builds" has near-zero search volume but Quora threads show it's an active question with measurable demand.
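The cross-referencing step can be approximated with normalized exact matching. This sketch assumes both question lists were collected by hand or from exports; the function names are hypothetical, and real phrasing differences ("how do I" versus "how to") would need fuzzy or semantic matching that is out of scope here.

```python
import re


def normalize(question):
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", question.lower())
    return " ".join(text.split())


def validated_questions(forum_questions, paa_questions):
    """Return forum questions that also appear in the PAA results."""
    paa = {normalize(q) for q in paa_questions}
    return [q for q in forum_questions if normalize(q) in paa]


quora = ["What is AEO?", "Is SEO dead in 2026?"]
paa = ["what is aeo", "How does AEO work?"]
print(validated_questions(quora, paa))
```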
Reddit and Niche Forums
Same method as Quora. The patterns vary by community but the principle is identical: capture actual user language and use it to seed PAA and autocomplete research.
11.2 The PAA Tree Mapping Methodology
PAA boxes are recursive. When a user clicks one PAA, Google injects 2 to 4 new PAAs related to that branch. A pillar page that addresses all 12 to 30 questions in the full tree is positioned to capture multiple PAA placements across sessions.
Tree mapping process:
- Seed query into AlsoAsked. Capture the initial 4 to 8 PAAs.
- For each PAA, expand. Capture the 2nd-level PAAs. Continue to depth 3 or 4.
- Output is a tree with 12 to 40 question nodes per seed.
- Map each node to a page section, H2 or H3, or FAQ entry on the pillar page.
- Cross-check against AI Overview citations. AIO often cites PAA-style content because Google's underlying intent model treats PAAs as canonical question expressions.
A pillar page with explicit answers to 25+ PAA tree nodes typically captures 4 to 8 PAA placements in production, based on aggregate observation across the TDG client roster in Q1 2026.
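The tree mapping process amounts to a depth-limited breadth-first walk. The sketch below assumes the PAA tree has been loaded into nested dicts with `question` and `children` keys; that structure is hypothetical and is not the AlsoAsked export format. The function name is also an assumption.

```python
from collections import deque


def flatten_paa_tree(roots, max_depth=4):
    """Walk a PAA tree breadth-first and return (depth, question) pairs.

    roots: list of nodes shaped like {"question": str, "children": [...]}.
    Depth starts at 1 for the initial PAAs shown for the seed query.
    """
    nodes = []
    seen = set()
    queue = deque((1, node) for node in roots)
    while queue:
        depth, node = queue.popleft()
        question = node["question"]
        if question in seen:
            continue  # the same question can recur on several branches
        seen.add(question)
        nodes.append((depth, question))
        if depth < max_depth:
            queue.extend((depth + 1, child) for child in node.get("children", []))
    return nodes


tree = [
    {"question": "What is AEO?", "children": [
        {"question": "How is AEO different from SEO?", "children": [
            {"question": "Does AEO replace SEO?"},
        ]},
    ]},
]
for depth, question in flatten_paa_tree(tree, max_depth=3):
    print(depth, question)
```

Each returned pair is one question node to assign to an H2, H3, or FAQ entry.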
11.3 Question-to-Section Mapping
For each question node:
question_node:
  question: "What is the difference between AEO and traditional SEO?"
  section_heading: "AEO vs Traditional SEO: Key Differences"
  heading_level: "H2"
  answer_length: "150 to 250 words"
  answer_format: "comparison table preferred"
  internal_link_target: "/topics/answer-engine-optimization/"
The answer length and format depend on what the SERP rewards. See framework-featuredsnippets.md Section 4 for paragraph, list, and table snippet optimization.
12. Intent Drift Over Time
Search intent is not static. The same query string can mean different things in different years, and the SERP composition will reflect the drift before the keyword volume does.
12.1 Examples of Drift
"AI SEO"
In 2023, "AI SEO" returned mostly tool comparison content: ChatGPT for SEO, Jasper for content, MarketMuse for optimization. Intent was commercial investigation centered on tools.
In 2025, "AI SEO" returned a mix of tool content and methodology content: how AI changes SEO, what to do about AI Overviews, GEO vs AEO definitions. Intent shifted toward informational and strategic.
In 2026, "AI SEO" returns predominantly AI Overview content with citations to authoritative methodology sources. Tool listings have receded. Intent is now defined by AIO synthesis quality rather than ranker click-through.
A page that ranked in 2023 because it compared 12 AI SEO tools is unlikely to rank in 2026 unless rewritten as a methodology resource that AIO can cite.
"Schema markup"
In 2022, "schema markup" returned developer documentation. Intent was informational, audience was technical implementers.
In 2024, intent expanded to include strategic content: which schemas matter for which page types, schema for SEO benefit, schema generators for non-developers.
In 2026, "schema markup" returns AIO synthesis at the top, then a mix of generator tools and strategic guides. Pure developer documentation has been demoted in many SERPs because AIO answers the implementation questions directly.
"Web hosting"
Stable intent over years: commercial investigation. Top 10 has been listicles since 2018. Drift in this query is minimal because the underlying user need (compare hosting options before buying) hasn't changed.
12.2 Detecting Drift
Three primary signals:
SERP composition shift: Compare the top 10 results for a target query year-over-year. If 6 of 10 results in 2026 are different page types than in 2024, intent has drifted. If you don't have a manual archive, use the Wayback Machine to retrieve a 2024 SERP snapshot where one was archived.
Query refinement patterns: When users refine "AI SEO" in 2026, the refinements ("AI SEO methodology", "AI SEO without tools", "AI SEO for small business") signal where the underlying intent is moving. Track refinements via Google Trends related queries.
AI Overview prevalence shifts: A query that didn't show AIO in 2024 but does in 2026 has, by definition, moved toward informational synthesis intent in Google's classifier. Track AIO prevalence per query monthly. See framework-aioverviews.md Section 5 for the AIO tracking workflow.
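The SERP composition signal can be quantified by comparing page-type counts between two snapshots. This sketch assumes each top-10 result was labeled with a page type by hand; the labels and function names are hypothetical, and "different page types" is interpreted as results in the new SERP whose type has no counterpart in the old one.

```python
from collections import Counter


def changed_page_types(old_types, new_types):
    """Count new-SERP results whose page type is unmatched in the old SERP."""
    surplus = Counter(new_types) - Counter(old_types)  # keeps positive counts only
    return sum(surplus.values())


def intent_drifted(old_types, new_types, threshold=6):
    """Apply the 6-of-10 rule from the SERP composition signal."""
    return changed_page_types(old_types, new_types) >= threshold


serp_2024 = ["developer docs"] * 8 + ["strategic guide"] * 2
serp_2026 = ["developer docs"] * 2 + ["strategic guide"] * 4 + ["generator tool"] * 4
print(changed_page_types(serp_2024, serp_2026))
print(intent_drifted(serp_2024, serp_2026))
```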
12.3 Annual Intent Reassessment
Schedule an annual review:
- Pull the top 50 priority queries from the keyword strategy.
- For each, manually check the current SERP.
- Classify the current intent vs the previously documented intent.
- Flag drift.
- For drifted queries, decide: rewrite the targeting page, reassign to a different page, or accept the page will lose rankings on that query.
Drift detection without a reassessment cadence is wasted analysis. The reassessment is what converts drift detection into action.
13. AI Overview Query Identification
In 2026, AI Overviews appear on a non-trivial fraction of queries. Identifying which queries surface AIO is a primary input to content strategy because AIO presence changes the entire optimization target. Cross-ref framework-aioverviews.md for the on-page optimization once AIO queries are identified.
13.1 Manual Sampling
The simplest method is manual SERP inspection. Take the top 50 priority queries. Search each from a clean browser profile in the target country. Record:
- AIO present yes/no
- If yes, which URLs are cited
- If yes, how prominent (full panel, collapsed, expandable)
Manual sampling is the most accurate method because tool-based AIO tracking still misses about 15% of AIO instances due to triggering variability (Surfer SEO AIO study, Dec 2025, 11,256 keywords sampled).
13.2 Tool-Based Tracking
Surfer SEO AIO Tracking
Added to Surfer in late 2025. The Content Editor and Topical Maps both display an AIO probability score per query, derived from historical and live SERP scrapes. Use Surfer for at-scale tracking across hundreds of queries.
BrightEdge AI Catalyst
Enterprise tool that monitors AIO presence per query and tracks citation behavior. Reports AIO trigger rate, citation share, and competitive AIO citation patterns. Best for enterprise sites with 1K+ tracked queries (BrightEdge AI Catalyst product documentation, 2025).
Semrush and Ahrefs
Both added AIO presence indicators in their SERP feature columns during 2025. Less detailed than Surfer or BrightEdge but adequate for spot-checking.
GSC Performance
GSC began surfacing AIO impressions and clicks in early 2026 as a separate metric. It is the most reliable signal for queries where your site already has visibility. Use the AIO impression-to-click ratio to identify which AIO queries actually convert.
13.3 Adjusting Strategy When AIO Dominates
Surfer's Dec 2025 finding: when 70% or more of priority queries in a portfolio show AIO, the keyword strategy needs to shift from ranking-position optimization to citation optimization. The shift includes:
- Reweighting the keyword priority list toward queries where AIO citation drives qualified traffic. AIO citations carry 23x conversion lift versus equivalent organic rankings for queries where the user opens the cited URL (BrightEdge AI Catalyst aggregate data, Q4 2025, sample size 412 enterprise sites).
- Deprioritizing pure informational long-tails where AIO answers without click. These queries still belong in the strategy but as authority-building rather than traffic-driving targets.
- Adding citation-readiness as a content brief requirement: factual density, distinctive insights, clear authorship, schema markup, statistics with sources.
- Tracking AIO citation share monthly via BrightEdge or manual sampling. The metric to grow is "queries where this site is cited in AIO".
For TDG client portfolios, the Dec 2025 prevalence threshold has been hit on most knowledge-stage topics. AIO citation is now the default optimization target for those, not a secondary one.
13.4 Queries Where AIO Doesn't Dominate
Not every query shows AIO. As of mid-2026, AIO is rare or absent on:
- Branded queries (Google rarely overrides brand intent with AIO)
- Transactional queries with clear buying intent
- Highly localized queries with strong Local Pack triggers
- Real-time queries (current sports scores, breaking news)
- Specialty technical queries with small audiences
These remain conventional organic ranking targets. The keyword strategy should explicitly classify each priority query as AIO-dominant, AIO-occasional, or AIO-absent, and assign different optimization targets per class.
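The three-way classification can be sketched as a small function over repeated SERP checks. The text does not define numeric cut-offs per query, so the `dominant_share` default of 0.7 and the target assigned to the AIO-occasional class are assumptions to tune, not documented thresholds.

```python
def classify_aio(observations: list[bool], dominant_share: float = 0.7) -> str:
    """Classify one query from repeated SERP checks (True = AIO was shown).

    dominant_share is an assumed cut-off, not a value from the framework.
    """
    if not observations:
        raise ValueError("need at least one observation")
    share = sum(observations) / len(observations)
    if share == 0:
        return "AIO-absent"
    if share >= dominant_share:
        return "AIO-dominant"
    return "AIO-occasional"


# The target for the occasional class is an assumption for illustration.
OPTIMIZATION_TARGET = {
    "AIO-dominant": "citation optimization",
    "AIO-occasional": "ranking position, with citation readiness",
    "AIO-absent": "conventional organic ranking",
}

# Hypothetical check history: four checks per query on different days.
checks = {
    "what is aeo": [True, True, True, True],
    "aeo tools": [True, False, False, True],
    "example brand login": [False, False, False, False],
}

classes = {query: classify_aio(obs) for query, obs in checks.items()}
for query, aio_class in classes.items():
    print(f"{query}: {aio_class} -> {OPTIMIZATION_TARGET[aio_class]}")
```

Classifying from several checks rather than one accounts for the triggering variability noted in Section 13.1.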
14. Sub-Query Fan-Out Research
The AI Overview era introduced a new layer to keyword research: the sub-queries that AIO and AI Mode generate behind a single user query. The user types one query; the system fans out into 8 to 16 sub-queries to compose the answer. Each sub-query is a potential citation opportunity.
14.1 The Fan-Out Pattern
Google's AI Overview system, based on public statements and observed behavior, decomposes a user query into related sub-queries before synthesizing the AIO panel. Typical pattern:
- AI Overview: 8 to 12 sub-queries per panel
- AI Mode (conversational): 9 to 16 sub-queries per turn, expanding with conversation depth
If the user query is "best practices for federated SQLite replication in a 3-node cluster", the sub-query fan-out might include:
- What is SQLite federation
- SQLite replication options
- 3-node SQLite cluster
- SQLite replication consistency models
- SQLite WAL replication
- Distributed SQLite alternatives
- SQLite vs PostgreSQL for federation
- Litestream replication
- rqlite vs Dqlite
- SQLite cluster failover
A page that gets cited in the AIO panel for the user query is one whose content is the best available answer to several of those sub-queries, not necessarily the headline query itself.
14.2 Identifying Sub-Queries
Method 1: AI Mode panel inspection
When AI Mode is available in the target locale, expand the AI Mode response panel. The panel typically displays a "Searched for" section listing the sub-queries the model generated. Capture verbatim.
This is the most direct method. It shows the sub-queries Google's system generated for the parent query on that run; repeat the check, because the fan-out can vary between runs.
Method 2: AIO citation reverse-engineering
If AI Mode isn't available, use the AIO panel itself:
- Capture the AIO synthesis for a target query.
- Identify the cited URLs.
- For each cited URL, infer what sub-query it satisfies. Often the URL's title or H1 reveals the sub-query directly.
- Aggregate across 5 to 10 priority queries. The patterns reveal Google's sub-query universe for your topic.
Method 3: Related searches and PAA
Less precise but easier. The bottom-of-SERP "Related searches" and the PAA boxes are a superset of common sub-queries. Map AlsoAsked output (Section 11) against the AIO citation reverse-engineering to identify true sub-queries.
Method 4: Direct AI assistant questioning
Ask Claude, ChatGPT, or Perplexity: "If a user searches for [parent query], what sub-questions would you need to answer to produce a comprehensive response?" The response is that model's view of the sub-query fan-out. Treat it as an approximation of Google's behavior: query decomposition tends to look similar across models, but the sub-queries Google actually generates may differ, so validate against Methods 1 to 3 where possible.
14.3 Reverse-Engineering Sub-Query Coverage
Once the sub-queries are identified, evaluate your existing content against them:
parent_query: "AEO for small business"
sub_queries:
  - q: "What is AEO"
    coverage: "Pillar page H2"
  - q: "AEO vs SEO"
    coverage: "Pillar page H2"
  - q: "How AI answer engines find content"
    coverage: "Supporting article published"
  - q: "AEO best practices for small sites"
    coverage: "GAP - no content yet"
  - q: "AEO ROI for small business"
    coverage: "GAP - no content yet"
  - q: "AEO tools for non-technical users"
    coverage: "Supporting article published"
  - q: "AEO schema markup essentials"
    coverage: "Cross-cluster link to schema framework"
  - q: "AEO content brief template"
    coverage: "GAP - no content yet"
Each GAP becomes a content brief. Each covered sub-query becomes a candidate for AIO citation if the answer is high-quality.
14.4 Cluster Completeness via Sub-Query Coverage
The metric that matters: percentage of sub-queries for which your site is the best available answer. Manual scoring or AIO citation tracking (Section 13) provides the ground truth.
Cluster completeness target: 70% or more of sub-queries covered with content that has been cited, or could plausibly be cited, in AIO. Below 50% completeness, the cluster is unlikely to win AIO citations regardless of individual page quality.
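As a worked example, the Section 14.3 coverage map can be scored against the 70% target and 50% floor. This sketch counts any non-GAP entry as covered, which is a looser proxy than "best available answer"; the function names are illustrative.

```python
TARGET = 0.70  # completeness target from the text
FLOOR = 0.50   # below this, AIO citations are unlikely per the text

# Coverage map from Section 14.3, keyed by sub-query.
coverage = {
    "What is AEO": "Pillar page H2",
    "AEO vs SEO": "Pillar page H2",
    "How AI answer engines find content": "Supporting article published",
    "AEO best practices for small sites": "GAP - no content yet",
    "AEO ROI for small business": "GAP - no content yet",
    "AEO tools for non-technical users": "Supporting article published",
    "AEO schema markup essentials": "Cross-cluster link to schema framework",
    "AEO content brief template": "GAP - no content yet",
}


def cluster_completeness(coverage: dict[str, str]) -> float:
    """Share of sub-queries with any non-GAP coverage (a proxy metric)."""
    if not coverage:
        return 0.0
    covered = sum(1 for status in coverage.values() if not status.startswith("GAP"))
    return covered / len(coverage)


def verdict(score: float) -> str:
    if score >= TARGET:
        return "meets target"
    if score < FLOOR:
        return "unlikely to win AIO citations"
    return "below target: close gaps"


score = cluster_completeness(coverage)
gaps = [q for q, status in coverage.items() if status.startswith("GAP")]
print(f"{score:.1%} complete, {len(gaps)} briefs to write: {verdict(score)}")
```

Here 5 of 8 sub-queries are covered (62.5%), so the cluster sits between the floor and the target, and the three gaps are the next briefs.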
15. Long-Tail Keyword Strategy in the AI Overview Era
The classic long-tail strategy was: target many low-volume queries, each easy to rank for, aggregate to meaningful traffic. That strategy still works for queries AIO doesn't answer. For AIO-dominant long-tails, the strategy has changed.
15.1 The Click Erosion Problem
Long-tail informational queries are exactly the queries AIO answers most aggressively. "How do I configure modernc.org/sqlite for CGO_ENABLED=0 builds" is the kind of query where AIO synthesizes a complete answer, and the user often gets what they need without clicking.
Aggregate click-through rates on long-tail informational queries dropped 38% between Q2 2024 and Q1 2026 across the BrightEdge enterprise sample (BrightEdge AI Catalyst quarterly report, Q1 2026, 412 enterprise sites tracked, 2.4M queries monitored). Click counts are down even when ranking position is unchanged.
The implication: ranking #1 on a long-tail in 2026 is worth substantially less than ranking #1 was worth in 2023.
15.2 The New Value: AIO Citation
The replacement value is AIO citation. When the AIO panel cites your URL inline, three things happen:
- The cited URL gets a click-through rate roughly 4x the equivalent unfeatured organic position when the user wants to dig deeper into the cited source.
- The cited URLs that get clicked carry 23x conversion lift versus comparable organic ranking visits, because the click is qualified: the user has already read the AIO synthesis and is opening the source to verify or extend it (BrightEdge AI Catalyst Q4 2025 aggregate, 412 enterprise sites).
- Brand association builds even when the click doesn't happen. The cited brand becomes the implied authority for the topic.
15.3 Selecting Long-Tails by AIO Citation Probability
Old selection criteria for long-tails:
- Volume above 10 per month
- Difficulty below 30
- Relevance to business
New selection criteria for long-tails in the AIO era:
- AIO probability (does the query trigger AIO at all)
- Citation pattern (what kinds of pages get cited in this AIO)
- Distinctiveness potential (can your content offer something AIO would synthesize verbatim or cite as evidence)
- Conversion value if cited
Volume becomes a third-order signal. A long-tail with monthly volume of 5 that triggers AIO and cites high-conversion pages may outperform a long-tail with monthly volume of 500 that triggers AIO but cites informational competitors.
15.4 The Long-Tail Workflow Update
- Generate the long-tail keyword universe via traditional methods. See Section 4.
- For each long-tail, check AIO presence. Use Surfer AIO tracking or manual sampling.
- For AIO-present long-tails, identify the cited URL patterns. Are they listicles, single-source authorities, or aggregator sites?
- Score each long-tail by citation probability for your content type.
- Prioritize long-tails where citation probability is high and your conversion path matches the cited URL pattern.
- Deprioritize long-tails where AIO answers fully without citing detail-rich sources.
A long-tail strategy in 2026 ends up with about 40% of the candidate queries it would have had in 2023, but the prioritization is more rigorous and the queries selected have higher expected value per ranking position.
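The scoring step in the workflow above can be sketched as a product of the four Section 15.3 criteria. The multiplicative formula, the 0 to 1 input scales, and every number below are assumptions for illustration; the framework names the criteria but does not prescribe a formula.

```python
from dataclasses import dataclass


@dataclass
class LongTail:
    query: str
    monthly_volume: int
    aio_probability: float   # 0..1, chance the query triggers AIO
    pattern_fit: float       # 0..1, how well your page type matches cited pages
    distinctiveness: float   # 0..1, how citable your content is
    conversion_value: float  # estimated value of one converted visit

    def citation_score(self) -> float:
        """Assumed scoring rule: product of the four criteria."""
        return (self.aio_probability * self.pattern_fit
                * self.distinctiveness * self.conversion_value)


# Hypothetical candidates: a 5-volume specialty query and a 500-volume generic one.
candidates = [
    LongTail("what is sqlite", 500, 0.9, 0.2, 0.1, 20.0),
    LongTail("pure go sqlite driver cgo_enabled=0", 5, 0.9, 0.8, 0.7, 400.0),
]

# Volume is only a tie-breaker, matching its demotion to a third-order signal.
ranked = sorted(candidates,
                key=lambda c: (c.citation_score(), c.monthly_volume),
                reverse=True)
for c in ranked:
    print(f"{c.citation_score():8.2f}  vol={c.monthly_volume:<4} {c.query}")
```

With these inputs the low-volume specialty query ranks first, which is the outcome Section 15.3 describes.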
16. Branded vs Non-Branded Distribution
The ratio of branded to non-branded search traffic is a diagnostic signal about brand awareness and SEO funnel health.
16.1 The Healthy Ratio
There is no single correct ratio because it depends on business stage:
Early-stage business: 5% to 15% branded. Most search traffic comes from non-brand queries because the brand has minimal awareness. SEO effort is correctly aimed at awareness-stage non-brand queries.
Established niche business: 20% to 35% branded. The brand has visibility within its niche and direct-search demand is meaningful, but the majority of acquisition still depends on non-brand discovery.
Mature high-awareness brand: 35% to 60% branded. The brand is recognized broadly and substantial search traffic is people looking specifically for the brand or its products.
Dominant category brand: 50%+ branded. The brand is the category. Most search traffic is brand-defense rather than acquisition.
16.2 When Brand Search Dominates
If brand search exceeds 50% on a business that is neither a mature high-awareness brand nor a dominant category brand, the signal is:
- High awareness exists (good)
- Top-funnel non-brand discovery is weak (concerning)
- Likely missing non-brand awareness queries
- May be over-indexed on brand-defense and under-indexed on category capture
Action: audit the non-brand opportunity set. Identify high-volume awareness-stage queries the brand doesn't currently rank for. Build content to address them.
16.3 When Non-Brand Dominates
If brand search is below 5% on a business that isn't brand new, the signal is:
- Low awareness (concerning for long-term retention)
- Acquisition depends entirely on continuous SEO performance
- Reputation and trust signals weaker than they should be
Action: invest in awareness-stage and brand-building activity. Press, partnerships, community presence, distinctive thought leadership. The SEO strategy should still be working, but it's working without the brand-trust multiplier that lifts conversion rates on every page.
16.4 Measuring the Ratio
From GSC:
- Performance report, set date range to last 12 months.
- Export all queries.
- Tag each query as branded or non-branded. Branded queries contain the brand name, common misspellings, product names, founder names, or distinctive brand phrases.
- Sum impressions or clicks for each category. Calculate ratio.
For sites without GSC history, estimate via direct traffic share (high direct traffic implies high brand recognition) and via branded search volume in Ahrefs or Semrush.
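The tagging and ratio steps can be sketched over an exported query list. The row layout (`query`, `clicks`, `impressions` keys) and the "acme" brand terms are hypothetical; a real GSC export would need its columns mapped to this shape first.

```python
import re


def branded_share(rows: list[dict], brand_terms: list[str], metric: str = "clicks") -> float:
    """Share of the chosen metric that comes from branded queries."""
    if not brand_terms:
        raise ValueError("provide at least one brand term")
    # One case-insensitive pattern covering the name, misspellings, product names.
    pattern = re.compile("|".join(re.escape(t) for t in brand_terms), re.IGNORECASE)
    branded = sum(r[metric] for r in rows if pattern.search(r["query"]))
    total = sum(r[metric] for r in rows)
    return branded / total if total else 0.0


# Illustrative rows only.
rows = [
    {"query": "acme widgets pricing", "clicks": 120, "impressions": 900},
    {"query": "akme widgets", "clicks": 30, "impressions": 200},
    {"query": "best widget for small shops", "clicks": 450, "impressions": 12000},
    {"query": "how to size a widget", "clicks": 400, "impressions": 15000},
]

share = branded_share(rows, ["acme", "akme"])
print(f"Branded share of clicks: {share:.0%}")
```

Substring matching is deliberately simple and will over-tag when a brand name is also a common word, so spot-check the tagged set before trusting the ratio.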
16.5 Branded Query Defense
Regardless of ratio, branded queries should always be defended. Brand-defense content includes:
- Homepage optimized for the exact brand name
- About page for "[brand] founder", "[brand] history", "[brand] team" patterns
- Pricing page for "[brand] pricing", "[brand] cost"
- Reviews page for "[brand] reviews", "[brand] testimonials"
- Comparison pages for "[brand] vs [competitor]" patterns
- Alternative pages for "[brand] alternatives", "alternatives to [brand]"
Each branded query pattern should have an owned page that captures the click before a third-party page intercepts it.
17. Zero-Volume Keyword Research
Keyword tools report zero monthly search volume when their sample size for a query falls below the tool's minimum-reporting threshold. This does not mean the query has zero searches. It means the tool can't measure them confidently.
17.1 GSC as the True Volume Source
Google Search Console reports actual impressions and clicks per query, not estimated volume. For queries where the site already has some ranking, GSC reveals real volume. Pattern observed across the TDG client roster:
- About 18% of GSC-tracked queries with non-zero impressions are reported as zero-volume in Ahrefs.
- About 24% of GSC-tracked queries with non-zero impressions are reported as zero-volume in Semrush.
- Some of these queries drive meaningful click traffic despite the tool labeling them dead.
Practical implication: GSC is the only ground truth for query volume on a site that already exists. For pre-launch sites or new content areas, zero-volume in tools is genuinely zero-volume more often, but the false-negative rate is still around 15% to 20%.
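The false-zero check described above amounts to a join between GSC impressions and tool-reported volume. A minimal sketch with made-up numbers: both dictionaries are hypothetical inputs, and a query missing from the tool export is treated as zero volume, which is an assumption.

```python
def false_zero_queries(gsc_impressions: dict[str, int],
                       tool_volume: dict[str, int]) -> list[str]:
    """Queries with real GSC impressions that the tool reports as zero or omits."""
    return sorted(
        query for query, impressions in gsc_impressions.items()
        if impressions > 0 and tool_volume.get(query, 0) == 0
    )


# Illustrative data only.
gsc_impressions = {
    "cgo_enabled=0 sqlite driver pure go": 140,
    "sqlite replication": 2300,
    "rqlite vs dqlite": 310,
    "litestream restore to new host": 45,
    "sqlite cluster failover": 0,
}
tool_volume = {
    "sqlite replication": 1900,
    "rqlite vs dqlite": 0,
    "cgo_enabled=0 sqlite driver pure go": 0,
    "sqlite cluster failover": 0,
}

missed = false_zero_queries(gsc_impressions, tool_volume)
with_impressions = sum(1 for imp in gsc_impressions.values() if imp > 0)
false_zero_rate = len(missed) / with_impressions if with_impressions else 0.0
print(f"{len(missed)} of {with_impressions} active queries are false zeros")
```

Running this per tool gives a site-specific equivalent of the Ahrefs and Semrush percentages quoted above.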
17.2 When Zero-Volume Queries Matter
Early-stage trends
A new technology or methodology will have zero reported volume until enough searches accumulate. Catching the trend early means ranking before competitors notice. A recent example: "AEO" had zero reported volume in most tools through Q2 2024 despite being actively searched. Sites that built content in late 2023 captured the wave.
Technical specialty terms
Specific product names, library names, configuration patterns, error messages. These often have zero reported volume but the queries that do happen are extremely high-intent. A user searching "CGO_ENABLED=0 SQLite driver pure Go" is a developer with a specific problem and high conversion potential if the site sells relevant infrastructure or services.
Brand defense
Misspellings of the brand, alternate brand renderings, product code names. Often zero-volume but defended for completeness.
Long-tail conversational queries
Conversational queries are growing as AI Mode and voice search increase. Many conversational queries are zero-volume by traditional measurement but accumulate to meaningful traffic.
17.3 Zero-Volume Workflow
- After running standard keyword research, do not delete zero-volume queries from the working list.
- Tag zero-volume queries by hypothesis: early-stage trend, technical specialty, brand defense, conversational variant.
- For each tag, decide a coverage threshold. Early-stage trends might warrant a dedicated page if the underlying signal is strong. Technical specialty queries might warrant an FAQ entry. Brand defense queries might warrant a section on an existing page.
- Monitor GSC quarterly. Zero-volume queries that start generating impressions are validation. Promote them to active strategy.
- Zero-volume queries that never generate impressions after 12 months can be deprioritized, but not before.
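The quarterly review decision in the workflow above can be sketched as one function. The tag names come from the workflow; the function names and return labels are illustrative assumptions.

```python
from datetime import date

TAGS = {"early-stage trend", "technical specialty", "brand defense", "conversational variant"}


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not complete yet
    return months


def review(tag: str, first_tracked: date, impressions: int, today: date) -> str:
    """Decide what to do with a zero-volume query at review time."""
    if tag not in TAGS:
        raise ValueError(f"unknown hypothesis tag: {tag}")
    if impressions > 0:
        return "promote to active strategy"  # GSC impressions are the validation
    if months_between(first_tracked, today) >= 12:
        return "deprioritize"                # only after a full 12 months
    return "keep monitoring"


decision = review("technical specialty", date(2025, 1, 15), 0, date(2025, 10, 1))
print(decision)
```

The 12-month guard is what enforces the "but not before" rule: a query with no impressions at month 11 stays on the list.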
17.4 Tool Limitations to Remember
Different tools have different zero-volume thresholds. A query reported as zero in Ahrefs may show a monthly volume of 30 in Semrush. Cross-checking across at least two tools and GSC is essential before classifying a query as genuinely zero.
18. Programmatic Keyword Research
Programmatic SEO generates pages at scale by varying inputs across a template. The keyword research input becomes a matrix rather than a list. Cross-ref framework-saas-seo.md for the programmatic SaaS context.
18.1 The Matrix Approach
Each programmatic strategy is defined by two or more axes. Each cell in the matrix is a unique page.
Two-axis examples:
- Locations x Services: cities x service offerings = local landing pages
- Integrations x Platforms: tools x supported platforms = integration pages
- Conditions x Treatments: medical conditions x treatment approaches = treatment pages
- Industries x Use Cases: industries x SaaS use cases = vertical pages
- Categories x Brands: product categories x brands = browse pages
- Templates x Industries: design templates x target industries = template gallery pages
Three-axis examples:
- Locations x Services x Time: city x service x season = seasonal local pages
- Categories x Brands x Attributes: product type x brand x feature = faceted browse
18.2 Sizing the Matrix
The total cell count is the product of axis sizes. A 100-city x 12-service matrix produces 1,200 pages. A 3-axis 100 x 50 x 12 produces 60,000 pages.
Page count by itself is not a quality signal. The question is whether each cell has:
- Real user demand (some search volume or strong intent signal)
- Differentiated content (not just template-filled)
- Internal coherence (the cell makes business sense)
A matrix that produces 60,000 pages where 50,000 are thin and undemanded is worse than a 1,200-page matrix where every page has clear demand.
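The sizing arithmetic above is a product of axis lengths, and the cells themselves can be enumerated lazily. A minimal sketch with placeholder axis values; the text does not say what the 50-item axis is, so treating it as brands is an assumption.

```python
from itertools import islice, product
from math import prod


def matrix_size(*axes: list[str]) -> int:
    """Total cell count: the product of the axis sizes."""
    return prod(len(axis) for axis in axes)


# Placeholder axis values standing in for real cities, brands, and services.
cities = [f"city-{i}" for i in range(100)]
brands = [f"brand-{i}" for i in range(50)]
services = [f"service-{i}" for i in range(12)]

two_axis = matrix_size(cities, services)            # 100 x 12
three_axis = matrix_size(cities, brands, services)  # 100 x 50 x 12

# product() yields cells one at a time, so the full matrix never sits in memory.
first_cells = list(islice(product(cities, services), 3))

print(two_axis, three_axis, first_cells)
```

Sizing before generating makes the cost of each added axis visible: the third axis here multiplies the page count by 50.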
18.3 When Programmatic Justifies Pages
Programmatic is appropriate when:
- The variant space genuinely has demand. "Plumber in Cassville" and "Plumber in Springfield" are both searched.
- The variants produce meaningfully different content. The Cassville page should differ from the Springfield page in service area, local references, customer examples, pricing if applicable.
- Internal linking can support the scale. Each programmatic page needs entry points from category hubs and from related variant pages.
- The site has the technical infrastructure to ship and maintain the pages without manual touch on each.
18.4 When Programmatic Produces Thin Content
Programmatic fails when:
- The variant has no real demand. Generating "Plumber in [tiny town with 200 residents]" produces a page that exists for no real searcher.
- The variants don't differ meaningfully. If the only difference between two pages is the city name swapped in 40 places, both pages are likely to be classified as thin or duplicate by Google.
- There's no internal linking architecture. Programmatic pages without entry points get crawl-deprioritized and de-indexed.
- The template lacks specificity. A template that says "We provide [service] in [city]. Call us for [service] in [city]." produces functionally identical pages that fail HCS.
See framework-hcs.md for the helpful content thresholds programmatic pages need to clear.
18.5 The Demand Validation Workflow
Before generating a programmatic matrix, validate demand:
- Pick 5 to 10 sample cells across the matrix range. Include high-population, mid-population, and low-population cells if location is an axis. Include common and uncommon combinations.
- Check search volume for each sample cell across Ahrefs, Semrush, and GSC if any data exists.
- If 70% or more sample cells show meaningful volume or strong intent signals, the matrix is viable.
- If fewer than 50% show demand, restructure: reduce one axis, combine cells, or kill the matrix.
- If results are mixed, generate the demand-validated subset only. Don't auto-generate the full matrix.
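The three outcomes of the validation workflow can be sketched as a single decision function over the sampled cells. Whether a cell "shows demand" is a manual judgment made upstream; the function name and return labels are illustrative.

```python
def matrix_decision(sample_has_demand: list[bool]) -> str:
    """Apply the 70% and 50% thresholds to a sample of matrix cells."""
    n = len(sample_has_demand)
    if n == 0:
        raise ValueError("sample at least one cell (the text suggests 5 to 10)")
    with_demand = sum(sample_has_demand)
    # Integer comparisons avoid floating-point edge cases at exactly 70% or 50%.
    if with_demand * 100 >= 70 * n:
        return "viable: generate the matrix"
    if with_demand * 100 < 50 * n:
        return "restructure: reduce an axis, combine cells, or kill the matrix"
    return "mixed: generate the demand-validated subset only"


# Hypothetical sample: 6 of 10 cells showed meaningful volume or intent.
sample = [True, True, False, True, False, True, True, False, True, False]
decision = matrix_decision(sample)
print(decision)
```

A 6-of-10 sample lands in the mixed band, so only the validated cells would be generated.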
18.6 Per-Cell Differentiation Requirements
Each programmatic page needs three differentiation layers minimum:
- Variable-specific data (the city, the integration, the condition)
- Local or contextual color (a real example, a local statistic, an industry-specific application)
- Internal linking variation (links to related cells, parent category, sibling pages)
Pages that meet only the first layer are thin. Pages with all three layers can pass HCS thresholds and rank.
19. Competitor Keyword Gap Analysis 2026
Gap analysis identifies queries competitors rank for that you don't. In the AI Overview era, the gap matrix needs additional dimensions.
19.1 Standard Gap Tools
Ahrefs Content Gap
Input: 1 to 3 competitor domains plus your domain. Output: queries where competitors rank in top 10 and you don't rank in top 100. Sortable by volume, difficulty, intent.
Best use: standard ranking-gap analysis. The output is a candidate query list.
Semrush Keyword Gap
Same function as Ahrefs Content Gap with a different data set. Often surfaces different queries than Ahrefs because the underlying SERP databases differ. Running both and merging the outputs catches more candidates than either alone.
Moz Keyword Explorer
Less comprehensive but offers the Priority score, which blends volume, difficulty, organic CTR, and your own My Score rating into a single metric. Useful when you need a one-number prioritization for a presentation.
19.2 The AI Overview Gap Matrix
Beyond ranking gaps, the 2026 matrix tracks:
AIO Citation Gap
Queries where competitors are cited in AIO panels and you are not. Different from ranking gap because a competitor might rank #4 but be cited in AIO, while a different competitor ranks #1 but is not cited.
Method: pull 50 priority queries. Manually inspect AIO citations across each. Tag each query with: competitor cited, you cited, neither, both. The "competitor cited, you not cited" cell is the AIO citation gap.
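The four-way tagging in the method above can be sketched as a set comparison per query. The domains and queries below are placeholders, and the cited-domain sets would come from manual AIO inspection.

```python
from collections import Counter


def citation_tag(cited: set[str], own: str, competitors: set[str]) -> str:
    """Tag one query by who appears among its AIO citations."""
    you_cited = own in cited
    competitor_cited = bool(cited & competitors)
    if you_cited and competitor_cited:
        return "both"
    if competitor_cited:
        return "competitor cited"
    if you_cited:
        return "you cited"
    return "neither"


# Hypothetical inspection results: query -> domains cited in its AIO panel.
aio_citations = {
    "aeo best practices": {"competitor-a.example", "news.example"},
    "what is aeo": {"yoursite.example", "competitor-b.example"},
    "aeo tools": {"yoursite.example"},
    "aeo roi": set(),
}
own = "yoursite.example"
competitors = {"competitor-a.example", "competitor-b.example"}

tags = {q: citation_tag(cited, own, competitors) for q, cited in aio_citations.items()}
citation_gap = [q for q, tag in tags.items() if tag == "competitor cited"]
summary = Counter(tags.values())
print(citation_gap, dict(summary))
```

The `citation_gap` list is the "competitor cited, you not cited" cell; the summary counts show how the rest of the sample splits.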
Cluster Coverage Gap
Queries where competitors have multiple cluster pages and you have one or none. This reveals topical authority gaps even when individual rankings look comparable.
Method: for each cluster in your strategy, count the pages each competitor publishes that target queries in the cluster. If a competitor has 12 pages on a cluster and you have 4, the cluster coverage gap is 8 pages, and the cluster is the strategic priority regardless of which individual queries you target first.
Schema and Entity Gap
Queries where competitors have richer schema markup or entity coverage. Less easy to surface from gap tools directly. Method: inspect competitor pages for top 20 priority queries, catalog the schemas used, identify schemas you don't use.
Velocity Gap
Queries where competitors are publishing new pages or updating existing ones at high cadence. A competitor publishing weekly on a topic is signaling intent to dominate. Velocity is measured via competitor sitemap or blog inspection.
19.3 The Combined Gap Score
For each candidate query, build a combined gap score:
query: "AEO best practices"
ranking_gap: 6 # competitor positions 3, 5, 8; you position 47
aio_citation_gap: 1 # competitor cited, you not
cluster_coverage_gap: 1 # competitor has 9 cluster pages, you have 2
schema_gap: 1 # competitor uses Article+FAQPage+HowTo, you use Article only
velocity_gap: 1 # competitor updated last month, your page is 18 months old
combined: 10
Sort candidate queries by combined gap score. Highest scores are the priority workload.
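The combined score in the example is an unweighted sum of the five gap components (6 + 1 + 1 + 1 + 1 = 10). A minimal sketch of computing and sorting it; the text does not state how the ranking gap value itself is derived, so it is taken as an input, and the second and third rows are made up.

```python
from dataclasses import dataclass


@dataclass
class GapRow:
    query: str
    ranking_gap: int
    aio_citation_gap: int
    cluster_coverage_gap: int
    schema_gap: int
    velocity_gap: int

    @property
    def combined(self) -> int:
        """Unweighted sum of the five components, as in the example above."""
        return (self.ranking_gap + self.aio_citation_gap
                + self.cluster_coverage_gap + self.schema_gap + self.velocity_gap)


rows = [
    GapRow("AEO tools", 3, 0, 1, 0, 1),           # hypothetical
    GapRow("AEO best practices", 6, 1, 1, 1, 1),  # the example from the text
    GapRow("AEO vs SEO", 4, 1, 0, 1, 0),          # hypothetical
]

priority = sorted(rows, key=lambda row: row.combined, reverse=True)
for row in priority:
    print(f"{row.combined:>3}  {row.query}")
```

An unweighted sum lets the ranking gap dominate because it has the widest range; weighting the binary components is a reasonable variation if AIO citation matters more for the portfolio.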
19.4 The Decision Matrix
For each high-gap query, decide:
- Build a new dedicated page (the gap is severe and the query justifies a standalone target)
- Expand an existing page to absorb the query (the gap is moderate and an existing page is close)
- Build a supporting cluster page that links to existing target (the gap is cluster-level rather than page-level)
- Accept the gap (the competitor's structural advantage makes the gap uneconomical to close)
The acceptance option is real. Some gaps require capabilities you don't have or investment that won't return. Skip those explicitly rather than pretending to address them.
20. Audit Mode
Three-tier audit rubric: per-keyword-research-project (15 items), site-wide keyword strategy (10 items), first 90 days subset (5 items).
20.1 Per Keyword Research Project
| # | Criterion | Pass/Fail |
|---|---|---|
| KP1 | Seed keywords documented with rationale | |
| KP2 | Keyword expansion from at least 3 sources (tools, SERP, competitors) | |
| KP3 | Intent classified for each candidate query | |
| KP4 | SERP composition verified for top 25 priority queries | |
| KP5 | AIO presence checked for top 25 priority queries | |
| KP6 | Long-tail subset extracted and AIO-tagged | |
| KP7 | Branded keyword set defended with owned pages | |
| KP8 | Zero-volume queries reviewed with hypothesis tagging | |
| KP9 | Sub-query fan-out documented for top 10 priority queries | |
| KP10 | Question keywords mined from AlsoAsked or equivalent | |
| KP11 | PAA tree mapped to depth 3 for pillar topics | |
| KP12 | Semantic clusters generated via SERP-overlap method | |
| KP13 | Cluster cannibalization checked across the strategy | |
| KP14 | Competitor gap matrix run with AIO citation column | |
| KP15 | Final keyword map approved and version-stamped | |
Score: 15. World-class: 14+/15.
20.2 Site-Wide Keyword Strategy
| # | Criterion | Pass/Fail |
|---|---|---|
| KS1 | Documented keyword strategy exists and is current | |
| KS2 | Primary keywords mapped per important page | |
| KS3 | No detected keyword cannibalization or active resolution plan | |
| KS4 | Topic clusters defined with pillar and supporting pages | |
| KS5 | Branded vs non-branded ratio measured and appropriate for stage | |
| KS6 | AIO query classification applied (dominant, occasional, absent) | |
| KS7 | Sub-query coverage tracked at cluster level | |
| KS8 | Programmatic matrix demand-validated before generation | |
| KS9 | GSC monitored monthly for query performance and drift | |
| KS10 | Annual intent reassessment scheduled and last completed within 12 months | |
Score: 10. World-class: 9+/10.
20.3 First 90 Days
| # | Criterion | Pass/Fail |
|---|---|---|
| KF1 | Seed keywords and competitor list documented | |
| KF2 | Top 50 priority queries identified with intent classified | |
| KF3 | AIO presence checked manually on top 50 priority queries | |
| KF4 | Keyword map for top 25 pages in place | |
| KF5 | Cannibalization audit run with at least one resolution applied | |
Score: 5. World-class: 5/5.
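Scoring the three rubrics can be sketched as a pass count checked against each tier's world-class bar. The item counts and thresholds come from the tables above; the dictionary layout and function name are illustrative.

```python
# tier prefix -> (number of items, minimum passes for world-class)
WORLD_CLASS = {"KP": (15, 14), "KS": (10, 9), "KF": (5, 5)}


def score_tier(tier: str, results: dict[str, bool]) -> tuple[int, bool]:
    """Return (passes, is_world_class) for one audit tier."""
    items, minimum = WORLD_CLASS[tier]
    expected = {f"{tier}{i}" for i in range(1, items + 1)}
    if set(results) != expected:
        raise ValueError(f"results must cover exactly {tier}1..{tier}{items}")
    passed = sum(results.values())
    return passed, passed >= minimum


# Hypothetical first-90-days audit: everything passes except the cannibalization audit.
kf_results = {"KF1": True, "KF2": True, "KF3": True, "KF4": True, "KF5": False}
kf_score, kf_world_class = score_tier("KF", kf_results)
print(f"First 90 days: {kf_score}/5, world-class: {kf_world_class}")
```

Requiring a result for every criterion keeps an unanswered item from silently counting as a pass or a fail.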
21. Common Mistakes
- Optimizing for high volume regardless of intent — ranking for queries you can't convert wastes effort
- Ignoring search intent — content type doesn't match what users want
- Single-keyword focus instead of topic — modern SEO favors topical depth
- Keyword cannibalization unaddressed — multiple pages competing for same query
- No keyword mapping — content created without strategic alignment
- Volume tool numbers treated as exact — they're approximations
- Ignoring branded keywords — easy wins missed
- No long-tail strategy — depending only on head terms
- No SERP feature targeting — missing snippets, PAAs, AI Overviews
- Static keyword strategy — not updating as queries evolve
- Treating AIO-dominant queries like conventional ranking targets, when citation is the metric that matters for them
- Generating programmatic pages without demand validation, which produces thin pages at scale
- Discarding zero-volume queries reflexively, when many are early-stage trends or specialty terms with real demand
- Ignoring sub-query fan-out, when pages earn AIO citations by answering the sub-queries rather than matching the parent string
- Skipping annual intent reassessment, which lets intent drift erode rankings silently
22. Maintenance
Monthly: GSC query review. New queries to target. Ranking changes. AIO citation share tracking on priority queries.
Quarterly: Comprehensive keyword strategy review. Cannibalization audit. New cluster opportunities. Cluster coverage gap analysis against top 2 competitors.
Annually: Strategic keyword research refresh. Competitive landscape analysis. Full intent reassessment on top 50 priority queries per Section 12.3. Sub-query fan-out re-mapped for pillar clusters.
23. Quick Validation Script
For a quick site-wide audit of keyword coverage on a static site, the following bash script walks /var/www/sites/[domain]/ and flags pages missing primary keyword markers:
#!/bin/bash
# Pass the site's directory name under /var/www/sites/ as the first argument.
SITE="$1"
ROOT="/var/www/sites/${SITE}"
if [ -z "$SITE" ] || [ ! -d "$ROOT" ]; then
    echo "Site root not found: $ROOT" >&2
    exit 1
fi

echo "Pages missing H1:"
find "$ROOT" -name "*.html" -type f | while IFS= read -r f; do
    # -i also matches uppercase <H1>
    if ! grep -qi "<h1" "$f"; then
        echo "  $f"
    fi
done
echo ""
echo "Pages missing title tag:"
find "$ROOT" -name "*.html" -type f | while IFS= read -r f; do
    # Match <title> with or without attributes, in any case
    if ! grep -qiE "<title[ >]" "$f"; then
        echo "  $f"
    fi
done
echo ""
echo "Pages with duplicate H1 candidates:"
find "$ROOT" -name "*.html" -type f | while IFS= read -r f; do
    # grep -o counts every <h1, including several on one line (grep -c counts lines)
    count=$(grep -oi "<h1" "$f" | wc -l | tr -d ' ')
    if [ "$count" -gt 1 ]; then
        echo "  $f ($count H1 tags)"
    fi
done
This script is a starting point for the keyword mapping audit. Multiple H1 tags often indicate template issues that interfere with primary keyword signaling. A missing title tag leaves Google to generate the title link itself, so the page gives up its strongest on-page keyword signal.
Companion documents:
- framework-internallinking.md: topic clusters require internal linking strategy
- framework-topicalauthority.md: cluster completeness scoring and topical authority systems
- framework-featuredsnippets.md: on-page optimization for question and PAA queries
- framework-aioverviews.md: AIO presence tracking and citation optimization
- framework-aicitations.md: SERP features in AI era
- framework-hcs.md: topic depth matters for Helpful Content; programmatic pages must clear HCS thresholds
- framework-localseo.md: local intent specifics
- framework-saas-seo.md: programmatic keyword strategy for SaaS contexts