SEO & AI Engine Optimization Framework · May 2026

Internal Linking: hub-and-spoke architecture, anchor text, crawl depth

Hub-and-Spoke Architecture, Anchor Text Discipline, Topical Clusters, Crawl Depth, Orphan Detection, and Link Equity Distribution

A comprehensive installation and audit reference for internal linking — the discipline of using on-site links to communicate site architecture to crawlers, distribute ranking signals across pages, and guide users through topical depth. Internal linking is one of the highest-ROI SEO levers because it costs nothing per link added and compounds across the entire site. Dual-purpose: installation manual and audit document.


1. Document Purpose

This is the canonical reference for internal linking. Most sites have content that ranks. Most sites have technical SEO that works. Few sites have internal linking that does what it should. The gap shows up as orphan pages, shallow content, ranking signals trapped on the homepage, and topical clusters that exist on paper but not in the link graph.

In 2026, internal linking has become more important, not less. AI search engines parse internal link patterns to understand topical authority. Google's crawl-budget allocation favors well-linked pages. Topical-cluster ranking — a page ranks when its surrounding cluster ranks — is now a measurable phenomenon. Sites that use internal linking strategically outrank sites with better content but flatter link graphs.

1.1 Required Tools

1.2 Document Scope

Covers: site architecture patterns, hub-and-spoke / topical-cluster organization, anchor text discipline, crawl depth, orphan detection, link equity distribution, contextual vs navigational linking, breadcrumbs, faceted navigation handling, and pagination patterns. Touches but does not exhaust: keyword research and topic mapping (framework-keywordresearch.md), schema's BreadcrumbList (framework-schema.md), navigation UX (framework-uxseo.md).


2. Client Variables Intake

domain: ""
total_pages_indexed: 0
content_taxonomy_documented: false
hub_pages_identified: []
known_orphan_pages: []
existing_internal_link_strategy: ""    # describe or "none"
sitemap_url_count: 0
crawl_depth_max_observed: 0            # from Sitebulb / Screaming Frog
average_internal_links_per_page: 0
top_traffic_pages: []
top_conversion_pages: []
known_cannibalization: []              # cross-reference framework-keywordresearch.md

3. The Three Functions of Internal Links

Every internal link does three things at once:

  1. Architectural — tells crawlers and users that the linked page exists and matters.
  2. Topical — tells crawlers what the linked page is about (via anchor text and surrounding content).
  3. Equity — passes a portion of the source page's ranking signals to the target.

A weak link does one. A strong link does all three.


4. Site Architecture Patterns

4.1 The Hub-and-Spoke Model

The dominant architecture pattern for content sites and most service businesses.

                  Homepage (root hub)
                  /         |         \
              Hub A      Hub B       Hub C
              /  \         |         /  \
           Sub  Sub     Sub Sub    Sub  Sub
            \    \       |   /     /    /
             [back-links to hub from each spoke]

The pattern:

Why it works:

4.2 The Mesh Model

For wikis, knowledge bases, and densely interconnected content, a mesh works better than strict hubs.

Wikipedia is the canonical mesh. It's appropriate for sites with hundreds of densely related entities. Most agency clients should use hub-and-spoke instead.

4.3 The Catalog Model (Ecommerce)

For ecommerce:

Cross-reference: framework-ecommerceseo.md.

4.4 The Editorial Model (Publishers)

For news and editorial sites:

Cross-reference: framework-newsseo.md.


5. Topical Clusters

A topical cluster is a hub page plus all the spokes that cover the topic. Clusters are how modern sites compete for broad topics they couldn't dominate with a single page.

5.1 Cluster Anatomy

For a cluster on "Local SEO":

HUB: /topics/local-seo/  (pillar page, comprehensive overview)

SPOKES:
  /topics/local-seo/google-business-profile/
  /topics/local-seo/local-citations/
  /topics/local-seo/local-pack-ranking/
  /topics/local-seo/review-management/
  /topics/local-seo/local-link-building/
  /topics/local-seo/local-schema/
  /topics/local-seo/local-content-strategy/
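The cluster contract (the hub links to every spoke, and every spoke links back to the hub) can be checked mechanically once a crawl export is reduced to a link graph. The following is a minimal sketch, assuming a hypothetical `links` mapping from each URL to the set of URLs it links to; the function name and data shape are illustrative and not part of any crawler's API.

```python
def check_cluster(hub, spokes, links):
    """Return (spokes the hub fails to link to, spokes that fail to link back)."""
    hub_targets = links.get(hub, set())
    missing_from_hub = sorted(s for s in spokes if s not in hub_targets)
    missing_backlink = sorted(s for s in spokes if hub not in links.get(s, set()))
    return missing_from_hub, missing_backlink


# Hypothetical link graph: page URL -> set of internal URLs it links to.
links = {
    "/topics/local-seo/": {
        "/topics/local-seo/google-business-profile/",
        "/topics/local-seo/local-citations/",
    },
    "/topics/local-seo/google-business-profile/": {"/topics/local-seo/"},
    "/topics/local-seo/local-citations/": set(),
    "/topics/local-seo/review-management/": {"/topics/local-seo/"},
}
spokes = [url for url in links if url != "/topics/local-seo/"]
missing_from_hub, missing_backlink = check_cluster("/topics/local-seo/", spokes, links)
print("Hub does not link to:", missing_from_hub)
print("No link back to hub:", missing_backlink)
```

Both lists should be empty for a healthy cluster; anything reported is a candidate for a new contextual link.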

Each spoke:

The hub:

5.2 Cluster Sizing

Cluster size    Strategy
3-5 spokes      Minimum viable cluster. Common starting point.
6-10 spokes     Healthy cluster. Most competitive topics need this depth.
11-20 spokes    Authoritative cluster. Often the bar for ranking head terms.
20+ spokes      Pillar treatment. Reserved for category-defining hubs.

5.3 Cluster Identification

For existing sites:

  1. Inventory all content pages.
  2. Group by topical theme.
  3. Identify the strongest existing piece per theme as the candidate hub.
  4. Document the cluster gap (sub-topics not yet covered).
  5. Decide: build out the cluster, or consolidate into fewer pages?

For new sites:

  1. Start with keyword research (framework-keywordresearch.md).
  2. Identify head term per cluster.
  3. Decide on hub URL and 5-10 starting spokes.
  4. Build hub first; spokes can roll out over weeks.

6. Anchor Text Discipline

Anchor text — the visible link text — is one of the strongest topical signals to Google.

6.1 Anchor Text Types

Type           Example                                                    When to use
Exact match    <a href="/local-seo/">local SEO</a>                        Sparingly; risk of over-optimization
Partial match  <a href="/local-seo/">our guide to local SEO</a>           Most common; safest
Branded        <a href="/local-seo/">read more on ThatDeveloperGuy</a>    Brand-building; less topical signal
Generic        <a href="/local-seo/">click here</a>                       Avoid. Wastes the topical signal.
Naked URL      <a href="/local-seo/">thatdeveloperguy.com/local-seo/</a>  Avoid for internal links; fine for citations
Image          image with alt="local SEO guide"                           Alt text serves as anchor; use descriptive alt
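To audit anchors at scale, the types above can be approximated with a few string rules. This is a rough sketch only: the generic-phrase list, the URL pattern, and the `classify_anchor` helper are assumptions made for illustration, and real anchors will need site-specific tuning. For image links, pass the alt text as the anchor.

```python
import re
from collections import Counter

# Assumed list of generic phrases; extend it to match the site's own habits.
GENERIC = {"click here", "read more", "learn more", "here"}
# Loose check for anchors that are just a URL or a bare domain/path.
URL_LIKE = re.compile(r"^(https?://|www\.)\S+$|^[\w-]+(\.[\w-]+)+(/\S*)?$")


def classify_anchor(text, keyword, brand):
    """Label one anchor against the target page's keyword and the brand name."""
    t = " ".join(text.lower().split())
    keyword, brand = keyword.lower(), brand.lower()
    if t in GENERIC:
        return "generic"
    if URL_LIKE.match(t):
        return "naked_url"
    if t == keyword:
        return "exact"
    if keyword in t:
        return "partial"
    if brand in t:
        return "branded"
    return "other"


# Anchors found pointing at /local-seo/ (illustrative data).
anchors = [
    "local SEO",
    "our guide to local SEO",
    "read more on ThatDeveloperGuy",
    "click here",
    "thatdeveloperguy.com/local-seo/",
    "local SEO checklist",
]
distribution = Counter(
    classify_anchor(a, "local seo", "thatdeveloperguy") for a in anchors
)
print(distribution)
```

Running the counter per target page shows whether one type dominates the anchors that point at it.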

6.2 Anchor Text Patterns That Hurt

6.3 Healthy Anchor Distribution

For a typical hub page with many internal pointers, anchor text should vary naturally:

This distribution emerges naturally from good editorial writing. It rarely emerges from mass-produced internal-link automation.


7. Crawl Depth

Crawl depth is the number of clicks from the homepage to a given page. It is one of the strongest predictors of which pages Google indexes and ranks.

7.1 Targets

Site size                     Maximum acceptable crawl depth
Small (under 100 pages)       3 clicks
Medium (100-1,000 pages)      4 clicks
Large (1,000-10,000 pages)    5 clicks
Massive (10,000+ pages)       6 clicks (with strong sitemap support)

Pages buried deeper rarely accumulate link equity, rarely get crawled frequently, and rarely rank.

7.2 Crawl Depth vs URL Path Depth

These are different. URL path depth (/a/b/c/d/page/ = depth 5) is independent of crawl depth (clicks from the homepage). A page at /a/b/c/d/page/ is crawl-depth 1 if it is linked directly from the homepage. A page at /page/ can be crawl-depth 6 if buried behind paginated archives.

Optimize for crawl depth. URL depth doesn't matter to crawlers.
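Crawl depth is the shortest click path from the homepage, which is what a breadth-first search over the internal link graph computes. A minimal sketch, assuming a hypothetical `links` mapping of page to outgoing internal links (for example, built from a crawler export); the URLs are illustrative.

```python
from collections import deque


def crawl_depths(links, home="/"):
    """Breadth-first search: minimum number of clicks from the homepage."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def url_path_depth(path):
    """Number of non-empty path segments in the URL."""
    return len([segment for segment in path.split("/") if segment])


# Hypothetical link graph: page -> pages it links to.
links = {
    "/": ["/a/b/c/d/page/", "/blog/"],
    "/blog/": ["/blog/page/2/"],
    "/blog/page/2/": ["/page/"],
}
depths = crawl_depths(links)
for url, depth in depths.items():
    print(f"{url}: crawl depth {depth}, URL path depth {url_path_depth(url)}")
```

In this example the deeply nested URL sits one click from the homepage, while the short /page/ URL is three clicks away. Any known URL missing from `depths` is unreachable by links from the homepage.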

7.3 Reducing Crawl Depth

Common pages buried too deep:

Fixes:


8. Orphan Pages

An orphan page is one with zero internal links pointing to it. Orphans:

8.1 Detection

Screaming Frog method:

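  1. Connect a second URL source (XML sitemap, Google Analytics, or Search Console); a link-following crawl alone cannot reach pages with no inbound links
  2. Crawl the site
  3. Crawl Analysis → Configure → confirm the orphan-related items are enabled
  4. Run Crawl Analysis after main crawl completes
  5. Reports → Orphan Pages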

Sitebulb method: Auto-detects in standard reports.

Ahrefs / Semrush: Site audit reports orphans.

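XML sitemap cross-reference: Compare sitemap URLs against URLs found via crawl. URLs that appear in the sitemap but were never reached by the crawl are orphans. Pages missing from the sitemap will not surface this way.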
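The sitemap cross-reference is a set difference. A minimal sketch, assuming both URL lists have already been exported (the example.com URLs are placeholders) and that trailing slashes and fragments are the only normalization needed, which will not hold for every site.

```python
def normalize(url):
    """Drop fragments and trailing slashes so trivially different forms compare equal."""
    url = url.split("#", 1)[0]
    return url.rstrip("/") or "/"


def find_orphans(sitemap_urls, crawled_urls):
    """URLs listed in the sitemap that the link-following crawl never reached."""
    in_sitemap = {normalize(u) for u in sitemap_urls}
    reached = {normalize(u) for u in crawled_urls}
    return sorted(in_sitemap - reached)


sitemap_urls = [
    "https://example.com/",
    "https://example.com/services/",
    "https://example.com/old-landing/",
]
crawled_urls = [
    "https://example.com",
    "https://example.com/services",
]
orphans = find_orphans(sitemap_urls, crawled_urls)
print(orphans)
```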

8.2 Resolution

For each orphan, decide:

8.3 Prevention


9. The Link Equity Lens

Link equity (informally "link juice") is the ranking-signal capital a page accumulates from internal and external links. Internal linking redistributes equity within the site.

9.1 PageRank Distribution Logic (Conceptual)

Every page has some equity. It distributes equity to all the pages it links to (divided by the number of outgoing links). The pages that receive the most equity rank best.
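The split-by-outgoing-links idea can be made concrete with the classic simplified PageRank iteration. This sketch illustrates the concept only and makes no claim about how any search engine computes rankings today; the damping factor of 0.85, the iteration count, and the example graph are assumptions.

```python
def distribute_equity(links, damping=0.85, iterations=50):
    """Simplified PageRank: each page splits its score evenly across its outlinks."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1 - damping) / n for page in pages}
        for page in pages:
            # Pages with no outlinks share their score with every page.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: the hub is linked from the homepage and from both spokes.
links = {
    "/": ["/hub/", "/about/"],
    "/about/": ["/"],
    "/hub/": ["/hub/spoke-1/", "/hub/spoke-2/"],
    "/hub/spoke-1/": ["/hub/"],
    "/hub/spoke-2/": ["/hub/"],
}
rank = distribute_equity(links)
for page, score in sorted(rank.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the hub ends up with the largest share because both spokes send their full score back to it.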

Implications:

9.2 The 100-Link Heuristic

Google's early webmaster guidelines recommended keeping a page to fewer than about 100 links, a soft maximum that was later dropped. The modern reality: there is no hard limit, but pages with 200+ outbound links pass diluted equity.

For a typical content page, target:

A page with 50+ outbound links should be questioned.

9.3 nofollow on Internal Links

Don't nofollow internal links unless there's a specific reason (e.g., blocking infinite-spawn faceted URLs from crawl). Old SEO advice to nofollow login or registration links is outdated — Google handles those fine without intervention.

9.4 Strategic Equity Concentration

Identify the 5-10 pages on the site that should rank highest (commercial pages, top-converting pages, hub pages). For these, intentionally:

Equity flow is editable through linking decisions.
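One way to see where equity is currently concentrated is to count how many distinct pages link to each priority page. A minimal sketch, assuming a hypothetical `links` mapping from a crawl export; the default threshold of 20 mirrors audit criterion IL18 and can be changed.

```python
from collections import Counter


def inbound_counts(links):
    """Count distinct linking pages per target, ignoring self-links."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def under_linked(priority_pages, links, minimum=20):
    """Priority pages with fewer inbound internal links than the minimum."""
    counts = inbound_counts(links)
    return {page: counts[page] for page in priority_pages if counts[page] < minimum}


# Hypothetical link graph: page -> pages it links to.
links = {
    "/": ["/services/", "/blog/"],
    "/blog/": ["/blog/post-a/", "/blog/post-b/", "/services/"],
    "/blog/post-a/": ["/services/", "/blog/"],
    "/blog/post-b/": ["/blog/"],
}
gaps = under_linked(["/services/", "/blog/post-a/"], links, minimum=3)
print(gaps)
```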


10. Navigation Linking

The header, footer, and sidebar are the "navigation layer" — links that appear on every page (or every page in a section).

10.1 Header Navigation

Should contain links to:

Should NOT contain:

10.2 Footer Navigation

A natural place for:

The footer "supplies" link equity to a wider page set than the header. Use this deliberately.

10.3 Mega-Menus

Mega-menus (multi-column dropdowns with dozens of links) work well for ecommerce and large content sites — but the link count from every page is significant. For smaller sites, mega-menus dilute rather than help.

10.4 Breadcrumbs

Breadcrumbs add internal links (parent and grandparent pages) to every non-homepage. They:

Implement on every non-homepage. Always.


11. Contextual Linking

Contextual links are in-prose links within editorial content. They are the strongest internal links because:

11.1 The Editorial Discipline

Every editorial publish should include 3-5 contextual links to other pages on the site. These should appear:

This is not link manipulation. This is good editorial practice.

11.2 The "Updated Old Posts" Pattern

When publishing new content:

This is one of the highest-ROI SEO activities. Costs 15 minutes per publish; compounds across the site.

11.3 The "Linked Mentions" Audit

Spot-check old content quarterly. For every brand, product, or topic mention in old posts, verify it links to the appropriate destination page. Mentions without links are wasted internal-linking opportunities.
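The spot-check can be partly automated by scanning a page's HTML for term mentions that sit outside any link. A minimal sketch using only the standard library; the `unlinked_mentions` helper and the term list are illustrative assumptions. It also reads script and style text, so the input should be limited to the article body.

```python
from html.parser import HTMLParser


class UnlinkedText(HTMLParser):
    """Collect text that is not inside an <a> element."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth:
            self.anchor_depth -= 1

    def handle_data(self, data):
        if self.anchor_depth == 0:
            self.chunks.append(data)


def unlinked_mentions(html, terms):
    """Return the terms that appear in the page text outside of any link."""
    parser = UnlinkedText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).lower().split())
    return [term for term in terms if term.lower() in text]


html = (
    '<p>We cover <a href="/local-seo/">local SEO</a> '
    "and review management in depth.</p>"
)
found = unlinked_mentions(html, ["local SEO", "review management"])
print(found)
```

A term that is linked once and mentioned again as plain text elsewhere on the page is still reported, which is usually acceptable since not every mention needs a link.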


12. Faceted Navigation

Faceted navigation is the filter/sort UI common on ecommerce and listing sites. It generates URLs combinatorially:

/category/?color=red&size=m&brand=acme

12.1 The Crawl Budget Problem

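A faceted system with 5 facets and 5 options each generates 5^5 = 3,125 URL variations when every facet is set, and 6^5 - 1 = 7,775 filtered URLs once each facet may also be left unset. With 10 facets, the all-set count alone is 5^10, nearly 10 million. Crawlers can spend their entire budget on faceted URLs.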
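The arithmetic can be reproduced for any facet configuration. A minimal sketch, assuming one value per facet at a time and ignoring parameter order (which multiplies the count further when the site does not normalize it); the function name is illustrative.

```python
from math import prod


def facet_url_count(options_per_facet, allow_unset=True):
    """Number of distinct filtered URLs a facet system can generate."""
    if allow_unset:
        # Each facet has its options plus "not applied"; subtract the unfiltered page.
        return prod(n + 1 for n in options_per_facet) - 1
    # Every facet must carry exactly one value.
    return prod(options_per_facet)


print(facet_url_count([5] * 5, allow_unset=False))   # every facet set
print(facet_url_count([5] * 5))                      # facets may be left unset
print(facet_url_count([5] * 10, allow_unset=False))  # ten facets, all set
```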

12.2 Strategies

Block facet parameters in robots.txt:

User-agent: *
# The second * matches the parameter in any position, not only first
Disallow: /*?*color=
Disallow: /*?*size=
Disallow: /*?*sort=
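Parameter order matters for wildcard rules: a pattern such as /*?color= only matches when color is the first query parameter. The sketch below approximates robots.txt wildcard matching ('*' matches any run of characters, a trailing '$' anchors the end of the URL) so rules can be tested against sample paths before deployment. It is a simplified model written for illustration, not a full robots.txt parser.

```python
import re


def robots_pattern_matches(pattern, path):
    """Approximate robots.txt path matching with '*' and a trailing '$'."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and turn each '*' into "any run of characters".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # Rules are matched from the start of the path.
    return re.match(regex, path) is not None


samples = ["/category/?color=red", "/category/?size=m&color=red", "/category/"]
for rule in ["/*?color=", "/*?*color="]:
    for path in samples:
        print(f"{rule!r} vs {path!r}: {robots_pattern_matches(rule, path)}")
```

The broader pattern also matches any parameter whose name merely ends in "color", so check it against real URLs from the site before relying on it.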

Canonical to parameterless version:

Each faceted URL canonicals to the unfiltered category page.

noindex on facet combinations:

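<meta name="robots" content="noindex, follow"> on faceted URLs. They get crawled but not indexed. Like the canonical approach, this only works if the same URLs are not blocked in robots.txt, because a crawler that cannot fetch the page never sees the tag.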

Whitelist facet combinations that should rank:

Some facet combinations are valuable landing pages ("red running shoes"). Whitelist these as canonical, indexable, and present in the sitemap. Block the rest.

12.3 Internal Linking with Facets

Cross-reference: framework-ecommerceseo.md.


13. Pagination

Paginated archives (/page/2/, /page/3/) need careful handling.

13.1 The rel=next/prev Pattern (Deprecated)

Google confirmed in 2019 that it no longer uses rel=next/prev as an indexing signal. The link tags can still be present, but Google ignores them.

13.2 Modern Pagination Patterns

Pattern 1: Self-canonical paginated pages

Every paginated page self-canonicals (page 2 canonicals to itself). All paginated pages may be indexed. Best when each page has substantively different content.

Pattern 2: Canonical to page 1

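Pages 2, 3, 4, and so on canonical to page 1. Only page 1 is indexed. Best when paginated pages are pure listing pagination with no unique value. Use with caution: Google's pagination guidance recommends against canonicalizing to the first page, and items linked only from deeper pages can lose their crawl path.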

Pattern 3: noindex paginated pages

Page 2+ are noindex,follow. They pass internal links but don't appear in search. Best when you want crawl access but no SERP presence.
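The three patterns differ only in the canonical URL and robots meta value emitted per page, which a template helper can centralize. A minimal sketch, assuming a /page/N/ URL scheme like the one above; the pattern names and the function are hypothetical. The source does not specify a canonical for noindexed pages, so none is emitted for them here.

```python
def pagination_tags(base_url, page, pattern):
    """Return (canonical_url, robots_meta) for page N of a paginated archive."""
    page_url = base_url if page == 1 else f"{base_url}page/{page}/"
    if pattern == "self_canonical":
        return page_url, "index, follow"
    if pattern == "canonical_to_first":
        return base_url, "index, follow"
    if pattern == "noindex":
        if page == 1:
            return page_url, "index, follow"
        # Assumption: no canonical is emitted alongside noindex.
        return None, "noindex, follow"
    raise ValueError(f"unknown pattern: {pattern}")


base = "https://example.com/blog/"
for pattern in ("self_canonical", "canonical_to_first", "noindex"):
    print(pattern, pagination_tags(base, 3, pattern))
```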

13.3 Internal Linking with Pagination


14. Linking Hygiene

14.1 Broken Links

Internal 4xx links waste crawl budget, frustrate users, and signal site neglect. Audit:

Fix:

14.2 Redirect Chains in Internal Links

When an internal link points to a URL that 301-redirects, update the link to point directly at the final destination. Why:

Find with Screaming Frog → Redirect chain report.
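Once redirects are exported as a source-to-target map, each internal link can be resolved to its final destination in code. A minimal sketch, assuming a hypothetical `redirects` dictionary and an arbitrary hop limit of 10 as a guard against loops.

```python
def resolve_redirect(url, redirects, max_hops=10):
    """Follow a redirect map to the final URL; return (final_url, hops)."""
    seen = {url}
    hops = 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen or hops > max_hops:
            raise ValueError(f"redirect loop or chain too long at {url}")
        seen.add(url)
    return url, hops


# Hypothetical redirect map: old URL -> where it redirects.
redirects = {"/old/": "/older/", "/older/": "/new/"}
internal_links = ["/old/", "/older/", "/new/", "/about/"]

# Links that should be rewritten, mapped to their final destination.
to_update = {}
for link in internal_links:
    final, hops = resolve_redirect(link, redirects)
    if hops > 0:
        to_update[link] = final
print(to_update)
```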

14.3 Cross-Domain Internal Links

When a site spans multiple domains (multibrand portfolios, microsite networks), search engines treat the links between those domains as external. There is no internal link equity flow between separate domains, even if they're owned by the same entity.

For multibrand operations, decide deliberately:

Cross-reference: framework-multibrand.md (when built).


15. Audit Mode

#     Criterion                                                                           Pass/Fail
IL1   Site has documented topical taxonomy / cluster strategy
IL2   Hub pages identified for each major topic
IL3   Each spoke page links back to its hub at least twice
IL4   Each hub page links to all its spokes
IL5   Cross-cluster links used sparingly and topically
IL6   Crawl depth report shows zero pages over depth 3 (small sites) / 5 (large sites)
IL7   Zero orphan pages detected (Screaming Frog Crawl Analysis)
IL8   Anchor text varies naturally; no exact-match overuse
IL9   Anchor text contains zero "click here" / "read more" without context
IL10  Breadcrumbs implemented on every non-homepage with BreadcrumbList schema
IL11  Header navigation focused on top commercial + content hubs only
IL12  Footer provides supplemental link layer to important pages
IL13  Average outbound internal links per content page between 5 and 30
IL14  Zero broken internal links (4xx)
IL15  Zero internal links pointing to redirect chains
IL16  Faceted navigation strategy documented (block / canonical / whitelist)
IL17  Pagination strategy documented (self-canonical / canonical-to-1 / noindex)
IL18  Top 10 commercial pages each have 20+ inbound internal links
IL19  Top traffic pages link to conversion pages contextually
IL20  Editorial workflow includes inbound-link placement step
IL21  "Updated old posts" pattern practiced when new content publishes
IL22  nofollow not used on internal links, or every exception is documented
IL23  Cluster gap analysis completed in last 90 days
IL24  Internal link audit run in last 30 days
IL25  Cross-domain links treated correctly per multibrand strategy

Maximum score: 25. World-class: 23+/25.


16. Common Mistakes

  1. No clear hub pages. Every page is at the same level; nothing accumulates topical authority.
  2. Spokes that don't link back to the hub. Cluster signal lost.
  3. "Click here" / "read more" anchors. Wastes the topical signal on every internal link.
  4. Orphan pages. Indexed but invisible to the link graph.
  5. Crawl depth over 5. Important pages buried.
  6. Broken internal links unfixed. Cumulative crawl budget waste.
  7. Internal links to redirect chains. Same — extra hop.
  8. Mega-menu on a small site. Diluting equity on every page.
  9. Faceted URLs without robots / canonical strategy. Crawl budget exploded by combinatorial parameters.
  10. No breadcrumbs. Free internal-linking layer ignored.
  11. Editorial publishes new content without inbound link placement. Page launches as orphan.
  12. Old content never updated. Anchors mention things that should now link to newer pages.
  13. Anchor text identical on every internal link to a page. Looks manipulated; sometimes triggers algorithmic suppression.
  14. Footer linking to every page on the site. Equity diluted; signals devalued.
  15. Treating cross-domain links as internal. Equity does not flow.
  16. Infinite-scroll archives with no traditional link path. Pages discoverable only by JS-rendered scroll events.
  17. JavaScript-only navigation. Some crawlers see it; others don't.
  18. Cross-cluster spoke-to-spoke linking everywhere. Weakens cluster boundaries.

17. Maintenance

Weekly:

Monthly:

Quarterly:

Annually:


18. Companion Documents


Document version: 1.0 Last updated: 2026-05-05 Owner: Joseph W. Anady — ThatDeveloperGuy — SDVOSB
