Internal Linking: hub-and-spoke architecture, anchor text, crawl depth
Hub-and-Spoke Architecture, Anchor Text Discipline, Topical Clusters, Crawl Depth, Orphan Detection, and Link Equity Distribution
A comprehensive installation and audit reference for internal linking — the discipline of using on-site links to communicate site architecture to crawlers, distribute ranking signals across pages, and guide users through topical depth. Internal linking is one of the highest-ROI SEO levers because it costs nothing per link added and compounds across the entire site. Dual-purpose: installation manual and audit document.
1. Document Purpose
This is the canonical reference for internal linking. Most sites have content that ranks. Most sites have technical SEO that works. Few sites have internal linking that does what it should. The gap shows up as orphan pages, shallow content, ranking signals trapped on the homepage, and topical clusters that exist on paper but not in the link graph.
In 2026, internal linking has become more important, not less. AI search engines parse internal link patterns to understand topical authority. Google's crawl-budget allocation favors well-linked pages. Topical-cluster ranking — a page ranks when its surrounding cluster ranks — is now a measurable phenomenon. Sites that use internal linking strategically outrank sites with better content but flatter link graphs.
1.1 Required Tools
- Screaming Frog SEO Spider — desktop crawler; reveals link graph, orphan pages, anchor text
- Sitebulb — desktop crawler with stronger reporting on crawl depth and link counts
- Ahrefs Site Explorer — Internal Backlinks — sitewide internal link audit
- Ahrefs Site Audit — Internal Linking — detects orphan pages, broken links
- Semrush Site Audit — Internal Linking — alternative
- Google Search Console — Links report — Google's view of internal linking
- Link Whisper (WordPress) — automated internal-link suggestion
- Custom Python + BeautifulSoup — for programmatic link extraction at scale
- Graphviz / yEd / Gephi — for visualizing link graphs
- Spreadsheet (Sheets or Excel) — link inventory management
1.2 Document Scope
Covers: site architecture patterns, hub-and-spoke / topical-cluster organization, anchor text discipline, crawl depth, orphan detection, link equity distribution, contextual vs navigational linking, breadcrumbs, faceted navigation handling, and pagination patterns. Touches but does not exhaust: keyword research and topic mapping (framework-keywordresearch.md), schema's BreadcrumbList (framework-schema.md), navigation UX (framework-uxseo.md).
2. Client Variables Intake
domain: ""
total_pages_indexed: 0
content_taxonomy_documented: false
hub_pages_identified: []
known_orphan_pages: []
existing_internal_link_strategy: "" # describe or "none"
sitemap_url_count: 0
crawl_depth_max_observed: 0 # from Sitebulb / Screaming Frog
average_internal_links_per_page: 0
top_traffic_pages: []
top_conversion_pages: []
known_cannibalization: [] # cross-reference framework-keywordresearch.md
3. The Three Functions of Internal Links
Every internal link does three things at once:
- Architectural — tells crawlers and users that the linked page exists and matters.
- Topical — tells crawlers what the linked page is about (via anchor text and surrounding content).
- Equity — passes a portion of the source page's ranking signals to the target.
A weak link does one. A strong link does all three.
4. Site Architecture Patterns
4.1 The Hub-and-Spoke Model
The dominant architecture pattern for content sites and most service businesses.
                Homepage (root hub)
               /        |        \
           Hub A      Hub B      Hub C
           /   \        |        /   \
        Sub    Sub   Sub Sub   Sub    Sub
          \     \      |  /     /     /
        [back-links to hub from each spoke]
The pattern:
- Pillar / hub pages cover a broad topic comprehensively. They link out to all relevant sub-pages.
- Cluster / spoke pages cover narrow sub-topics in depth. They link back to their hub.
- Cross-cluster links are sparing and used only when topically warranted.
Why it works:
- Hubs accumulate link equity from spokes.
- Spokes signal topical depth around the hub.
- Crawlers see a clear topical structure.
- Users navigate naturally between depth levels.
4.2 The Mesh Model
For wikis, knowledge bases, and densely interconnected content, a mesh works better than strict hubs.
- Every page links to many topically related pages.
- No strict hierarchy.
- Depth comes from cross-references, not parent-child.
Wikipedia is the canonical mesh. It's appropriate for sites with hundreds of densely related entities. Most agency clients should use hub-and-spoke instead.
4.3 The Catalog Model (Ecommerce)
For ecommerce:
- Homepage → Category → Subcategory → Product
- Cross-links between related products
- Cross-links from category to comparison guides (cluster overlay)
Cross-reference: framework-ecommerceseo.md.
4.4 The Editorial Model (Publishers)
For news and editorial sites:
- Section → Article
- Articles link forward to follow-up coverage
- Articles link back to evergreen explainer pages
- "Related articles" widgets serve as soft hubs
Cross-reference: framework-newsseo.md.
5. Topical Clusters
A topical cluster is a hub page plus all the spokes that cover the topic. Clusters are how modern sites compete for broad topics they couldn't dominate with a single page.
5.1 Cluster Anatomy
For a cluster on "Local SEO":
HUB: /topics/local-seo/ (pillar page, comprehensive overview)
SPOKES:
/topics/local-seo/google-business-profile/
/topics/local-seo/local-citations/
/topics/local-seo/local-pack-ranking/
/topics/local-seo/review-management/
/topics/local-seo/local-link-building/
/topics/local-seo/local-schema/
/topics/local-seo/local-content-strategy/
Each spoke:
- Links back to the hub in the opening paragraph
- Links back to the hub in the conclusion
- Links to 2-3 sister spokes in topically appropriate places
- Does not link to spokes from unrelated clusters
The hub:
- Links to every spoke at least once
- Includes a structured "in this cluster" section listing spokes
- Updates as new spokes are added
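The hub and spoke rules above can be checked mechanically once a crawler export has been turned into a link graph. The following is a minimal sketch, assuming the graph is available as a dictionary mapping each page to the set of pages it links to. The function name `audit_cluster` and the sample data are hypothetical. It only checks that a link exists in each direction; it does not count whether a spoke links back twice.

```python
from typing import Dict, List, Set


def audit_cluster(hub: str, spokes: List[str], links: Dict[str, Set[str]]) -> List[str]:
    """Return human-readable violations of the hub-and-spoke rules."""
    problems = []
    hub_targets = links.get(hub, set())
    for spoke in spokes:
        # Rule: the hub links to every spoke at least once.
        if spoke not in hub_targets:
            problems.append(f"hub does not link to {spoke}")
        # Rule: every spoke links back to its hub.
        if hub not in links.get(spoke, set()):
            problems.append(f"{spoke} does not link back to hub")
    return problems


# Hypothetical two-spoke cluster with one spoke wired incorrectly.
HUB = "/topics/local-seo/"
SPOKES = [
    "/topics/local-seo/google-business-profile/",
    "/topics/local-seo/local-citations/",
]
LINKS = {
    HUB: {SPOKES[0]},              # hub is missing the citations spoke
    SPOKES[0]: {HUB, SPOKES[1]},   # healthy spoke
    SPOKES[1]: set(),              # spoke never links back
}

problems = audit_cluster(HUB, SPOKES, LINKS)
for problem in problems:
    print(problem)
```

Running this against each documented cluster after a crawl gives a quick pass/fail list for the cluster rules.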
5.2 Cluster Sizing
| Cluster size | Strategy |
|---|---|
| 3-5 spokes | Minimum viable cluster. Common starting point. |
| 6-10 spokes | Healthy cluster. Most competitive topics need this depth. |
| 11-20 spokes | Authoritative cluster. Often the bar for ranking head terms. |
| 20+ spokes | Pillar treatment. Reserved for category-defining hubs. |
5.3 Cluster Identification
For existing sites:
- Inventory all content pages.
- Group by topical theme.
- Identify the strongest existing piece per theme as the candidate hub.
- Document the cluster gap (sub-topics not yet covered).
- Decide: build out the cluster, or consolidate into fewer pages?
For new sites:
- Start with keyword research (framework-keywordresearch.md).
- Identify head term per cluster.
- Decide on hub URL and 5-10 starting spokes.
- Build hub first; spokes can roll out over weeks.
6. Anchor Text Discipline
Anchor text — the visible link text — is one of the strongest topical signals to Google.
6.1 Anchor Text Types
| Type | Example | When to use |
|---|---|---|
| Exact match | `<a href="/local-seo/">local SEO</a>` | Sparingly; risk of over-optimization |
| Partial match | `<a href="/local-seo/">our guide to local SEO</a>` | Most common; safest |
| Branded | `<a href="/local-seo/">read more on ThatDeveloperGuy</a>` | Brand-building; less topical signal |
| Generic | `<a href="/local-seo/">click here</a>` | Avoid. Wastes the topical signal. |
| Naked URL | `<a href="/local-seo/">thatdeveloperguy.com/local-seo/</a>` | Avoid for internal links; fine for citations |
| Image | image with `alt="local SEO guide"` | Alt text serves as anchor; use descriptive alt |
6.2 Anchor Text Patterns That Hurt
- "Click here", "read more", "learn more" — wasted signal. Especially bad on important hub links.
- The same exact-match anchor on every link to a page — over-optimization, looks unnatural.
- Mismatched anchor and target — anchor says "plumbing" but page is about "electrical work."
- Stuffed anchors: `<a>local SEO services Cassville Missouri SDVOSB</a>` reads as keyword stuffing.
6.3 Healthy Anchor Distribution
For a typical hub page with many internal pointers, anchor text should vary naturally:
- ~30% exact or partial-match topic anchors
- ~30% partial-match with surrounding context ("our guide to...", "more on...")
- ~20% branded or contextual ("we've covered this on the blog", "see our process")
- ~20% incidental (in-prose mentions where the anchor reflects the surrounding sentence)
This distribution emerges naturally from good editorial writing. It rarely emerges from mass-produced internal-link automation.
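To see where a page's inbound anchors actually fall, the anchor texts exported from a crawler can be bucketed and tallied. This is a rough sketch under stated assumptions: the bucket rules, the `GENERIC` phrase list, and the function names are hypothetical simplifications of the anchor types in 6.1, and substring matching will misclassify some anchors.

```python
from collections import Counter

# Assumed list of generic phrases; extend to suit the site.
GENERIC = {"click here", "read more", "learn more", "here"}


def classify_anchor(text: str, topic: str, brand: str) -> str:
    """Bucket one anchor text relative to the target page's topic and the brand."""
    t = " ".join(text.lower().split())  # normalize case and whitespace
    if t in GENERIC:
        return "generic"
    if t == topic.lower():
        return "exact"
    if topic.lower() in t:
        return "partial"
    if brand.lower() in t:
        return "branded"
    return "incidental"


def anchor_distribution(anchors, topic, brand):
    """Return the percentage share of each anchor bucket."""
    if not anchors:
        return {}
    counts = Counter(classify_anchor(a, topic, brand) for a in anchors)
    total = sum(counts.values())
    return {kind: round(100 * n / total) for kind, n in counts.items()}


anchors = [
    "local SEO",
    "our guide to local SEO",
    "click here",
    "read more on ThatDeveloperGuy",
    "how nearby rankings work",
]
distribution = anchor_distribution(anchors, topic="local SEO", brand="ThatDeveloperGuy")
print(distribution)
```

A large "generic" or "exact" share on an important hub is the signal to revisit the source pages by hand.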
7. Crawl Depth
Crawl depth is the number of clicks from the homepage to a given page. It is one of the strongest predictors of which pages Google indexes and ranks.
7.1 Targets
| Site size | Maximum acceptable crawl depth |
|---|---|
| Small (under 100 pages) | 3 clicks |
| Medium (100-1,000 pages) | 4 clicks |
| Large (1,000-10,000 pages) | 5 clicks |
| Massive (10,000+ pages) | 6 clicks (with strong sitemap support) |
Pages buried deeper rarely accumulate link equity, rarely get crawled frequently, and rarely rank.
7.2 Crawl Depth vs URL Path Depth
These are different. URL path depth (/a/b/c/d/page/ = depth 5) is unrelated to crawl depth (clicks from homepage). A page at /a/b/c/d/page/ is crawl-depth 1 if linked directly from the homepage. A page at /page/ can be crawl-depth 6 if buried behind paginated archives.
Optimize for crawl depth. URL depth doesn't matter to crawlers.
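The difference between the two measures is easy to demonstrate with a breadth-first search over the link graph. The sketch below assumes the graph is a dictionary of page to outbound links and counts the homepage as depth 0, so a page linked directly from the homepage is one click deep. The function names and the sample graph are hypothetical.

```python
from collections import deque


def crawl_depths(links, root="/"):
    """Breadth-first search: minimum number of clicks from root to each reachable page."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def url_path_depth(path):
    """Number of non-empty segments in the URL path."""
    return len([segment for segment in path.split("/") if segment])


# Hypothetical site: a deeply nested URL linked from the homepage,
# and a short URL reachable only through a paginated archive.
links = {
    "/": ["/a/b/c/d/page/", "/blog/"],
    "/blog/": ["/blog/page/2/"],
    "/blog/page/2/": ["/old-post/"],
}
depths = crawl_depths(links)
for page in ("/a/b/c/d/page/", "/old-post/"):
    print(page, "path depth:", url_path_depth(page), "crawl depth:", depths[page])
```

Pages that never appear in the returned dictionary are unreachable from the homepage through links, which is worth flagging separately.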
7.3 Reducing Crawl Depth
Common pages buried too deep:
- Old blog posts — accessible only via paginated archives at /page/2/, /page/3/, etc.
- Product variants — buried under category → subcategory → product → variant
- Deep category pages — buried under multi-level taxonomies
- Tag archives — typically dead ends
Fixes:
- Add hub pages that link directly to important deep content
- Create "Best of" / "Most popular" sections on the homepage or category pages
- Link from new content to high-value old content
- Build cluster hubs that surface deep content
- Use breadcrumbs (every breadcrumb-equipped page links straight to its parent and grandparent pages, which shortens click paths to those pages)
8. Orphan Pages
An orphan page is one with zero internal links pointing to it. Orphans:
- Do not get crawled (or are crawled rarely)
- Do not accumulate ranking signals
- Often do not rank at all
8.1 Detection
Screaming Frog method:
- Crawl the site
- Crawl Analysis → Configure → enable "Crawl Analysis"
- Run Crawl Analysis after main crawl completes
- Reports → Orphan Pages
Sitebulb method: Auto-detects in standard reports.
Ahrefs / Semrush: Site audit reports orphans.
XML sitemap cross-reference: Compare sitemap URLs against URLs found via crawl. Difference set = orphans.
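The sitemap cross-reference is a set difference. A minimal sketch, assuming both URL lists have already been exported (one from the XML sitemap, one from the crawler) and that the only normalization needed is stripping fragments and trailing slashes; `normalize` and `find_orphans` are hypothetical helper names.

```python
def normalize(url: str) -> str:
    """Strip fragments and trailing slashes so equivalent URLs compare equal."""
    return url.split("#", 1)[0].rstrip("/")


def find_orphans(sitemap_urls, crawled_urls):
    """Return sitemap URLs that the link-following crawl never reached."""
    crawled = {normalize(u) for u in crawled_urls}
    return sorted(u for u in sitemap_urls if normalize(u) not in crawled)


# Hypothetical exports.
sitemap_urls = [
    "https://example.com/",
    "https://example.com/services/",
    "https://example.com/old-landing-page/",
]
crawled_urls = [
    "https://example.com",
    "https://example.com/services",
]
orphans = find_orphans(sitemap_urls, crawled_urls)
print(orphans)
```

Real exports usually need more normalization (protocol, host casing, tracking parameters), so treat the result as a candidate list to verify rather than a final answer.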
8.2 Resolution
For each orphan, decide:
- Should this page exist? If not, 410 it.
- If yes, where should it be linked from? Identify the natural parent in the site structure.
- Add at least 3 inbound internal links before considering the orphan resolved.
8.3 Prevention
- Editorial workflow: never publish a page without identifying at least 2-3 places to link to it from.
- Hub-and-spoke discipline: every spoke is added to its hub when published.
- Internal-linking review at publish: spend 5 minutes after every publish placing inbound links.
9. The Link Equity Lens
Link equity (informally "link juice") is the ranking-signal capital a page accumulates from internal and external links. Internal linking redistributes equity within the site.
9.1 PageRank Distribution Logic (Conceptual)
Every page has some equity. It distributes equity to all the pages it links to (divided by the number of outgoing links). The pages that receive the most equity rank best.
Implications:
- The homepage has the most equity (highest external inbound).
- The homepage's equity is split across all of its outbound links, so each additional homepage link reduces the share the others receive.
- Hub pages accumulate equity by being linked from many sources within the site.
- A page with many outbound links passes less equity per link.
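The conceptual logic above corresponds to the classic simplified PageRank iteration. The sketch below is an illustration only, not Google's production ranking system. The 0.85 damping factor, the function name `distribute_equity`, and the sample graph are assumptions introduced for the example.

```python
def distribute_equity(links, damping=0.85, iterations=50):
    """Simplified PageRank: each page splits its score evenly across its outbound links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline regardless of inbound links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            # Pages with no outbound links spread their score across the whole site.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: one hub with two spokes, plus an About page.
site = {
    "/": ["/hub/", "/about/"],
    "/hub/": ["/", "/spoke-1/", "/spoke-2/"],
    "/spoke-1/": ["/hub/"],
    "/spoke-2/": ["/hub/"],
    "/about/": ["/"],
}
ranks = distribute_equity(site)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:12s} {score:.3f}")
```

In this toy graph the hub ends up with more equity than the About page even though both receive one homepage link, because the spokes link back to the hub.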
9.2 The 100-Link Heuristic
Google indicated decades ago that 100 links per page was a soft maximum. The modern reality: there is no hard limit, but pages with 200+ outbound links pass diluted equity.
For a typical content page, target:
- 3-7 contextual outbound links to other site pages
- Plus navigation links (header, footer, breadcrumbs)
A page with 50+ outbound links should be questioned.
9.3 nofollow on Internal Links
Don't nofollow internal links unless there's a specific reason (e.g., blocking infinite-spawn faceted URLs from crawl). Old SEO advice to nofollow login or registration links is outdated — Google handles those fine without intervention.
9.4 Strategic Equity Concentration
Identify the 5-10 pages on the site that should rank highest (commercial pages, top-converting pages, hub pages). For these, intentionally:
- Link from the homepage prominently
- Link from every related blog post
- Link from major content hubs
- Use varied descriptive anchor text
Equity flow is editable through linking decisions.
10. Navigation Linking
The header, footer, and sidebar are the "navigation layer" — links that appear on every page (or every page in a section).
10.1 Header Navigation
Should contain links to:
- Top-level commercial pages (Services, Pricing, Contact)
- Top-level content hubs (Blog, Resources)
- The single highest-priority CTA
Should NOT contain:
- Every section the site has (mega-menus only when truly needed)
- Promotional / temporary content
- Deep pages that should rank organically (a sitewide header link spends equity on them from every page, which is wasted)
10.2 Footer Navigation
A natural place for:
- Sitemap-style links to important pages
- Trust signals (about, contact, privacy, terms)
- Service pages organized logically
- Brand and credential links
The footer "supplies" link equity to a wider page set than the header. Use this deliberately.
10.3 Mega-Menus
Mega-menus (multi-column dropdowns with dozens of links) work well for ecommerce and large content sites — but the link count from every page is significant. For smaller sites, mega-menus dilute rather than help.
10.4 Breadcrumbs
Breadcrumbs add internal links (parent and grandparent pages) to every non-homepage. They:
- Reduce crawl depth (every breadcrumb-enabled page is closer to root)
- Add BreadcrumbList schema (cross-reference: framework-schema.md)
- Improve user navigation
- Display in SERPs as a navigation aid
Implement on every non-homepage. Always.
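As a rough illustration of the BreadcrumbList markup mentioned above, the following sketch builds the JSON-LD from a breadcrumb trail. The structure follows the schema.org BreadcrumbList and ListItem types; the function name `breadcrumb_jsonld`, the example.com origin, and the trail are hypothetical, and framework-schema.md remains the reference for the full implementation.

```python
import json


def breadcrumb_jsonld(origin, trail):
    """Build BreadcrumbList JSON-LD from (name, path) pairs ordered root to current page."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": position,   # positions start at 1
                "name": name,
                "item": origin + path,  # absolute URL of the breadcrumb target
            }
            for position, (name, path) in enumerate(trail, start=1)
        ],
    }


trail = [
    ("Home", "/"),
    ("Local SEO", "/topics/local-seo/"),
    ("Local Citations", "/topics/local-seo/local-citations/"),
]
markup = breadcrumb_jsonld("https://example.com", trail)
print(json.dumps(markup, indent=2))
```

The printed JSON goes inside a `<script type="application/ld+json">` element on the page, alongside the visible breadcrumb links.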
11. Contextual Linking
Contextual links are in-prose links within editorial content. They are the strongest internal links because:
- Surrounding content reinforces the topical signal
- Anchor text is naturally varied
- They appear in the part of the page Google weighs most heavily
11.1 The Editorial Discipline
Every editorial publish should include 3-5 contextual links to other pages on the site. These should appear:
- Once in the introduction (link forward to related topic)
- 1-2 times in the body (link to specific related sub-topics)
- Once in the conclusion (link to the next logical step)
This is not link manipulation. This is good editorial practice.
11.2 The "Updated Old Posts" Pattern
When publishing new content:
- Identify 3-5 old posts that should now reference the new content.
- Add contextual links from those posts to the new one.
- Update the old posts' dateModified.
This is one of the highest-ROI SEO activities. Costs 15 minutes per publish; compounds across the site.
11.3 The "Linked Mentions" Audit
Spot-check old content quarterly. For every brand, product, or topic mention in old posts, verify it links to the appropriate destination page. Mentions without links are wasted internal-linking opportunities.
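The spot-check can be partly automated by counting mentions of a phrase that sit outside any link. This is a minimal sketch using only the standard library HTML parser; the class and function names are hypothetical, and a phrase split across inline tags (for example by `<em>`) will not be detected.

```python
from html.parser import HTMLParser


class UnlinkedMentionFinder(HTMLParser):
    """Counts occurrences of a phrase in text that is not inside an <a> element."""

    def __init__(self, phrase):
        super().__init__()
        self.phrase = phrase.lower()
        self.link_depth = 0
        self.unlinked = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth:
            self.link_depth -= 1

    def handle_data(self, data):
        # Only text outside links counts as a missed opportunity.
        if self.link_depth == 0:
            self.unlinked += data.lower().count(self.phrase)


def count_unlinked_mentions(html, phrase):
    finder = UnlinkedMentionFinder(phrase)
    finder.feed(html)
    finder.close()
    return finder.unlinked


post_html = (
    '<p>Our <a href="/local-seo/">local SEO</a> guide covers the basics. '
    "Local SEO also depends on reviews.</p>"
)
missed = count_unlinked_mentions(post_html, "local SEO")
print(missed)
```

Not every unlinked mention needs a link; one or two well-placed links per post are usually enough, so use the count to prioritize posts for review.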
12. Faceted Navigation
Faceted navigation is the filter/sort UI common on ecommerce and listing sites. It generates URLs combinatorially:
/category/?color=red&size=m&brand=acme
12.1 The Crawl Budget Problem
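A faceted system with 5 facets and 5 options each generates 5^5 = 3,125 URL variations when every facet is set, and 6^5 = 7,776 once each facet can also be left unset. With 10 facets, the count reaches the tens of millions. Crawlers can spend their entire budget on faceted URLs.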
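The combinatorics can be checked directly. This sketch assumes each facet takes at most one value per URL and that parameter order is fixed (sites that allow multi-select or reordered parameters generate even more URLs). `count_facet_urls` is a hypothetical helper.

```python
from itertools import product


def count_facet_urls(options_per_facet, allow_unset=True):
    """Count distinct filter combinations for a faceted category page."""
    total = 1
    for option_count in options_per_facet:
        # An unset facet adds one more state per facet.
        total *= (option_count + 1) if allow_unset else option_count
    return total


five_by_five = [5] * 5
all_set = count_facet_urls(five_by_five, allow_unset=False)  # every facet chosen
with_unset = count_facet_urls(five_by_five)                  # facets may be left off
ten_facets = count_facet_urls([5] * 10)

# Brute-force check of the 5-facet case: None stands for "facet not applied".
enumerated = sum(1 for _ in product([None, 1, 2, 3, 4, 5], repeat=5))
print(all_set, with_unset, ten_facets, enumerated)
```

With every facet set there are 3,125 combinations; allowing unset facets gives 7,776 (including the unfiltered page itself), and ten such facets give 60,466,176.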
12.2 Strategies
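Block facet parameters in robots.txt (a blocked URL is never fetched, so crawlers cannot see a canonical or noindex tag on it; pick one strategy per URL pattern):
User-agent: *
# "?*" matches the parameter in any position, not only when it comes first
Disallow: /*?*color=
Disallow: /*?*size=
Disallow: /*?*sort=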
Canonical to parameterless version:
Each faceted URL canonicals to the unfiltered category page.
noindex on facet combinations:
<meta name="robots" content="noindex, follow"> on faceted URLs. They get crawled but not indexed.
Whitelist facet combinations that should rank:
Some facet combinations are valuable landing pages ("red running shoes"). Whitelist these as canonical, indexable, and present in the sitemap. Block the rest.
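The canonical and whitelist strategies can be combined in one small rule: whitelisted combinations canonicalize to themselves and everything else canonicalizes to the parameterless category page. The sketch below is one possible shape for that rule. `FACET_PARAMS`, `WHITELIST`, and `canonical_for` are hypothetical names, and the whitelist entry is illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed facet parameter names and one whitelisted landing page.
FACET_PARAMS = {"color", "size", "brand", "sort"}
WHITELIST = {("/category/", (("color", "red"),))}


def canonical_for(url):
    """Return the canonical URL for a (possibly faceted) category URL."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query)
    # Sort facets so parameter order does not affect the whitelist lookup.
    facets = tuple(sorted((k, v) for k, v in params if k in FACET_PARAMS))
    kept = [(k, v) for k, v in params if k not in FACET_PARAMS]
    if (parts.path, facets) in WHITELIST:
        kept = sorted(kept + list(facets))  # whitelisted: keep the facet parameters
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


filtered = canonical_for("https://example.com/category/?color=red&size=m&brand=acme")
landing = canonical_for("https://example.com/category/?color=red")
print(filtered)
print(landing)
```

The same lookup can drive sitemap generation, so that only whitelisted facet pages are listed there.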
12.3 Internal Linking with Facets
- Faceted URLs should not be linked from elsewhere on the site (don't pass equity into pages you don't index).
- Whitelisted facet pages get treated as normal landing pages — link to them from category pages.
Cross-reference: framework-ecommerceseo.md.
13. Pagination
Paginated archives (/page/2/, /page/3/) need careful handling.
13.1 The rel=next/prev Pattern (Deprecated)
Google confirmed in 2019 that it no longer uses rel=next/prev as an indexing signal. The link tags can still be present, but Google ignores them.
13.2 Modern Pagination Patterns
Pattern 1: Self-canonical paginated pages
Every paginated page self-canonicals (page 2 canonicals to itself). All paginated pages may be indexed. Best when each page has substantively different content.
Pattern 2: Canonical to page 1
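Page 2, 3, 4 canonical to page 1. Only page 1 is indexed. Best when paginated pages are plain listings with no unique value. Note that Google's pagination guidance advises against canonicalizing to page 1, because items linked only from deeper pages can be missed; use this pattern only when those items are reachable through other links.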
Pattern 3: noindex paginated pages
Page 2+ are noindex,follow. They pass internal links but don't appear in search. Best when you want crawl access but no SERP presence.
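The three patterns differ only in the canonical URL and robots directive emitted for page 2 and beyond. A minimal sketch, assuming archive URLs of the form `/page/N/` under a base URL; the function name `pagination_tags` and the pattern labels are hypothetical.

```python
PATTERNS = {"self-canonical", "canonical-to-1", "noindex"}


def pagination_tags(base_url, page, pattern):
    """Return the canonical URL and robots directive for one page of an archive."""
    if pattern not in PATTERNS:
        raise ValueError(f"unknown pagination pattern: {pattern}")
    page_url = base_url if page == 1 else f"{base_url}page/{page}/"
    # Page 1 is always indexable and self-canonical under all three patterns.
    if page == 1 or pattern == "self-canonical":
        return {"canonical": page_url, "robots": "index, follow"}
    if pattern == "canonical-to-1":
        return {"canonical": base_url, "robots": "index, follow"}
    # Remaining case: noindex on page 2+, links still followed.
    return {"canonical": page_url, "robots": "noindex, follow"}


base = "https://example.com/blog/"
for pattern in sorted(PATTERNS):
    print(pattern, pagination_tags(base, 3, pattern))
```

Keeping the choice in one function makes it harder for templates to emit contradictory signals, such as a noindex page that also canonicalizes elsewhere.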
13.3 Internal Linking with Pagination
- Provide direct links to important deep content from elsewhere on the site rather than relying on pagination as the only access path.
- For long archives, add "Browse by year" or "Browse by category" overlays that link directly to specific posts without pagination.
- Don't rely on infinite-scroll-only patterns; ensure crawler can reach all content via traditional links.
14. Linking Hygiene
14.1 Broken Links
Internal 4xx links waste crawl budget, frustrate users, and signal site neglect. Audit:
- Screaming Frog → Internal → Status code filter for 4xx
- Sitebulb → Internal Links report
- Ahrefs Site Audit → Broken Internal Links report
Fix:
- Update the link to point to the new canonical URL
- 301-redirect the broken URL to its replacement if the content moved
- 410 the broken target if it shouldn't
14.2 Redirect Chains in Internal Links
An internal link pointing to a URL that 301-redirects: the link should be updated to point to the final destination directly. Why:
- Saves crawler an extra hop
- Eliminates dependency on the 301 staying in place
- Marginally improves user perceived performance
Find with Screaming Frog → Redirect chain report.
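Once the redirects are exported, finding the final destination for each link is a matter of following the map until it ends. The sketch below assumes the redirects are available as a dictionary of source to target; `resolve_final` and the sample paths are hypothetical.

```python
def resolve_final(url, redirects, max_hops=10):
    """Follow a redirect map to the final destination; return (final_url, hop_count)."""
    seen = [url]
    while url in redirects:
        url = redirects[url]
        if url in seen or len(seen) > max_hops:
            chain = " -> ".join(seen + [url])
            raise ValueError(f"redirect loop or chain too long: {chain}")
        seen.append(url)
    return url, len(seen) - 1


# Hypothetical redirect export: /old/ redirects twice before landing.
redirects = {"/old/": "/newer/", "/newer/": "/newest/"}

final_url, hops = resolve_final("/old/", redirects)
print(final_url, hops)
```

Any internal link whose hop count is above zero should be rewritten to point at the returned final URL.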
14.3 Cross-Domain Internal Links
Sites that span multiple domains (multibrand portfolios, microsite networks) treat cross-domain links as external for SEO purposes. There is no internal link equity flow between separate domains, even if they're owned by the same entity.
For multibrand operations, decide deliberately:
- Single domain with subfolders (/brand-a/, /brand-b/): true internal linking, shared equity
- Subdomains (brand-a.example.com, brand-b.example.com): partially shared, weaker
- Separate domains: no internal-link equity flow
Cross-reference: framework-multibrand.md (when built).
15. Audit Mode
| # | Criterion | Pass/Fail |
|---|---|---|
| IL1 | Site has documented topical taxonomy / cluster strategy | |
| IL2 | Hub pages identified for each major topic | |
| IL3 | Each spoke page links back to its hub at least twice | |
| IL4 | Each hub page links to all its spokes | |
| IL5 | Cross-cluster links used sparingly and topically | |
| IL6 | Crawl depth report shows zero pages over depth 3 (small sites) / 5 (large sites) | |
| IL7 | Zero orphan pages detected (Screaming Frog Crawl Analysis) | |
| IL8 | Anchor text varies naturally; no exact-match overuse | |
| IL9 | Anchor text contains zero "click here" / "read more" without context | |
| IL10 | Breadcrumbs implemented on every non-homepage with BreadcrumbList schema | |
| IL11 | Header navigation focused on top commercial + content hubs only | |
| IL12 | Footer provides supplemental link layer to important pages | |
| IL13 | Average outbound internal links per content page between 5 and 30 | |
| IL14 | Zero broken internal links (4xx) | |
| IL15 | Zero internal links pointing to redirect chains | |
| IL16 | Faceted navigation strategy documented (block / canonical / whitelist) | |
| IL17 | Pagination strategy documented (self-canonical / canonical-to-1 / noindex) | |
| IL18 | Top 10 commercial pages each have 20+ inbound internal links | |
| IL19 | Top traffic pages link to conversion pages contextually | |
| IL20 | Editorial workflow includes inbound-link placement step | |
| IL21 | "Updated old posts" pattern practiced when new content publishes | |
| IL22 | nofollow not used on internal links, or every exception is documented | |
| IL23 | Cluster gap analysis completed in last 90 days | |
| IL24 | Internal link audit run in last 30 days | |
| IL25 | Cross-domain links treated correctly per multibrand strategy | |
Maximum score: 25. World-class: 23+/25.
16. Common Mistakes
- No clear hub pages. Every page is at the same level; nothing accumulates topical authority.
- Spokes that don't link back to the hub. Cluster signal lost.
- "Click here" / "read more" anchors. Wastes the topical signal on every internal link.
- Orphan pages. Indexed but invisible to the link graph.
- Crawl depth over 5. Important pages buried.
- Broken internal links unfixed. Cumulative crawl budget waste.
- Internal links to redirect chains. Same — extra hop.
- Mega-menu on a small site. Diluting equity on every page.
- Faceted URLs without robots / canonical strategy. Crawl budget exploded by combinatorial parameters.
- No breadcrumbs. Free internal-linking layer ignored.
- Editorial publishes new content without inbound link placement. Page launches as orphan.
- Old content never updated. Anchors mention things that should now link to newer pages.
- Anchor text identical on every internal link to a page. Looks manipulated; sometimes triggers algorithmic suppression.
- Footer linking to every page on the site. Equity diluted; signals devalued.
- Treating cross-domain links as internal. Equity does not flow.
- Infinite-scroll archives with no traditional link path. Pages discoverable only by JS-rendered scroll events.
- JavaScript-only navigation. Some crawlers see it; others don't.
- Cross-cluster spoke-to-spoke linking everywhere. Weakens cluster boundaries.
17. Maintenance
Weekly:
- Spot-check newly published content for inbound and outbound links
- Verify new content added to relevant hub page
Monthly:
- Sitewide broken-link scan (Screaming Frog or alternative)
- Orphan-page report review
- New publish "Updated old posts" pass
- Anchor-text spot audit on top 10 pages
Quarterly:
- Comprehensive internal-link audit
- Crawl depth review (Sitebulb)
- Cluster gap analysis
- Hub page link inventory refresh
- Faceted navigation audit (if applicable)
- Pagination strategy review
Annually:
- Full link graph visualization (Gephi / yEd)
- Cluster structure review against keyword research
- Site architecture review (do hubs still match priorities?)
- Multibrand cross-domain linking review (if applicable)
18. Companion Documents
- framework-keywordresearch.md: Topic clusters depend on keyword research
- framework-schema.md: BreadcrumbList implementation
- framework-technicalseo.md: Crawl depth, faceted URL handling, pagination
- framework-ecommerceseo.md: Faceted navigation deep dive
- framework-newsseo.md: Editorial linking patterns
- framework-uxseo.md: Navigation UX patterns supporting SEO
- framework-hcs.md: Topical depth as a Helpful Content signal
- framework-eeat.md: Topical authority as an Expertise signal
- framework-pageexperience.md: Navigation tap-target sizing
Document version: 1.0 Last updated: 2026-05-05 Owner: Joseph W. Anady — ThatDeveloperGuy — SDVOSB