SEO & AI Engine Optimization Framework · May 2026

Hreflang: implementation patterns and pitfalls

The Operational Deep Dive for Multi-Language and Multi-Region Sites, Covering Implementation, URL Structure, x-default, Pagination, Canonical Precedence, Validation, and the Audit Posture That Survives a Migration

A comprehensive installation and audit reference for hreflang, the most error-prone implementation in international SEO. While framework-international.md covers internationalization at the strategic and editorial level (which markets, how deeply to localize, content adaptation), this document is the operational technical reference: how the tags are formed, where they live, how they interact with canonical and pagination, how they fail, and how to validate that they did not. Dual-purpose: installation manual and audit document.

Cross-stack implementation note: the code samples in this framework are written in plain HTML for clarity. For React, Vue, Svelte, Next.js, Nuxt, SvelteKit, Astro, Hugo, 11ty, Remix, WordPress, Shopify, and Webflow equivalents of every pattern below, see framework-cross-stack-implementation.md. For pure client-rendered SPAs (no SSR/SSG), see framework-react.md. For Next.js-specific patterns (App Router, generateMetadata, dynamic alternates), see framework-nextjs.md.


1. Document Purpose and How to Use This Document

1.1 What This Document Is

The canonical operational reference for hreflang, the HTML attribute (or HTTP header, or XML sitemap annotation) that tells Google and other consumer search engines which page version to serve to which language and country segment of the user base. Hreflang is the single most error-prone implementation surface in technical SEO. John Mueller has publicly described it as "one of the most complex aspects of SEO, if not the most complex one" (Google Search Central, multiple Office Hours, 2020-2025). Independent research consistently finds 60 to 75 percent of international sites have at least one hreflang implementation error severe enough to invalidate part of the cluster. Ahrefs analyzed 374,756 domains and found 67 percent had implementation issues; 31.02 percent had conflicting hreflang directives; 16.04 percent had no self-referencing tags; 8.91 percent used unknown language codes (Ahrefs Study, "Hreflang Implementation Study", 2024 sample, n=374,756 domains).

This framework specifies what correct hreflang looks like, how to deploy it across three implementation methods, how to resolve the canonical-versus-hreflang conflict deterministically, how to handle paginated series, when to use x-default and when to skip it, the validation tooling stack, and the ten most common anti-patterns with their fixes.

1.2 What This Document Is Not

Not a strategic framework for whether to internationalize, which markets to target, or how to localize content. That is framework-international.md. Not a CMS-agnostic schema reference. That is framework-schema.md. Not a substitute for the content-first architectural doctrine in framework-contentfirst.md, which establishes that the hreflang block must render in the first byte of the server response, not be injected by client-side JavaScript.

1.3 Three Operating Modes

Mode A, Install: build hreflang infrastructure on a new or rebuild engagement. Read Sections 2 through 12 in order.

Mode B, Audit: evaluate an existing implementation. Skip to Section 13.

Mode C, Hybrid: audit first, install for failing items. Most engagements run as Mode C because hreflang is rarely correct on legacy sites.

1.4 How Claude Code CLI Should Consume This Document

  1. Section 2: client variables (markets, languages, current URL structure, current hreflang status).
  2. Section 3: confirm operator understands what hreflang is and is not.
  3. Section 4: URL structure decision (ccTLD vs subdirectory vs subdomain). Decision is captured before tag generation.
  4. Section 5: pick implementation method (HTML head vs HTTP header vs XML sitemap).
  5. Section 6: decide whether x-default is required.
  6. Section 7: if site has paginated series, apply pagination interaction rules.
  7. Section 8: confirm canonical-hreflang relationship for every cluster.
  8. Section 9: decide regional sub-variant policy (en-US vs en-GB vs en-CA threshold).
  9. Section 10: run validation before deploy and after.
  10. Section 11: cross-check against the ten anti-patterns.
  11. Section 13: audit rubric.

1.5 Required Tools and Validators

1.6 Scope and Boundaries

Covers: hreflang format (language and region codes), the three implementation methods, x-default rules, pagination interaction, canonical precedence, regional sub-variant policy, validation methodology, common anti-patterns, monitoring after deprecation of the GSC International Targeting report, and the audit rubric. Touches but does not exhaust: URL structure strategy (framework-international.md), canonical signal stack (framework-technicalseo.md), schema markup (framework-schema.md), Next.js metadata API patterns (framework-nextjs.md), and CMS specifics (framework-wordpress.md, framework-shopify.md).


2. Client Variables Intake

# HREFLANG CLIENT VARIABLES

# Markets
target_countries: []           # ["US","GB","CA","AU","DE","FR","ES","MX","BR","JP"]
target_languages: []           # ["en","de","fr","es","pt","ja"]
language_country_matrix: []    # ["en-US","en-GB","en-CA","en-AU","de-DE","de-AT","de-CH",
                               #  "fr-FR","fr-CA","es-ES","es-MX","pt-BR","pt-PT","ja-JP"]
total_locales: 0
primary_market: ""             # the locale that gets traffic if no other matches
fallback_market: ""            # used as x-default candidate

# Content readiness
locales_with_full_translation: []
locales_with_partial_translation: []
locales_with_machine_translation_only: []
locales_with_localized_content: []   # currency, units, examples adapted
content_identity_us_vs_gb: ""        # "identical" | "spelling_only" | "fully_localized"
content_identity_es_es_vs_es_mx: ""  # same scale

# Current URL structure
url_structure: ""              # "cctld" | "subdirectory" | "subdomain" | "gtld_with_param" | "mixed"
domain_apex: ""                # primary domain
cctld_inventory: []            # ["example.com","example.co.uk","example.de","example.fr"]
subdirectory_pattern: ""       # "/en-us/", "/us/en/", "/de/", "/fr-fr/"
subdomain_pattern: ""          # "us.example.com", "de.example.com"
url_case_policy: "lowercase"
trailing_slash_policy: ""      # with-slash | without-slash, must match technicalseo.md

# Current hreflang status
hreflang_present: false
hreflang_method: ""            # "html_head" | "http_header" | "xml_sitemap" | "mixed" | "none"
hreflang_locale_count: 0
hreflang_self_referencing: false
hreflang_bidirectional_verified: false
x_default_present: false
x_default_target: ""           # which URL is the x-default
canonical_self_referencing: false
canonical_points_to_alternate: false   # the broken pattern; should be false
known_hreflang_errors: []

# Pagination
paginated_series_present: false
pagination_types: []           # ["blog_archive","category","ecommerce_catalog","search_results"]
pagination_uses_rel_next_prev: false
pagination_self_canonical: false
pagination_has_hreflang: false

# Non-HTML resources
pdfs_internationalized: false
pdfs_use_http_header_hreflang: false

# Monitoring
gsc_property_per_locale: false
gsc_property_inventory: []
last_hreflang_validation_date: ""
hreflang_validation_tool: ""   # "screaming_frog" | "sitebulb" | "manual" | "none"
hreflang_change_management_process: ""   # how new locales get added

# Migration context
recent_url_migration: false
recent_locale_addition: false
recent_locale_removal: false

A field left blank during intake is an audit item. The hreflang_self_referencing, hreflang_bidirectional_verified, and canonical_points_to_alternate flags are the three primary signals for whether a cluster is functional or broken.
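
As a rough illustration of how those three flags might be derived from crawl data, the Python sketch below checks a single cluster. The input shape (a dict keyed by URL, each entry holding an "alternates" map and a "canonical" string) and the function name cluster_flags are assumptions made for this example, not the export format of any particular crawler.

```python
from typing import Dict


def cluster_flags(pages: Dict[str, dict]) -> Dict[str, bool]:
    """Derive the three primary intake flags for a single hreflang cluster.

    pages maps each crawled URL to a dict with two keys (hypothetical shape):
      "alternates": {hreflang value: href} as found on that page
      "canonical":  the canonical URL found on that page
    """
    # Self-referencing: every page lists its own URL among its alternates.
    self_referencing = all(
        url in page["alternates"].values() for url, page in pages.items()
    )
    # Bidirectional: every URL a page points to was crawled and points back.
    bidirectional = all(
        target in pages and url in pages[target]["alternates"].values()
        for url, page in pages.items()
        for target in page["alternates"].values()
    )
    # Broken pattern: a page canonicals to a different member of the cluster.
    canonical_to_alternate = any(
        page["canonical"] != url and page["canonical"] in pages
        for url, page in pages.items()
    )
    return {
        "hreflang_self_referencing": self_referencing,
        "hreflang_bidirectional_verified": bidirectional,
        "canonical_points_to_alternate": canonical_to_alternate,
    }
```

A cluster is treated as functional here only when the first two flags are true and the third is false. The check is per cluster, so a full audit would run it once for every content unit.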


3. What Hreflang Is

3.1 The Working Definition

Hreflang is an HTML attribute (rel="alternate" hreflang="...") that declares, for a given URL, the alternate URLs that serve the same content for different language and region targets. It is consumed by Google web search, Yandex, and Naver. Bing publicly stated in 2016 that Bingbot ignores hreflang and uses the <html lang="..."> attribute and content-language HTTP header instead (Bing Webmaster Blog, "How Bing handles hreflang", 2016, still cited in 2025 Bing Webmaster documentation). Baidu does not consume hreflang. ChatGPT, Claude, Perplexity, and other AI search engines do not consume hreflang at this writing (May 2026); they consume the first-byte HTML and the visible content.

The narrow technical claim hreflang makes: "I, this URL, am one version of a content unit. Here are the other versions, and here is the language and optional region each targets." Hreflang does not declare canonicality. It does not declare which version Google should rank. It distributes a single ranking signal across a cluster so that the locale-matched version is the one shown in the SERP for a user in that locale.

3.2 What Hreflang Is Not

Not a canonical signal. Hreflang does not tell Google which version is the master. Canonical does that. Hreflang is orthogonal: it identifies a peer group, where every peer is self-canonical, and the cluster as a whole shares ranking signals while the locale-matched URL is shown in the SERP.

Not a ranking signal. Hreflang does not improve rankings. It improves the probability that the correct locale URL is shown when the cluster ranks. The cluster ranks based on the content, links, and authority of the URLs in the cluster.

Not a directive. Google has stated repeatedly through 2025 that hreflang is a "hint, not a directive" (John Mueller, Search Off the Record, March 2024; Google Search Central docs revision history). Google may serve a different URL than the hreflang-matched one if other signals are stronger, including content similarity, canonical signals, and user location.

Not the <html lang="..."> attribute. The lang attribute declares the language of the current document for accessibility and rendering (font fallbacks, spell-check, screen reader pronunciation). It does not declare alternate language versions. The two attributes are independent, but both should be present and consistent. Bing relies on <html lang> (and content-language HTTP header) instead of hreflang.

Not server-side content negotiation. Sites that serve different content for the same URL based on Accept-Language HTTP header are doing content negotiation, which is a separate technique. Content negotiation cannot be combined with hreflang because hreflang requires distinct URLs per locale. Mixing them is a common anti-pattern (Section 11).

3.3 The Relationship to Other Signals

signal_relationships:

  hreflang_vs_canonical:
    interaction: "peers, not conflict if implemented correctly"
    rule: "every URL in a cluster self-canonicals; hreflang declares the peer set"
    conflict_mode: "if canonical points to an alternate, hreflang is silently discarded by Google"

  hreflang_vs_html_lang:
    interaction: "both required; serve different consumers"
    rule: "html lang declares this document's language; hreflang declares alternates"
    consumer_split: "Google uses hreflang; Bing uses html lang and Content-Language"

  hreflang_vs_content_language_header:
    interaction: "both can coexist; Content-Language is hint for non-hreflang crawlers"
    rule: "Content-Language: en-US (or just en) in HTTP response"
    practical_use: "Bing, accessibility tools, browser language detection"

  hreflang_vs_country_in_gsc:
    interaction: "GSC country targeting deprecated September 22, 2022"
    rule: "for subdirectory and subdomain structures, GSC country targeting is no longer settable"
    consequence: "rely on ccTLD signal, hreflang, and on-page content cues only"

  hreflang_vs_geoip_redirect:
    interaction: "incompatible at the redirect layer"
    rule: "do not auto-redirect by IP; offer a banner with a link; let user choose"
    why: "Googlebot crawls from one IP region; auto-redirect blocks crawl coverage"

3.4 The Citation Surface for Hreflang in 2026

Google web search is the dominant consumer. Yandex consumes hreflang in much the same way. Naver consumes hreflang for Korean targeting. Bing ignores hreflang and uses html lang plus Content-Language. AI search engines (ChatGPT, Claude, Perplexity, Gemini, OpenAI Search) do not consume hreflang at all; they fetch the URL the user-facing rendering or referral surface points to, and they read the visible content. The 2026 implication: a site investing in hreflang is investing in Google SERP locale routing, not in AI citation routing. AI citation routing is decided by the visible content, the URL structure, and the linking topology of internal navigation, not by hreflang.


4. URL Structure Decision Tree

Before any hreflang tag is written, the URL structure question is settled. Hreflang annotates a URL structure; it does not create one. The four primary options, and the decision rule for each, follow.

4.1 The Four Options

Option A, ccTLDs (Country Code Top Level Domains).

example.com    -> US or global
example.co.uk  -> UK
example.de     -> Germany
example.fr     -> France
example.com.mx -> Mexico
example.com.br -> Brazil

Each country gets its own root domain. The country target is implied by the TLD itself: .de targets Germany, .fr targets France, with no further signal needed. Hreflang is still recommended because the same content can target multiple countries (.com for the US plus the UK plus AU), and because language can differ from country (German content on .ch for Swiss German).

Pros:

Cons:

Best for: large multinationals with budget and dedicated per-country operations, brands serving regulated industries where per-country legal entity separation is required, established global brands with separate country marketing teams.

Option B, Subdirectories on a gTLD.

example.com/        -> US default
example.com/en-gb/  -> UK English
example.com/de/     -> Germany
example.com/fr/     -> France
example.com/es-mx/  -> Mexico Spanish

A single gTLD (.com, .org, .net, .io, etc.) with locale segments in the URL path. Authority accumulates to one domain. Hreflang is mandatory because the TLD gives no country signal.

Pros:

Cons:

Best for: the default modern recommendation. SaaS, B2B, mid-market consumer brands, agencies, content sites. Most sites should choose subdirectories unless a specific reason requires ccTLDs.

Option C, Subdomains on a gTLD.

www.example.com    -> global
us.example.com     -> US
uk.example.com     -> UK
de.example.com     -> Germany
fr.example.com     -> France

Locale segments as subdomains. Authority distribution between subdomains is debated (Google has at various points said subdomains may be treated as separate sites for ranking purposes; current 2024-2025 guidance is "treated as part of the same site for crawling and indexing").

Pros:

Cons:

Best for: organizations with strong per-region IT teams that need infrastructure separation, sites where per-region brand identity is intentionally distinct enough to merit subdomain separation, sites where ccTLDs are out of reach but country separation is strategic.

Option D, gTLD with URL Parameter.

example.com/?lang=en
example.com/?country=us&lang=en

Locale carried as a query parameter rather than in the URL path.

Pros:

Cons:

Best for: legacy sites that cannot migrate. New builds should not use this pattern.

4.2 The Decision Tree

Does the business need per-country legal, billing, or operational separation?
  YES -> Use ccTLDs. Move to Section 5.
  NO  -> Continue.

Does the business have separate per-country marketing teams with separate budgets?
  YES -> Consider ccTLDs if budget allows; subdirectories if not.
  NO  -> Continue.

Does the business need per-region infrastructure separation (data residency, latency)?
  YES -> Consider subdomains or ccTLDs.
  NO  -> Continue.

Default -> Use subdirectories with locale segment (/en-us/, /de/, etc.) and rely on hreflang.
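
The decision tree above can be sketched as a small Python function. The parameter names and the returned labels are hypothetical, chosen to echo the url_structure values from the Section 2 intake; "subdomain_or_cctld" is an invented label for the branch where the tree leaves both options open.

```python
def choose_url_structure(
    needs_legal_separation: bool,
    separate_marketing_teams: bool,
    cctld_budget_available: bool,
    needs_infra_separation: bool,
) -> str:
    """Walk the Section 4.2 questions in order and return a structure label."""
    # Q1: per-country legal, billing, or operational separation forces ccTLDs.
    if needs_legal_separation:
        return "cctld"
    # Q2: separate per-country marketing teams; budget decides the outcome.
    if separate_marketing_teams:
        return "cctld" if cctld_budget_available else "subdirectory"
    # Q3: per-region infrastructure separation (data residency, latency).
    if needs_infra_separation:
        return "subdomain_or_cctld"
    # Default: subdirectories with a locale segment, relying on hreflang.
    return "subdirectory"
```

The order of the checks mirrors the order of the questions, so an earlier YES always wins over a later one.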

4.3 The Locale Segment Pattern

For subdirectory implementations, the locale segment format choice:

/en-us/   -> language-region (recommended for multi-region same-language)
/us/en/   -> region-language (less common; two-level)
/en/      -> language only (single-region or language-agnostic)
/us/      -> region only (cannot be used; hreflang requires language)

The recommended pattern is language-region as a single segment (/en-us/, /de-de/, /fr-ca/). It maps cleanly to the hreflang value. It is unambiguous to humans reading the URL. It allows easy aggregation in analytics. Hyphens, not underscores or slashes between language and region.

Lowercase throughout: /en-us/ not /en-US/. The hreflang attribute value is case-insensitive per Google docs, but lowercase in URLs is the trailing-slash-and-case-policy default from framework-technicalseo.md. The hreflang attribute value itself is conventionally written as en-US (uppercase region) in HTML even though the URL is lowercase. Both are valid.
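
To make the mapping between the lowercase URL segment and the conventionally cased attribute value concrete, here is a minimal Python sketch. The helper names are hypothetical, and the pattern is deliberately simplified: it accepts a two- or three-letter language with an optional two-letter region, does not check codes against the ISO lists, and ignores script subtags and x-default.

```python
import re

# Simplified shape check: language (2-3 letters), optional region (2 letters).
_LOCALE = re.compile(r"^([A-Za-z]{2,3})(?:-([A-Za-z]{2}))?$")


def to_url_segment(hreflang_value: str) -> str:
    """Turn an attribute value such as 'en-US' into the segment '/en-us/'."""
    if not _LOCALE.match(hreflang_value):
        raise ValueError(f"unsupported hreflang value: {hreflang_value!r}")
    return f"/{hreflang_value.lower()}/"


def to_hreflang_value(segment: str) -> str:
    """Turn a segment such as '/fr-ca/' into the attribute value 'fr-CA'."""
    match = _LOCALE.match(segment.strip("/"))
    if match is None:
        raise ValueError(f"unsupported locale segment: {segment!r}")
    language, region = match.group(1).lower(), match.group(2)
    # Lowercase language, uppercase region, joined by a hyphen.
    return f"{language}-{region.upper()}" if region else language
```

Underscore forms such as en_US are rejected rather than silently converted, which matches the hyphen-only rule stated above.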

4.4 The "Will We Add More Locales?" Future-Proofing

Plan the URL structure to accommodate locales that do not exist yet. A site that launches with /en/ and /es/ and later wants to add /en-gb/ faces an awkward transition because /en/ was implicitly US English. Better: launch with /en-us/ and /es-es/ even if there is only one of each, leaving room to add /en-gb/, /es-mx/ cleanly later.

Use language-only segments (/en/, /es/, /de/) only when sub-regions are explicitly out of scope and unlikely to be added. The migration cost from language-only to language-region segments is high (URL change for every page in every existing language, redirects, hreflang rewrites, sitemap rewrites).

4.5 The "Mixed Structure" Anti-Pattern

Sites that organically grew across regions sometimes end up with a mixed structure: .de as ccTLD for Germany, /fr/ as subdirectory for France, uk.example.com as subdomain for UK. Mixed structures are not invalid, but they are harder to operate, harder to audit, and create more places for hreflang to break. The migration cost to unify them is real, and the SEO benefit of unification is usually marginal compared to the cost. For an established mixed structure, the typical recommendation is: do not unify, but invest disproportionately in hreflang validation and per-property GSC monitoring.


5. Hreflang Implementation Methods

Three methods exist. Pick one (or for non-HTML resources, two). Mixing methods is permitted but adds maintenance surface.

5.1 Method 1: HTML Head Link Tags

The most common and most directly visible method. Place <link rel="alternate" hreflang="..."> tags in the <head> of every page that is part of an hreflang cluster.

<head>
  <!-- self -->
  <link rel="alternate" hreflang="en-US" href="https://example.com/en-us/products/widget/">
  <!-- peers -->
  <link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/products/widget/">
  <link rel="alternate" hreflang="en-CA" href="https://example.com/en-ca/products/widget/">
  <link rel="alternate" hreflang="es-ES" href="https://example.com/es-es/products/widget/">
  <link rel="alternate" hreflang="es-MX" href="https://example.com/es-mx/products/widget/">
  <link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/products/widget/">
  <link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/products/widget/">
  <link rel="alternate" hreflang="x-default" href="https://example.com/en-us/products/widget/">
  <!-- canonical to self -->
  <link rel="canonical" href="https://example.com/en-us/products/widget/">
</head>

Required properties for HTML head implementation:

Pros:

Cons:

Best for: sites under five locales, sites where editorial control over per-page head is straightforward, sites without sitemap discipline.
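
For templated sites, the head block shown in this section could be generated rather than hand-written. The Python sketch below is one possible shape, assuming a per-content-unit cluster map of hreflang value to absolute URL; the function name head_links and its parameters are hypothetical.

```python
from html import escape
from typing import Dict, Optional


def head_links(
    cluster: Dict[str, str], self_url: str, x_default: Optional[str] = None
) -> str:
    """Render the hreflang and canonical link tags for one page of a cluster.

    cluster maps hreflang values (for example "en-US") to absolute URLs and
    must include the page being rendered, so the block is self-referencing.
    """
    if self_url not in cluster.values():
        raise ValueError("self_url must be a member of its own cluster")
    lines = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in cluster.items()
    ]
    if x_default is not None:
        lines.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}">'
        )
    # Every page in the cluster canonicals to itself.
    lines.append(f'<link rel="canonical" href="{escape(self_url)}">')
    return "\n".join(lines)
```

Because the same cluster map is passed to every page in the cluster and only self_url changes, the alternates stay identical across peers, which is what keeps the cluster bidirectional.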

5.2 Method 2: XML Sitemap

The XHTML link extension in XML sitemaps lets every URL declare its alternates inside the sitemap entry. The HTML head can omit hreflang entirely if the sitemap version is complete.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en-us/products/widget/</loc>
    <xhtml:link rel="alternate" hreflang="en-US"
                href="https://example.com/en-us/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="en-GB"
                href="https://example.com/en-gb/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="en-CA"
                href="https://example.com/en-ca/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="es-ES"
                href="https://example.com/es-es/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="es-MX"
                href="https://example.com/es-mx/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="de-DE"
                href="https://example.com/de-de/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="fr-FR"
                href="https://example.com/fr-fr/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="x-default"
                href="https://example.com/en-us/products/widget/"/>
  </url>
  <url>
    <loc>https://example.com/en-gb/products/widget/</loc>
    <xhtml:link rel="alternate" hreflang="en-US"
                href="https://example.com/en-us/products/widget/"/>
    <xhtml:link rel="alternate" hreflang="en-GB"
                href="https://example.com/en-gb/products/widget/"/>
    <!-- ... same cluster repeated ... -->
  </url>
</urlset>

Required properties for sitemap implementation:

Pros:

Cons:

Best for: sites with five or more locales, large sites (10,000+ pages), sites with mature build pipelines that can generate sitemaps from a CMS or database, headless sites where the build step controls sitemaps cleanly.
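
A build pipeline might emit the sitemap structure shown above along the following lines. This is a sketch using Python's standard xml.etree.ElementTree; the function name build_sitemap and the input shape (a list of cluster maps, hreflang value to URL) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(clusters: List[Dict[str, str]]) -> str:
    """Return a <urlset> in which every cluster URL lists the full cluster."""
    # Serialize sitemap elements unprefixed and alternates as xhtml:link.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for cluster in clusters:
        # One <url> entry per distinct URL; x-default often repeats a member.
        for loc in dict.fromkeys(cluster.values()):
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = loc
            # Repeat the whole cluster, including the self-reference.
            for code, href in cluster.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": code, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")
```

The returned string has no XML declaration; a caller writing it to disk would add one. The sketch also does not split output across multiple sitemap files, which large sites will need.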

5.3 Method 3: HTTP Link Header

For non-HTML resources (PDFs, images, video files), the HTML head method is not applicable. The HTTP Link header carries the hreflang declaration in the response headers.

HTTP/1.1 200 OK
Content-Type: application/pdf
Link: <https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="en-US",
      <https://example.com/en-gb/whitepaper.pdf>; rel="alternate"; hreflang="en-GB",
      <https://example.com/de-de/whitepaper.pdf>; rel="alternate"; hreflang="de-DE",
      <https://example.com/fr-fr/whitepaper.pdf>; rel="alternate"; hreflang="fr-FR",
      <https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="x-default"

nginx configuration to deliver the header for a PDF cluster:

# /var/www/sites/example/nginx-hreflang-pdf.conf

location = /en-us/whitepaper.pdf {
  add_header Link '<https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="en-US", <https://example.com/en-gb/whitepaper.pdf>; rel="alternate"; hreflang="en-GB", <https://example.com/de-de/whitepaper.pdf>; rel="alternate"; hreflang="de-DE", <https://example.com/fr-fr/whitepaper.pdf>; rel="alternate"; hreflang="fr-FR", <https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="x-default"';
}

The header on every URL in the cluster must list every URL in the cluster including itself.

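Bash script to print a PDF's Link header on Bubbles, one alternate per line, for manual verification:

#!/bin/bash
# /var/www/sites/example/check-pdf-hreflang.sh
# Prints each alternate declared in the Link header on its own line.
URL="https://example.com/en-us/whitepaper.pdf"
echo "Checking Link header for $URL"
# -s: silent, -I: HEAD request. Strip the trailing CR from header lines,
# keep only the Link header, split on commas, trim leading whitespace.
curl -sI "$URL" | tr -d '\r' | grep -i "^link:" | tr ',' '\n' | sed 's/^[[:space:]]*//'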
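
Where the check needs to be automated rather than read by eye, the Link header value could be parsed and tested for the self-reference. The Python sketch below assumes the parameter order used in the examples in this section (rel="alternate" first, then hreflang); it is not a general RFC 8288 parser, and the function names are hypothetical.

```python
import re
from typing import Dict

# Matches one entry in the form used above:
#   <URL>; rel="alternate"; hreflang="CODE"
_LINK_ENTRY = re.compile(
    r'<([^>]+)>\s*;\s*rel="alternate"\s*;\s*hreflang="([^"]+)"'
)


def parse_link_header(value: str) -> Dict[str, str]:
    """Return {hreflang value: URL} for every alternate in a Link header value."""
    return {code: url for url, code in _LINK_ENTRY.findall(value)}


def header_lists_self(value: str, self_url: str) -> bool:
    """True if the header on self_url includes self_url among its alternates."""
    return self_url in parse_link_header(value).values()
```

Passing the header value fetched for each PDF in the cluster through header_lists_self covers the self-reference half of the rule; comparing the parsed maps across all PDFs in the cluster covers the other half.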

Pros:

Cons:

Best for: PDFs that exist in multiple locales (whitepapers, product specs, manuals), image assets that are localized (only when the image itself is translated, which is rare), localized video files.

5.4 Mixing Methods

Mixing is permitted by Google. The most common useful mix:

What does not work: declaring different alternates in the HTML head and the XML sitemap for the same URL. The two sources must agree, or Google may consolidate inconsistently. The rule: pick one source of truth (head or sitemap) per resource type, and treat the others as redundant declarations of the same data, not as competing declarations.
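
A simple agreement check between the two sources might look like the Python sketch below. It assumes the head and sitemap alternates for one URL have already been extracted into maps of hreflang value to URL; the function name diff_sources is hypothetical.

```python
from typing import Dict, Optional, Tuple


def diff_sources(
    head: Dict[str, str], sitemap: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return hreflang values on which the HTML head and the sitemap disagree.

    Each result maps an hreflang value to (head URL, sitemap URL); None means
    the value is missing from that source. An empty result means agreement.
    """
    conflicts = {}
    for code in sorted(set(head) | set(sitemap)):
        if head.get(code) != sitemap.get(code):
            conflicts[code] = (head.get(code), sitemap.get(code))
    return conflicts
```

An empty result for every URL is the condition under which the second source counts as a redundant declaration rather than a competing one.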

5.5 Implementation Method Decision

implementation_method_decision:

  locales_under_5_and_pages_under_1000:
    method: "HTML head only"
    rationale: "simple, directly visible, audit-friendly"

  locales_5_to_20_or_pages_1000_to_50000:
    method: "XML sitemap primary; HTML head optional"
    rationale: "cluster maintenance scales"

  locales_over_20_or_pages_over_50000:
    method: "XML sitemap mandatory; HTML head redundant if templates allow"
    rationale: "head-only is too brittle at scale"

  pdfs_and_non_html_present:
    method: "HTTP Link header for those resources, in addition to chosen HTML method"
    rationale: "only option for non-HTML"

  legacy_site_already_using_one_method:
    method: "stay with current method; do not migrate without audit"
    rationale: "migration during ranking-sensitive periods risks dropping cluster signals"
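
The size thresholds in the block above can be expressed as a short Python function. This is a sketch: the function name and return shape are hypothetical, the method labels reuse the hreflang_method values from the Section 2 intake, and the largest tier is tested first so that a site exceeding either limit lands in it. The legacy-site rule is left out because it depends on an audit, not on counts.

```python
from typing import Dict, List, Union


def choose_methods(
    locales: int, pages: int, has_non_html: bool = False
) -> Dict[str, Union[str, List[str]]]:
    """Map locale and page counts to a tier key and a list of methods."""
    if locales > 20 or pages > 50000:
        tier = "locales_over_20_or_pages_over_50000"
        methods = ["xml_sitemap"]  # mandatory; head is redundant at this scale
    elif locales >= 5 or pages >= 1000:
        tier = "locales_5_to_20_or_pages_1000_to_50000"
        methods = ["xml_sitemap"]  # primary; HTML head optional
    else:
        tier = "locales_under_5_and_pages_under_1000"
        methods = ["html_head"]
    if has_non_html:
        # PDFs and other non-HTML resources can only carry the HTTP header.
        methods.append("http_header")
    return {"tier": tier, "methods": methods}
```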

6. The x-default Tag

6.1 What x-default Does

x-default is a special hreflang value that declares a fallback URL for users whose language and region do not match any of the specific locale tags in the cluster. It does not target a language. It does not target a country. It is the "if none of the above, serve this" pointer.

<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/products/widget/">

Google introduced x-default in April 2013 (Google Search Central Blog, "Introducing x-default hreflang for international landing pages", April 10, 2013). Its purpose: a Polish user reaching a site that has only English, Spanish, and German versions has no language match. Without x-default, Google guesses. With x-default, the site picks.

6.2 When to Use x-default

Use it when:

Skip it when:

6.3 What x-default Should Point To

Three valid patterns:

Pattern A: Country selector page.

<link rel="alternate" hreflang="x-default" href="https://example.com/">

The root URL is a country selector with no localized content of its own. Users see flags or language names and pick. Google can show this page to users who match no specific locale.

Pattern B: Default locale URL.

<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/products/widget/">

The US English version (or whichever locale is the business's default) is the fallback. Users who do not match any other locale get this version. Same URL as one of the existing hreflang values. This is the most common pattern.

Pattern C: Generic language version.

<link rel="alternate" hreflang="en" href="https://example.com/en/products/widget/">
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/products/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/products/widget/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en/products/widget/">

A generic English version exists separate from any country variant. The generic English is x-default. The country-specific variants exist for users in those countries. The generic English catches everyone else.

6.4 Common x-default Misuses

Misuse 1: x-default to a 404 or redirected URL.

<!-- WRONG: x-default points to a 301 -->
<link rel="alternate" hreflang="x-default" href="https://example.com/global/">
<!-- ...but /global/ redirects to /en-us/ -->

Fix: point x-default directly to the destination URL. The same rule that applies to any hreflang URL applies to x-default: 200 status, canonical, indexable.

Misuse 2: x-default to a different language than any of the cluster URLs.

<!-- WRONG: cluster is English and Spanish, x-default is French -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/">
<link rel="alternate" hreflang="es-ES" href="https://example.com/es-es/">
<link rel="alternate" hreflang="x-default" href="https://example.com/fr-fr/">

x-default's URL should usually appear elsewhere in the cluster as a specific locale. Pointing to a French URL that is not otherwise in the cluster is technically allowed but confusing and rarely intentional.

Misuse 3: missing x-default on a cluster that needs one.

A site with five locales and no x-default falls back to Google's algorithmic choice for unmatched users. Usually the closest language match. Usually fine. Sometimes wrong. x-default is cheap insurance.

Misuse 4: x-default on a single-locale site.

<!-- WRONG: site only has one locale -->
<link rel="alternate" hreflang="en-US" href="https://example.com/">
<link rel="alternate" hreflang="x-default" href="https://example.com/">

A single-locale site does not have a cluster. No hreflang is needed at all, and x-default has nothing to fall back from. Remove both tags.

Misuse 5: x-default that disagrees with the canonical pattern of the target URL.

If x-default points to /en-us/widget/ and that URL's canonical points to /widget/ (a different URL), the x-default declaration is invalid. The target of x-default must be a self-canonical URL.

6.5 The x-default Decision Test

Does the site serve more than one locale?
  NO  -> Skip hreflang and x-default entirely.
  YES -> Continue.

Is there a sensible fallback URL for users not matching any specific locale?
  NO  -> Skip x-default. Let Google pick algorithmically.
  YES -> Use x-default pointing to that URL. Continue.

Is the x-default URL self-canonical, HTTP 200, and indexable?
  NO  -> Fix the URL first, then add x-default.
  YES -> Implement. Validate. Move on.
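
The three questions above can be sketched in Python as follows. The function name, parameters, and returned labels are hypothetical; the status, canonical, and indexability inputs are assumed to come from a crawl of the candidate URL.

```python
from typing import Optional


def x_default_decision(
    locale_count: int,
    fallback_url: Optional[str],
    is_self_canonical: bool = False,
    status_code: int = 0,
    is_indexable: bool = False,
) -> str:
    """Walk the Section 6.5 test and return the resulting action label."""
    # Q1: a single-locale site needs neither hreflang nor x-default.
    if locale_count <= 1:
        return "skip_hreflang_and_x_default"
    # Q2: with no sensible fallback URL, let Google pick algorithmically.
    if not fallback_url:
        return "skip_x_default"
    # Q3: the target must be self-canonical, HTTP 200, and indexable.
    if not (is_self_canonical and status_code == 200 and is_indexable):
        return "fix_url_first"
    return "implement_and_validate"
```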

7. Hreflang on Paginated Series

7.1 The Pagination + Hreflang Interaction

Paginated series (blog archives, category listings, ecommerce catalogs, search results) introduce a second axis of URL multiplication on top of the locale axis. A blog archive with five pages of posts across four locales is 5 * 4 = 20 URLs that are all related but not interchangeable.

The cardinal rule: page N in locale A points to page N in locale B. Not page 1, not the next page, not the canonical version of the series. Page 2 of /en-us/blog/ is the hreflang peer of page 2 of /de-de/blog/. Mismatching page numbers across locales is a common cluster-breaking error.

<!-- On https://example.com/en-us/blog/page/2/ -->
<head>
  <link rel="canonical" href="https://example.com/en-us/blog/page/2/">
  <link rel="alternate" hreflang="en-US" href="https://example.com/en-us/blog/page/2/">
  <link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/blog/page/2/">
  <link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/blog/page/2/">
  <link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/blog/page/2/">
  <link rel="alternate" hreflang="x-default" href="https://example.com/en-us/blog/page/2/">
</head>

7.2 Self-Canonical on Every Paginated Page

Each paginated page in each locale canonicals to itself, not to page 1 of its series. The historical pattern of canonicalizing page 2, 3, 4 to page 1 was discouraged by Google in 2019 when rel=next/prev was deprecated and is now actively incorrect (Google Search Central, "How to specify a canonical with rel=canonical", revision 2023; Lumar SEO, "Canonical Tags Dos and Donts", 2024).

<!-- On https://example.com/en-us/blog/page/2/ -->
<link rel="canonical" href="https://example.com/en-us/blog/page/2/">
<!-- NOT this: -->
<!-- <link rel="canonical" href="https://example.com/en-us/blog/"> -->

Self-canonical lets Google index each page as a distinct entry in the series and lets each page accumulate its own ranking signals if any inbound links land on it.

7.3 What About rel="next" and rel="prev"?

Google deprecated rel=next/prev for pagination signaling in March 2019. The official position since: Google can infer pagination from internal linking patterns; rel=next/prev tags are not consumed (Google Search Central, blog post and subsequent Mueller statements, 2019-2024).

Bing continues to support rel=next/prev "on a case-by-case basis" per Bing Webmaster documentation 2023. Some other consumers may still respect it. The cost of including rel=next/prev is negligible; the benefit is partial. The current pattern most large sites use:

<!-- On https://example.com/en-us/blog/page/2/ -->
<head>
  <link rel="canonical" href="https://example.com/en-us/blog/page/2/">
  <link rel="prev" href="https://example.com/en-us/blog/">
  <link rel="next" href="https://example.com/en-us/blog/page/3/">
  <link rel="alternate" hreflang="en-US" href="https://example.com/en-us/blog/page/2/">
  <link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/blog/page/2/">
  <link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/blog/page/2/">
  <link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/blog/page/2/">
  <link rel="alternate" hreflang="x-default" href="https://example.com/en-us/blog/page/2/">
</head>

7.4 Asymmetric Pagination Across Locales

A common reality: the English blog has 50 posts across 5 pages of 10. The German blog has only 30 posts across 3 pages of 10. The Spanish blog has 80 posts across 8 pages.

The cluster for page 1 is straightforward: every locale has a page 1. The cluster for page 2 is fine: every locale has a page 2. The cluster for page 4 fails: en-US has page 4, es-ES has page 4, but de-DE does not.

Resolution rules:

Rule A: if a locale does not have the page number, do not include that locale in the hreflang cluster for that page. en-US page 4 lists en-GB page 4 (if it exists), es-ES page 4 (if it exists), de-DE page 4 (if it exists; in our example, it does not), and x-default. de-DE is simply omitted from the page 4 cluster.

Rule B: do not point en-US page 4 to de-DE page 1 as a "best fallback." The cluster declares peer relationships, not fallback relationships. Mismatched page numbers fragment the cluster signals.

Rule C: if pagination is so asymmetric that most pages have only one locale, reconsider whether pagination should be in hreflang at all. The usual pattern: hreflang only on page 1 of each series. Pages 2+ omit hreflang and rely on user navigation. Acceptable when paginated pages are not significant traffic targets.
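Rule A can be sketched as a filter over per-locale page counts. The helper names below are hypothetical, and the two-member check is only a rough per-page proxy for Rule C, which the source frames as a judgment about the series as a whole.

```bash
#!/bin/bash
# Sketch: decide which locales belong in the hreflang cluster for page N.
# Hypothetical helpers; input is PAGE followed by locale=page_count pairs.

# Rule A: list only the locales that actually have page N
locales_for_page() {
  local page="$1" pair locale count
  shift
  for pair in "$@"; do
    locale="${pair%%=*}"   # text before the "="
    count="${pair##*=}"    # text after the "="
    if [ "$page" -le "$count" ]; then
      printf '%s\n' "$locale"
    fi
  done
}

# Rough proxy for Rule C (an assumption, not the source's rule): a cluster
# with fewer than two members carries no peer relationship worth declaring
hreflang_worthwhile() {
  [ "$(locales_for_page "$@" | grep -c '')" -ge 2 ]
}

# The example from this section: 5 English, 3 German, 8 Spanish pages
locales_for_page 4 en-US=5 de-DE=3 es-ES=8

if ! hreflang_worthwhile 7 en-US=5 de-DE=3 es-ES=8; then
  echo "page 7: only one locale has this page; omit hreflang"
fi
```

For page 4 the sketch lists en-US and es-ES and leaves de-DE out, which is Rule A; it never substitutes de-DE page 1, which is what Rule B forbids.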

7.5 Infinite Scroll and View-All Patterns

Sites using infinite scroll without distinct paginated URLs do not have a pagination + hreflang interaction; there is only one URL per locale. Hreflang is straightforward: each locale's blog landing URL points to every other locale's blog landing URL.

Sites with a view-all option (/blog/all/ or ?view=all) treat the view-all URL as its own cluster member. The view-all in en-US is a peer of the view-all in de-DE.

7.6 Faceted Navigation and Filter Combinations

Faceted URLs like /en-us/shop/shirts/?color=red&size=large multiply the URL count by every filter combination. Hreflang in these contexts is operationally impractical unless faceted URLs are deliberately limited. The typical pattern: only the canonical surface gets hreflang, and faceted URLs that canonical to a parent category carry none. For ecommerce specifically, the canonical strategy for faceted navigation is covered in framework-ecommerceseo.md.


8. Canonicalization vs Hreflang Precedence

8.1 The Conflict Mode

The most ranking-disruptive failure pattern in international SEO: hreflang and canonical declare conflicting URLs for the same content.

The broken pattern:

<!-- On https://example.com/en-gb/widget/ -->
<link rel="canonical" href="https://example.com/en-us/widget/">  <!-- WRONG -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">

The canonical says "I am a duplicate of en-us; index that one." The hreflang says "I am a peer of en-us; index both, show the locale-matched version." Google receives contradictory signals.

The resolution Google applies:

Per Google Search Central and reaffirmed by John Mueller in Office Hours through 2024-2025: when hreflang and canonical conflict, hreflang is silently discarded. The canonical signal wins (Google Search Central, "Localized versions of your pages", revision 2024; Mueller, multiple Office Hours, including the May 2025 statement that hreflang signals are "hints, not guarantees" relative to canonical and other indexability signals).

Practical consequence: the en-GB version is treated as a duplicate of en-US, dropped from the index, and never shown to UK users. UK users see the en-US version even when they search from the UK. The UK locale silently disappears from the SERP.

8.2 The Correct Pattern

Every URL in an hreflang cluster self-canonicals. Every URL in the cluster declares every other URL (including itself) as an hreflang alternate.

<!-- On https://example.com/en-gb/widget/ -->
<link rel="canonical" href="https://example.com/en-gb/widget/">  <!-- self -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/widget/">

<!-- On https://example.com/en-us/widget/ -->
<link rel="canonical" href="https://example.com/en-us/widget/">  <!-- self -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/widget/">

Both URLs canonical to themselves. Both list the cluster. The peer relationship is intact. Google indexes both. Each locale-matched URL appears in its locale SERP.

8.3 When Are Two URLs Actually Duplicates?

Sometimes en-US and en-GB content really is identical. Marketing might use the same English copy across all English-speaking markets to save on translation costs. Is this a duplicate-content problem?

Google's position: identical content across hreflang peers is acceptable. Hreflang exists precisely to allow same-language content to target different regions. With hreflang correctly declared, Google generally does not treat the cluster as a duplicate-content problem. The cluster shares ranking signals; the locale-matched URL surfaces in the locale SERP.

What is not acceptable: identical content across the same language and same region (en-US and en-US on different URLs) without canonical resolution. That is a duplicate-content problem unrelated to hreflang.

8.4 The Specific Mistake: "Canonical to the Master, Hreflang to the Variants"

Some agencies historically taught a pattern where a "master" English URL was canonical, and all other locales canonicaled to it while declaring hreflang. The reasoning: "consolidate signal to one URL, distribute discoverability via hreflang."

This pattern is wrong for the reason in Section 8.1: the canonical signal wins, the hreflang is discarded, and only the master URL is ever indexed or shown. The other locales become functionally invisible to search.

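This is the most common high-severity hreflang error in SaaS and ecommerce migrations from a single-locale site to multi-locale. The CMS adds locale duplicates with the canonical pointing back to the master. Hreflang is then bolted on. The cluster looks valid on paper but never functions. Symptom in GSC: locale URLs reported under "Alternate page with proper canonical tag", because Google is honoring the declared canonical. The related status "Duplicate, Google chose different canonical than user" appears when locale URLs do self-canonical but Google still folds them into another URL.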

8.5 The Verification Test

For every URL in an hreflang cluster, three properties must hold:

  1. Self-canonical: <link rel="canonical" href="THIS_URL"> where THIS_URL is the current URL.
  2. HTTP 200: the URL must return HTTP 200, not 301, 302, 404, or 5xx.
  3. Indexable: no <meta name="robots" content="noindex">, no X-Robots-Tag: noindex HTTP header, not blocked by robots.txt.

A URL failing any of these three properties cannot be in an hreflang cluster. Its inclusion will silently break the cluster.
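The three properties can be checked mechanically once the HTML and status code have been fetched. The sketch below is offline and hypothetical: the function name and argument order are assumptions, it reads the page HTML on stdin, and it does not cover robots.txt blocking or a robots meta tag whose content attribute precedes name.

```bash
#!/bin/bash
# Sketch: check the three cluster-eligibility properties for one URL.
# Hypothetical helper; the caller supplies the fetched HTML on stdin.
# Usage: check_cluster_eligibility URL HTTP_STATUS [X_ROBOTS_TAG_VALUE] < page.html
# Not covered: robots.txt blocking; robots meta with content= before name=.

check_cluster_eligibility() {
  local url="$1" status="$2" x_robots="${3:-}"
  local html canonical failed=0
  html=$(cat)

  # Take the first canonical link tag, then pull out its href value
  canonical=$(printf '%s\n' "$html" \
    | grep -oiE '<link[^>]+rel="canonical"[^>]*>' \
    | grep -oE 'href="[^"]+"' \
    | head -n 1 \
    | sed 's/^href="//; s/"$//')

  if [ "$canonical" != "$url" ]; then
    echo "FAIL self-canonical: found '${canonical:-none}'"
    failed=1
  fi
  if [ "$status" != "200" ]; then
    echo "FAIL status: HTTP $status"
    failed=1
  fi
  if printf '%s\n' "$html" | grep -qiE '<meta[^>]+name="robots"[^>]+content="[^"]*noindex' \
    || printf '%s\n' "$x_robots" | grep -qi 'noindex'; then
    echo "FAIL indexable: noindex present"
    failed=1
  fi

  if [ "$failed" -eq 0 ]; then
    echo "PASS $url"
  fi
  return "$failed"
}

# The broken en-gb page from Section 8.1: canonical points at en-us
printf '%s\n' '<link rel="canonical" href="https://example.com/en-us/widget/">' \
  | check_cluster_eligibility "https://example.com/en-gb/widget/" 200 \
  || echo "not eligible for the cluster"
```

In practice the status and HTML would come from curl, as in the scripts in Section 10.7; keeping the check itself free of network calls makes it easy to run against crawl exports.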

8.6 The Conflict Resolution Decision

Does the page declare canonical?
  NO  -> Add self-canonical. Continue.
  YES -> Check the canonical target.

Does canonical point to the same URL as the current page?
  YES -> Cluster is correct. Continue.
  NO  -> Canonical points elsewhere. This URL is a duplicate.
         -> Remove hreflang from this URL.
         -> Or change canonical to self and accept the URL as a cluster peer.
         -> Decide based on whether the URL has distinct content per locale.

9. Regional Sub-Variants (en-US vs en-GB vs en-CA)

9.1 The Same-Language Multi-Region Question

When does English content warrant separate en-US, en-GB, en-CA, en-AU, en-IE, en-IN, en-NZ, en-SG, en-ZA URLs versus a single en URL serving all English-speaking countries?

The answer hinges on whether the content actually differs across regions. Hreflang is a structural signal; it cannot manufacture content differentiation that does not exist.

9.2 When to Differentiate

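Differentiate (use en-US, en-GB, en-CA as separate URLs with separate content) when the regional versions differ in substance: currency, compliance text, product availability, or a physical office per region (the "required" criteria in Section 9.4).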

9.3 When Not to Differentiate

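Do not differentiate when the only motivation is region-targeting intent and the content would be identical across regions. Targeting intent cannot justify sub-variants without content differentiation (Section 9.4).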

9.4 The Threshold Test

en_subvariant_threshold:

  required:
    currency_differs: true
    compliance_text_differs: true
    product_availability_differs: true
    physical_office_per_region: true

  recommended:
    spelling_localization_in_scope: true
    examples_localized_in_scope: true
    customer_support_per_region: true

  optional:
    minor_vocabulary_differences: true

  not_sufficient_alone:
    region_targeting_intent: true  # cannot justify subvariants without content differentiation

A site with no item in "required" or "recommended" categories should use a single en URL with no sub-variants. A site with two or more "required" items should differentiate. A site with one "required" item should evaluate case-by-case.
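The decision rule in the paragraph above can be written down directly. This is a sketch with a hypothetical function name; the criterion names are taken from the threshold block, and the branch for "no required item but at least one recommended item" is an assumption, since the source does not specify that case.

```bash
#!/bin/bash
# Sketch: apply the en sub-variant threshold to a list of criteria that hold.
# Hypothetical helper. Pass the names of the criteria that are true.

classify_en_subvariants() {
  local required=0 recommended=0 item
  for item in "$@"; do
    case "$item" in
      currency_differs|compliance_text_differs|product_availability_differs|physical_office_per_region)
        required=$((required + 1)) ;;
      spelling_localization_in_scope|examples_localized_in_scope|customer_support_per_region)
        recommended=$((recommended + 1)) ;;
      *) ;;  # optional and not_sufficient_alone items never tip the decision
    esac
  done

  if [ "$required" -ge 2 ]; then
    echo "differentiate"
  elif [ "$required" -eq 1 ]; then
    echo "case-by-case"
  elif [ "$recommended" -eq 0 ]; then
    echo "single-en"
  else
    # Assumption: the source does not cover "recommended only"
    echo "case-by-case"
  fi
}

classify_en_subvariants currency_differs compliance_text_differs spelling_localization_in_scope
classify_en_subvariants region_targeting_intent minor_vocabulary_differences
```

The first call prints differentiate and the second prints single-en, matching the two clear-cut cases in the text.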

9.5 The es-ES vs es-MX vs es-AR Case

Spanish has the same dynamic. The vocabulary, idiom, and cultural references differ substantially between Spain (Castilian), Mexico (Mexican), Argentina (Rioplatense), and other Latin American countries. The differentiation case is often stronger for Spanish than for English because the linguistic differences are larger.

The mistake to avoid: declaring es-419 for "Spanish, Latin America" (419 is the UN M.49 code for Latin America). Google does not support es-419. Google accepts an ISO 639-1 language code, optionally followed by an ISO 15924 script code and/or an ISO 3166-1 Alpha 2 country code; UN M.49 area codes are not accepted. To target Latin America, declare individual country codes (es-MX, es-AR, es-CO, es-CL, es-PE) or use generic es with an x-default fallback.

9.6 The pt-BR vs pt-PT Case

Portuguese is the clearest case for differentiation. Brazilian Portuguese (pt-BR) and European Portuguese (pt-PT) differ substantially in vocabulary, grammar, and spelling, even after the 1990 orthographic agreement. A Brazilian user served pt-PT content perceives it as a foreign dialect and vice versa. Brazil and Portugal warrant separate URLs whenever the site serves both markets.

9.7 The fr-FR vs fr-CA vs fr-BE vs fr-CH Case

French has narrower vocabulary differences than Spanish or Portuguese but stronger administrative differences (currency, compliance, formal address conventions). The differentiation case is mid-strength. France and Quebec usually warrant separation; Belgium and Switzerland are case-by-case based on whether the site has significant operations in those markets.

9.8 The de-DE vs de-AT vs de-CH Case

German has subtle vocabulary differences across Germany, Austria, and Switzerland. The differentiation case is usually administrative (currency: EUR in DE and AT, CHF in CH; tax regimes differ). Differentiate when administrative differences require it; do not differentiate solely on vocabulary unless Swiss German content is explicitly distinct (and the page is genuinely localized).

9.9 Region Without Language (Not Supported)

Hreflang does not support country-only declarations. hreflang="US" is not valid. The language code is required. To target a country with whatever language is appropriate, declare the country with the language explicitly: hreflang="en-US". This is sometimes a surprise to sites that want to target a country regardless of language; they must pick a language.


10. Hreflang Validation

Validation is mandatory before deploy and after deploy. Hreflang errors fail silently in production; the cluster looks fine on paper, and ranking degradation in the affected locale shows up weeks later as a flat or declining trend with no obvious cause.

10.1 Screaming Frog SEO Spider

The reference desktop crawler for hreflang audits. Thirteen dedicated hreflang filters in the Hreflang tab (Screaming Frog SEO Spider documentation, 2025):

To run a full hreflang audit:

1. Configuration > Spider > Crawl Behavior > "Crawl Hreflang"
2. Configuration > Spider > Extraction > confirm "Hreflang" is checked
3. Crawl the site (or a sitemap)
4. Open the Hreflang tab; review every filter that has results
5. Export issues to CSV; correlate against the cluster definitions

For sites larger than the desktop crawler can handle in memory, switch to database storage mode (Screaming Frog SQLite backend) or to Sitebulb.

10.2 Sitebulb

Sitebulb's international audit tools are designed for scheduled cluster validation. The international hints surface:

Sitebulb visualizes hreflang clusters as a directed graph, which is useful when the cluster has more than five locales and the relationship matrix is hard to verify by table.

10.3 hreflang.org Testing Tool

A free web-based tester for an individual URL (app.hreflang.org). Fetches the target URL, parses the hreflang block, follows each declared alternate, and verifies the return tag from each alternate. Single-URL scope; for site-wide audit, use Screaming Frog or Sitebulb.

10.4 Merkle / TechnicalSEO.com Hreflang Tester

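Alternative single-URL tester (technicalseo.com/tools/hreflang/). Also supports HTTP header inspection, which the hreflang.org tester does not. Use this one for PDF hreflang validation, because non-HTML files such as PDFs have no <head> and can only declare hreflang through the HTTP Link header (or the XML sitemap).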

10.5 Aleyda Solis Hreflang Tags Generator

Not a validator; a generator. Used during initial implementation. Input: a list of URLs and the language-country combination for each. Output: HTML head, HTTP header, or XML sitemap version of the cluster. Limits: up to 50 URL variants per session. Available at aleydasolis.com/en/seo-resources-tools/hreflang-tags-generator/. Referenced by Google Search Central documentation as a recommended tool.

10.6 Google Search Console

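GSC no longer has the dedicated International Targeting report (deprecated September 22, 2022 per support.google.com/webmasters/answer/12474899). What remains in GSC that bears on hreflang: URL Inspection, to confirm a deployed page carries its hreflang block; the Pages report, to catch locale URLs classified as duplicates; and the Performance report filtered by country, to confirm each locale URL earns impressions from its target country. Section 12.1 places each in the monitoring stack.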

10.7 Bash Validation Scripts

Command-line scripts let validation be part of a CI pipeline or a pre-deploy gate. Examples below assume nginx serves the site from /var/www/sites/example/ on Bubbles.

Script 1: Extract hreflang URLs from a sitemap and verify each returns HTTP 200.

#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-sitemap.sh
# Usage: ./hreflang-check-sitemap.sh https://example.com/sitemap-international.xml

SITEMAP_URL="$1"
TMP=$(mktemp)

if [ -z "$SITEMAP_URL" ]; then
  echo "Usage: $0 <sitemap_url>"
  exit 1
fi

echo "Fetching sitemap: $SITEMAP_URL"
curl -sL "$SITEMAP_URL" -o "$TMP"

echo "Extracting all hreflang URLs..."
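# \K drops the matched prefix, so only the URL is printed. (A lookbehind
# with a {0,200} quantifier is rejected by most PCRE builds.)
grep -oP '<xhtml:link[^>]*\bhref="\K[^"]+' "$TMP" \
  | sort -u > /tmp/hreflang-urls.txt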

TOTAL=$(wc -l < /tmp/hreflang-urls.txt)
echo "Found $TOTAL unique hreflang URLs"

echo "Verifying each returns HTTP 200..."
FAILED=0
while IFS= read -r URL; do
  STATUS=$(curl -sI -o /dev/null -w "%{http_code}" "$URL")
  if [ "$STATUS" != "200" ]; then
    echo "FAIL [$STATUS] $URL"
    FAILED=$((FAILED + 1))
  fi
done < /tmp/hreflang-urls.txt

echo "Done. $FAILED of $TOTAL URLs returned non-200."
rm -f "$TMP"
# Exit non-zero on any failure so a CI pipeline or pre-deploy gate can block
[ "$FAILED" -eq 0 ]

Script 2: Verify self-reference and return-tag bidirectionality for a single URL.

#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-bidirectional.sh
# Usage: ./hreflang-check-bidirectional.sh https://example.com/en-us/widget/

URL="$1"
if [ -z "$URL" ]; then
  echo "Usage: $0 <url>"
  exit 1
fi

echo "Fetching $URL"
HTML=$(curl -sL "$URL")

echo "Extracting hreflang block..."
echo "$HTML" \
  | grep -oP '<link[^>]+rel="alternate"[^>]+hreflang="[^"]+"[^>]*>' \
  > /tmp/hreflang-block.txt

echo "Hreflang declarations on $URL:"
cat /tmp/hreflang-block.txt
echo ""

# Extract URLs from hreflang block
echo "$HTML" \
  | grep -oP '<link[^>]+rel="alternate"[^>]+hreflang="[^"]+"[^>]*>' \
  | grep -oP 'href="[^"]+"' \
  | sed 's/href="//;s/"$//' \
  | sort -u > /tmp/peer-urls.txt

# Check self-reference
if grep -qFx "$URL" /tmp/peer-urls.txt; then
  echo "PASS: self-reference present"
else
  echo "FAIL: self-reference missing"
fi

# Check return tag from each peer
echo "Checking return tags from each peer..."
while IFS= read -r PEER; do
  if [ "$PEER" = "$URL" ]; then continue; fi
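  PEER_HTML=$(curl -sL "$PEER")
  # Count the URL only when it sits inside an hreflang link tag; a plain
  # <a href> in a language switcher is not a return tag
  if echo "$PEER_HTML" \
    | grep -oP '<link[^>]+hreflang="[^"]+"[^>]*>' \
    | grep -qF "href=\"$URL\""; then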
    echo "PASS: $PEER lists $URL"
  else
    echo "FAIL: $PEER does not list $URL"
  fi
done < /tmp/peer-urls.txt

Script 3: Check canonical-hreflang conflict for a URL.

#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-canonical.sh
# Usage: ./hreflang-check-canonical.sh https://example.com/en-us/widget/

URL="$1"
HTML=$(curl -sL "$URL")

CANONICAL=$(echo "$HTML" \
  | grep -oP '<link[^>]+rel="canonical"[^>]+href="[^"]+"' \
  | grep -oP 'href="[^"]+"' \
  | sed 's/href="//;s/"$//' \
  | head -n 1)

echo "Page URL:      $URL"
echo "Canonical URL: $CANONICAL"

if [ -z "$CANONICAL" ]; then
  echo "FAIL: no canonical declared; add a self-canonical"
elif [ "$CANONICAL" = "$URL" ]; then
  echo "PASS: self-canonical"
else
  echo "FAIL: canonical points to different URL"
  echo "      This URL is treated as a duplicate of $CANONICAL"
  echo "      Hreflang on this page will be silently discarded"

  # Only meaningful when the canonical is not self: is the canonical target
  # also declared as an hreflang peer? Matches either attribute order.
  if echo "$HTML" \
    | grep -oP '<link[^>]+hreflang="[^"]+"[^>]*>' \
    | grep -qF "href=\"$CANONICAL\""; then
    echo "WARN: canonical target $CANONICAL is also an hreflang peer"
    echo "      This is the canonical-hreflang conflict pattern"
  fi
fi

Script 4: Validate ISO 639-1 language and ISO 3166-1 Alpha 2 country codes.

#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-codes.sh
# Usage: ./hreflang-check-codes.sh https://example.com/en-us/widget/

URL="$1"
HTML=$(curl -sL "$URL")

# Valid ISO 639-1 (subset, common languages)
VALID_LANG="ar|bg|bn|cs|da|de|el|en|es|et|fa|fi|fr|gu|he|hi|hr|hu|id|it|ja|kn|ko|lt|lv|ml|mr|ms|nl|no|pa|pl|pt|ro|ru|sk|sl|sr|sv|sw|ta|te|th|tl|tr|uk|ur|vi|x-default|zh"

# Valid ISO 3166-1 Alpha 2 (subset, common countries)
VALID_COUNTRY="AE|AR|AT|AU|BE|BR|CA|CH|CL|CN|CO|CZ|DE|DK|EG|ES|FI|FR|GB|GR|HK|HU|ID|IE|IL|IN|IT|JP|KR|MX|MY|NL|NO|NZ|PE|PH|PL|PT|RO|RU|SA|SE|SG|SK|TH|TR|TW|UA|US|VN|ZA"

echo "Extracting hreflang values from $URL"
echo "$HTML" \
  | grep -oP 'hreflang="[^"]+"' \
  | sed 's/hreflang="//;s/"$//' \
  | sort -u > /tmp/hreflang-values.txt

ERRORS=0
while IFS= read -r VALUE; do
  if [ "$VALUE" = "x-default" ]; then
    echo "OK   x-default"
    continue
  fi
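  # Parse lang[-Script][-COUNTRY]. Named LANG_CODE, not LANG: assigning LANG
  # would change the shell locale for every command that follows.
  LANG_CODE=$(echo "$VALUE" | cut -d- -f1 | tr '[:upper:]' '[:lower:]')
  SUBTAG=$(echo "$VALUE" | cut -s -d- -f2)
  COUNTRY=$(echo "$VALUE" | cut -s -d- -f3 | tr '[:lower:]' '[:upper:]')

  # Check language
  if ! echo "$LANG_CODE" | grep -qE "^($VALID_LANG)$"; then
    echo "FAIL invalid language '$LANG_CODE' in '$VALUE'"
    ERRORS=$((ERRORS + 1))
    continue
  fi

  # A four-letter second subtag is an ISO 15924 script (zh-Hans, zh-Hant),
  # which Google accepts; the script itself is not validated here.
  # Anything else in that position is treated as the country.
  if ! echo "$SUBTAG" | grep -qE '^[A-Za-z]{4}$'; then
    COUNTRY=$(echo "$SUBTAG" | tr '[:lower:]' '[:upper:]')
  fi

  # Check country if present
  if [ -n "$COUNTRY" ]; then
    if ! echo "$COUNTRY" | grep -qE "^($VALID_COUNTRY)$"; then
      echo "FAIL invalid country '$COUNTRY' in '$VALUE'"
      ERRORS=$((ERRORS + 1))
      continue
    fi
  fi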

  echo "OK   $VALUE"
done < /tmp/hreflang-values.txt

echo "Done. $ERRORS invalid codes found."

For full ISO 639-1 and ISO 3166-1 Alpha 2 lookup tables, see the Hreflang.org valid-codes reference (hreflang.org/list-of-hreflang-codes/) or the IANA registry. The scripts above include common subsets; extend the regex lists per project scope.

10.8 Continuous Validation

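Validation is not one-time. Hreflang clusters break over time through routine site operations, such as adding or removing a locale, or deleting and redirecting URLs that other pages still list as peers.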

The validation cadence:

hreflang_validation_cadence:

  pre_deploy:
    every_change: "validate the affected cluster before merge"
    automated: "CI runs hreflang-check-bidirectional and hreflang-check-canonical"

  weekly:
    sample_size: "10 percent of pages, rotating coverage"
    tool: "Screaming Frog scheduled crawl or Sitebulb scheduled audit"
    metric: "zero canonical-hreflang conflicts; zero non-200 hreflang URLs"

  monthly:
    sample_size: "full site"
    tool: "Sitebulb full international audit"
    metric: "comprehensive issue inventory; trend over time"

  on_locale_change:
    trigger: "adding or removing a locale"
    scope: "full site"
    tool: "Screaming Frog + manual cluster spot-check on representative pages"

11. Common Hreflang Mistakes

The ten anti-patterns most likely to be present in a 2026 site audit, ranked roughly by frequency in the Ahrefs and Semrush studies (Ahrefs n=374,756 domains 2024; Semrush n=20,000 multilingual sites 2023).

11.1 Missing Self-Reference

Frequency: 16 percent of multilingual sites (Ahrefs 2024).

Anti-pattern:

<!-- On https://example.com/en-us/widget/ -->
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/widget/">
<!-- en-US peer entry missing -->

Fix:

<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/widget/">

Why it breaks: Google's documentation and Mueller's statements treat self-reference as "good practice, technically optional" but in audited behavior, missing self-reference correlates with cluster discard. Always include it.

11.2 Missing Return Tags (Broken Bidirectionality)

Frequency: 31 percent of multilingual sites have conflicting or broken bidirectional declarations (Ahrefs 2024).

Anti-pattern:

<!-- On https://example.com/en-us/widget/ -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">

<!-- On https://example.com/en-gb/widget/ -->
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<!-- no en-US entry -->

Fix: every URL in the cluster lists every other URL.

Why it breaks: Google explicitly states: "If page X links to page Y, page Y must link back to page X. If this is not the case for all pages that use hreflang annotations, those annotations may be ignored or not interpreted correctly" (Google Search Central, "Localized versions of your pages", 2024 revision).

11.3 Invalid Language or Country Codes

Frequency: 8.91 percent of multilingual sites (Ahrefs 2024).

Anti-patterns:

<link rel="alternate" hreflang="en-uk" href="...">   <!-- WRONG: uk is not ISO 3166-1 -->
<link rel="alternate" hreflang="en-eu" href="...">   <!-- WRONG: eu is not a country -->
<link rel="alternate" hreflang="jp" href="...">      <!-- WRONG: jp is not ISO 639-1 (it's ja) -->
<link rel="alternate" hreflang="es-419" href="...">  <!-- WRONG: 419 not supported -->
<link rel="alternate" hreflang="zh-CN" href="...">   <!-- OK (CN is valid) -->
<link rel="alternate" hreflang="zh-Hans" href="..."> <!-- OK: Google accepts ISO 15924 script codes (zh-Hans, zh-Hant) -->
<link rel="alternate" hreflang="en-419" href="...">  <!-- WRONG: 419 not supported -->
<link rel="alternate" hreflang="en_US" href="...">   <!-- WRONG: underscore, must be hyphen -->

Fixes:

<link rel="alternate" hreflang="en-GB" href="...">   <!-- UK is GB -->
<!-- there is no EU country code; use individual member states -->
<link rel="alternate" hreflang="ja" href="...">      <!-- Japanese is ja, not jp -->
<!-- for Latin American Spanish, use country codes per market: es-MX, es-AR, etc. -->
<link rel="alternate" hreflang="zh-CN" href="...">   <!-- China -->
<link rel="alternate" hreflang="zh-TW" href="...">   <!-- Taiwan, traditional script implied -->
<link rel="alternate" hreflang="en-US" href="...">   <!-- always hyphen -->

Common pitfalls in code mapping:

Wrong    | Right          | Reason
en-uk    | en-GB          | UK is not in ISO 3166-1 Alpha 2; GB is
en-eu    | (multiple)     | EU is not a country; use per-country
jp       | ja             | Japanese ISO 639-1 is ja
cn       | zh             | Chinese ISO 639-1 is zh
kr       | ko             | Korean ISO 639-1 is ko
es-419   | (per-country)  | 419 is UN code, not ISO 3166-1
en_US    | en-US          | Hyphen, not underscore
en-GBR   | en-GB          | Alpha 2, not Alpha 3
en-USA   | en-US          | Alpha 2, not Alpha 3

11.4 Hreflang to Non-200 or Non-Canonical URLs

Frequency: common; specific statistic varies by study.

Anti-pattern:

<!-- hreflang points to a URL that 301 redirects -->
<link rel="alternate" hreflang="en-GB" href="https://example.com/uk/widget">
<!-- but /uk/widget 301s to /en-gb/widget/ -->

Fix:

<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">

Why it breaks: Google may still partially process hreflang that points to a redirect (Mueller stated in 2018 and reaffirmed 2025 that hreflang to a 301 is "probably OK" but should be automated to follow the redirect target). Best practice: point directly to the destination URL. An hreflang to a 404 or 5xx URL is definitively broken.

11.5 Canonical-Hreflang Conflict

Frequency: common; severity high.

Covered in detail in Section 8. The pattern: hreflang declares peer; canonical declares same peer as the master, making this URL a duplicate. Google discards hreflang. Locale-matched URL never appears in its SERP.

Fix: every URL in the cluster self-canonicals. Canonical never points to an hreflang peer.

11.6 Hreflang in Body Instead of Head

Anti-pattern:

<head>
  <title>Widget</title>
</head>
<body>
  <link rel="alternate" hreflang="en-GB" href="...">  <!-- WRONG -->
  <h1>Widget</h1>
</body>

Why it breaks: Google explicitly states that <link rel="alternate"> tags only count when in <head>. Body-located tags are ignored.

Fix: move all hreflang tags to <head>.

11.7 Mixing HTML Head and Sitemap Declarations That Disagree

Anti-pattern: the HTML head on /en-us/widget/ lists six peers; the XML sitemap entry for /en-us/widget/ lists eight peers. The two sources disagree about which URLs are in the cluster.

Fix: pick one source of truth. Either remove the HTML head hreflang (use sitemap only) or remove the sitemap hreflang (use HTML head only) or keep both in sync. The maintenance burden of keeping both in sync is real; for sites with frequent locale changes, the sitemap-only pattern is operationally safer.
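Where both sources are kept, the sync check can be automated. The sketch below assumes each source has already been exported to a text file with one "hreflang URL" pair per line; the function name and file layout are hypothetical.

```bash
#!/bin/bash
# Sketch: report hreflang peers declared in one source but not the other.
# Hypothetical helper; each input file holds one "hreflang URL" pair per line.

diff_hreflang_sources() {
  local head_list="$1" sitemap_list="$2" status=0
  local only_head only_sitemap
  # -F fixed strings, -x whole-line match, -v invert: lines missing from the other file
  only_head=$(grep -Fxv -f "$sitemap_list" "$head_list")
  only_sitemap=$(grep -Fxv -f "$head_list" "$sitemap_list")

  if [ -n "$only_head" ]; then
    printf 'Only in HTML head:\n%s\n' "$only_head"
    status=1
  fi
  if [ -n "$only_sitemap" ]; then
    printf 'Only in sitemap:\n%s\n' "$only_sitemap"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "HTML head and sitemap agree"
  fi
  return "$status"
}

workdir=$(mktemp -d)
printf '%s\n' \
  "en-US https://example.com/en-us/widget/" \
  "en-GB https://example.com/en-gb/widget/" > "$workdir/head.txt"
printf '%s\n' \
  "en-US https://example.com/en-us/widget/" \
  "en-GB https://example.com/en-gb/widget/" \
  "de-DE https://example.com/de-de/widget/" > "$workdir/sitemap.txt"

diff_hreflang_sources "$workdir/head.txt" "$workdir/sitemap.txt" \
  || echo "sources disagree"
rm -rf "$workdir"
```

The non-zero return on any difference makes the check usable as a pre-deploy gate alongside the scripts in Section 10.7.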

11.8 Auto-Redirect by IP Replacing Hreflang

Anti-pattern: the site auto-redirects users by IP geolocation. A user in Germany requesting /en-us/widget/ is server-redirected to /de-de/widget/. Googlebot, which crawls mostly from US IP addresses, follows the hreflang link to /de-de/widget/ and is redirected back to /en-us/widget/, so it never reaches the German URL.

Why it breaks: Googlebot never sees the locale-specific content; the cluster is invisible to Google. Hreflang declarations point to URLs Googlebot cannot reach. Mueller has stated this pattern blocks effective indexing of non-default locales (Search Central Office Hours, multiple, 2019-2024).

Fix: do not auto-redirect by IP. Offer a banner ("It looks like you are in Germany; view the German site?") with an explicit link. Let the user choose. Googlebot indexes every locale because every URL is directly reachable.

11.9 Same-Language Without Region or x-default

Anti-pattern:

<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/">
<link rel="alternate" hreflang="en-CA" href="https://example.com/en-ca/">
<!-- no x-default; no generic en -->

A user in India searching in English has no specific match (en-IN is not in the cluster). Google must pick algorithmically. The site loses control over which version appears.

Fix: add x-default pointing to the most appropriate fallback (typically the global English or en-US).

<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/">

11.10 JavaScript-Injected Hreflang

Anti-pattern: the hreflang block is generated client-side by JavaScript, not present in the first-byte HTML.

<!-- First byte response -->
<head>
  <title>Widget</title>
  <!-- no hreflang -->
</head>
<!-- ...later, client JS injects: -->
<script>
  document.head.appendChild(/* hreflang link tag */);
</script>

Why it breaks: most AI crawlers do not execute JavaScript (framework-contentfirst.md, Section 4.1). Many SEO crawlers (Screaming Frog by default, Sitebulb without rendering enabled) do not execute JavaScript. Googlebot's JavaScript rendering is reliable but delayed; indexing decisions can be made on the unrendered HTML before the hreflang is injected. The cluster is invisible to a portion of the consumer surface.

Fix: server-render hreflang in the first byte. For Next.js, use generateMetadata in App Router, or in Pages Router render the tags through next/head using data from getStaticProps or getServerSideProps (framework-nextjs.md). For React SPAs, prerender with a build step that emits per-locale HTML. For WordPress, ensure the multilingual plugin (Polylang, WPML) writes to the head template, not to a post-render JavaScript hook.
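One way to detect this anti-pattern, and the body-placement error in Section 11.6, is to count hreflang link tags in the unrendered HTML up to the closing head tag. The sketch below uses a hypothetical function name, reads raw HTML on stdin, and assumes a lowercase </head>; in practice the input would be the output of curl -sL for the URL.

```bash
#!/bin/bash
# Sketch: count hreflang link tags present in the head of the raw,
# unrendered HTML. Hypothetical helper; reads HTML on stdin.
# Assumption: the closing tag is written as lowercase </head>.

count_head_hreflang() {
  # Join lines so multi-line tags match, cut everything from </head> on,
  # print one matching tag per line, then count the lines
  tr '\n' ' ' \
    | sed 's#</head>.*##' \
    | grep -oE '<link[^>]+hreflang="[^"]+"[^>]*>' \
    | grep -c ''
}

# First-byte response from the anti-pattern above: nothing in the head
raw='<head><title>Widget</title></head><body><script>/* injects hreflang */</script></body>'
echo "hreflang tags in first-byte head: $(printf '%s\n' "$raw" | count_head_hreflang)"
```

A count of zero on a page that is supposed to be in a cluster means the tags are either injected client-side or placed outside the head.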


12. Monitoring and Maintenance

12.1 The Monitoring Stack After GSC Deprecation

The GSC International Targeting report was the centerpiece of hreflang monitoring before September 2022. Since deprecation, the monitoring stack is third-party plus GSC's remaining tools:

monitoring_stack_2026:

  primary_crawl_validator:
    tool: "Screaming Frog SEO Spider or Sitebulb"
    cadence: "weekly scheduled crawl with hreflang export"
    metric: "zero new errors compared to last week's baseline"

  gsc_url_inspection:
    purpose: "spot check that deployed pages carry the hreflang block"
    cadence: "after every deploy that touches international templates"

  gsc_coverage_pages_report:
    purpose: "detect duplicate-canonical issues that indicate hreflang break"
    cadence: "weekly review of new entries"
    filter: "look for 'Duplicate, Google chose different canonical' for locale URLs"

  gsc_performance_by_country:
    purpose: "verify locale URLs receive impressions from target country"
    cadence: "monthly; compare 28-day trends per locale"

  per_locale_gsc_property:
    purpose: "isolate metrics per locale segment"
    recommendation: "set up property for each subdirectory or subdomain"
    example_properties:
      - "https://example.com/en-us/"
      - "https://example.com/en-gb/"
      - "https://example.com/de-de/"
      - "https://example.com/fr-fr/"

  third_party_rank_tracker:
    tool: "Semrush, Ahrefs, AccuRanker; configured per target country"
    purpose: "verify rankings are happening on the locale-matched URL"
    cadence: "weekly"

  log_analysis:
    tool: "nginx access logs on Bubbles, parsed for Googlebot per locale"
    purpose: "verify Googlebot crawls all locale URLs at expected rate"
    cadence: "monthly review; quarterly deep dive"

12.2 Weekly Checks

weekly_hreflang_checks:

  cluster_health:
    - "screaming frog scheduled crawl with hreflang export"
    - "compare current errors to last week"
    - "investigate any new errors"

  new_locale_coverage:
    - "if a new locale was added in the past 7 days, full cluster spot-check"
    - "verify return tags from every existing locale to the new locale"
    - "verify the new locale's pages list every existing locale"

  removed_url_audit:
    - "any URL removed (deleted, 410'd, 301'd) in past 7 days"
    - "search the rest of the site for residual hreflang references"
    - "fix dangling peers"

  gsc_inspection:
    - "spot check 5 representative pages per locale"
    - "verify URL Inspection shows hreflang in rendered HTML"

  duplicate_canonical_alert:
    - "GSC Coverage > Pages > 'Duplicate, Google chose different canonical'"
    - "if locale URLs appear in this bucket, cluster has broken"
    - "investigate immediately; rank loss within 2-4 weeks if not fixed"
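The removed_url_audit step can be sketched as a join between two exports. Both file layouts are assumptions: a list of removed URLs, one per line, and a crawl export with one "source_page peer_url" pair per line. The function name is hypothetical.

```bash
#!/bin/bash
# Sketch: find hreflang declarations that still point at removed URLs.
# Hypothetical helper.
#   REMOVED_FILE: one removed URL per line
#   PAIRS_FILE:   "source_page peer_url" per line (e.g. from a crawl export)

find_dangling_peers() {
  local removed="$1" pairs="$2"
  # Nothing removed means nothing can dangle (also guards the NR==FNR idiom)
  [ -s "$removed" ] || return 0
  awk 'NR == FNR { gone[$1] = 1; next }
       ($2 in gone) { print $1 " still references " $2 }' "$removed" "$pairs"
}

workdir=$(mktemp -d)
printf '%s\n' "https://example.com/uk/widget" > "$workdir/removed.txt"
printf '%s\n' \
  "https://example.com/en-us/widget/ https://example.com/en-gb/widget/" \
  "https://example.com/de-de/widget/ https://example.com/uk/widget" > "$workdir/pairs.txt"

find_dangling_peers "$workdir/removed.txt" "$workdir/pairs.txt"
rm -rf "$workdir"
```

Each output line names a page whose hreflang block needs editing, which is the "fix dangling peers" work item.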

12.3 Monthly Checks

monthly_hreflang_checks:

  full_site_crawl:
    - "Sitebulb or Screaming Frog full crawl"
    - "compare to last month's baseline"
    - "trend per error type"

  performance_by_country:
    - "GSC Performance > Search Results > filter by country"
    - "verify each target country sees its locale URLs as primary landing pages"
    - "investigate any locale where the wrong country URL is ranking"

  log_review:
    - "nginx access logs parsed for Googlebot per locale"
    - "verify crawl rate is proportional to locale size"
    - "investigate any locale crawled well below its expected rate"

  competitor_check:
    - "spot check 2-3 competitors in each major market"
    - "verify they are using hreflang correctly"
    - "note any structural changes worth considering"
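
The log_review check above ("crawl rate is proportional to locale size") can be approximated by comparing each locale's share of Googlebot hits with its share of URLs. This is a sketch only: the function name is hypothetical, and the `tolerance` of 0.5 is an assumed cut-off for "significantly under-crawled", not a figure from this framework.

```python
def under_crawled_locales(url_counts, crawl_hits, tolerance=0.5):
    """Return locales whose share of Googlebot hits is below tolerance x their share of URLs."""
    total_urls = sum(url_counts.values())
    total_hits = sum(crawl_hits.values())
    if total_urls == 0 or total_hits == 0:
        return []
    flagged = []
    for locale, urls in url_counts.items():
        url_share = urls / total_urls
        hit_share = crawl_hits.get(locale, 0) / total_hits
        # Flag only when the crawl share falls well below the size share.
        if hit_share < url_share * tolerance:
            flagged.append(locale)
    return sorted(flagged)
```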

12.4 Change Management for New Markets

Adding a locale is the most error-prone routine hreflang operation. The pattern that consistently works:

new_locale_rollout:

  step_1_pre_launch:
    - "build all new locale URLs to spec"
    - "self-canonical, HTTP 200, indexable, content complete"
    - "build new locale's hreflang block listing all existing locales + self"
    - "do NOT yet update existing locales' hreflang to include the new one"

  step_2_internal_verification:
    - "Screaming Frog crawl of new locale URLs only"
    - "verify cluster is well-formed within the new locale"
    - "fix any errors before exposure"

  step_3_existing_locale_update:
    - "deploy hreflang changes to existing locales' templates"
    - "every existing locale's pages now list the new locale as a peer"
    - "deploy in a single change set, not piecemeal"

  step_4_sitemap_update:
    - "regenerate XML sitemap with new locale URLs"
    - "submit updated sitemap to GSC"

  step_5_post_launch_validation:
    - "Screaming Frog full crawl"
    - "verify zero broken return tags across all locales"
    - "verify new locale URLs appear in URL Inspection with correct hreflang block"
    - "submit new locale URLs via IndexNow if Bing-targeted"

  step_6_monitoring:
    - "weekly cluster check for first month"
    - "GSC Performance check for new locale country at 2 weeks and 4 weeks"
    - "investigate any unexpected drop in existing locale impressions"
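
The return-tag verification in step 5 can be sketched as a reciprocity check over the crawled clusters. This assumes the crawl has been reduced to a dictionary mapping each page URL to its declared hreflang block; that data shape and the function name are hypothetical. x-default entries are skipped for simplicity.

```python
def missing_return_tags(clusters):
    """
    clusters maps each page URL to its declared hreflang block: {hreflang_code: peer_url}.
    Returns (page, peer) pairs where page declares peer but peer does not declare page back.
    """
    missing = []
    for page, block in clusters.items():
        for code, peer in block.items():
            # Skip x-default and the page's own self-reference.
            if code == "x-default" or peer == page:
                continue
            peer_block = clusters.get(peer, {})
            if page not in peer_block.values():
                missing.append((page, peer))
    return sorted(missing)
```

Run against the state after step 1, the check reports every existing locale as missing a return tag to the new locale, which is expected until step 3 ships. After step 3 the result should be empty.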

12.5 Change Management for Removed Markets

Removing a locale is less common but equally risky. The pattern:

removed_locale_rollout:

  step_1_remove_from_other_locales:
    - "update every other locale's hreflang to drop the dying locale"
    - "deploy in a single change set"

  step_2_410_or_redirect:
    - "decide: 410 (gone permanently) or 301 (consolidate to another locale)"
    - "410 if no replacement content"
    - "301 to closest matching locale if user value preserved (e.g., en-CA -> en-US)"

  step_3_sitemap_cleanup:
    - "regenerate XML sitemap without the removed locale URLs"
    - "submit updated sitemap to GSC"

  step_4_validation:
    - "Screaming Frog full crawl"
    - "verify no residual hreflang references to dead locale"

  step_5_log_review_4_weeks:
    - "verify Googlebot stops requesting the dead URLs within 4 weeks"
    - "verify no residual rankings on dead locale URLs"
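
The step 4 validation ("no residual hreflang references to dead locale") can be sketched as a scan of the same page-to-hreflang mapping. It assumes a subdirectory URL structure, so the removed locale is identified by a path prefix such as `/en-ca/`; the function name and data shape are hypothetical.

```python
from urllib.parse import urlsplit


def residual_references(clusters, removed_prefix):
    """Find pages that still declare an hreflang peer under the removed locale's path prefix."""
    leftovers = []
    for page, block in clusters.items():
        for code, peer in block.items():
            if urlsplit(peer).path.startswith(removed_prefix):
                leftovers.append((page, code, peer))
    return sorted(leftovers)
```

For ccTLD or subdomain structures the comparison would need to be on the host rather than the path.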

13. Audit Rubric

The audit rubric has three parts: a per-page audit run on a sample, a site-wide audit, and a 90-day post-deploy verification.

13.1 Per-Page Audit (10 Items)

Run this audit on a sample of representative pages: one per locale, plus the homepage, plus one paginated page if applicable.

| # | Criterion | Pass / Fail |
|---|---|---|
| HL1 | Hreflang block present in <head> (or XML sitemap entry, or HTTP header) | |
| HL2 | Self-reference present in cluster | |
| HL3 | Every declared peer URL returns HTTP 200 | |
| HL4 | Every declared peer URL is self-canonical | |
| HL5 | Current page is self-canonical (canonical points to current URL) | |
| HL6 | Every declared peer URL declares the current page in return | |
| HL7 | Language codes are valid ISO 639-1 | |
| HL8 | Country codes (where present) are valid ISO 3166-1 alpha-2 | |
| HL9 | x-default present (if cluster has multiple locales) and points to a valid URL | |
| HL10 | Tags render in first-byte HTML, not injected by JavaScript | |

Maximum score per page: 10. Pass threshold: 10 of 10. A single failure means the cluster is partially broken for that page.
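
Several of the per-page criteria can be checked from raw HTML alone. The following is a minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical. It covers HL1, HL2, and HL9, plus a shape-only check for HL7 and HL8. The shape check does not confirm that a code exists in ISO 639-1 or ISO 3166-1, so a value such as en-uk would still pass it, and the sketch does not confirm that the tags sit inside <head>.

```python
import re
from html.parser import HTMLParser

# Shape check only: two-letter language, optional four-letter script, optional two-letter region.
CODE_SHAPE = re.compile(r"^[a-z]{2}(-[a-z]{4})?(-[a-z]{2})?$", re.IGNORECASE)


class HreflangCollector(HTMLParser):
    """Collect <link rel="alternate" hreflang="..." href="..."> tags from raw HTML."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "alternate" in rel and attributes.get("hreflang"):
            self.alternates[attributes["hreflang"]] = attributes.get("href")


def audit_page(html, page_url):
    """Return pass/fail for HL1, HL2, HL7/HL8 (shape only), and HL9."""
    collector = HreflangCollector()
    collector.feed(html)
    alternates = collector.alternates
    codes = [code for code in alternates if code.lower() != "x-default"]
    has_x_default = "x-default" in {code.lower() for code in alternates}
    return {
        "HL1_block_present": bool(alternates),
        "HL2_self_reference": page_url in alternates.values(),
        "HL7_HL8_code_shape": all(CODE_SHAPE.match(code) for code in codes),
        # x-default is only required once the cluster has multiple locales.
        "HL9_x_default": len(codes) < 2 or has_x_default,
    }
```

Because the input is the unrendered response body, a page whose tags are injected by JavaScript would fail HL1 here, which lines up with HL10.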

13.2 Site-Wide Audit (10 Items)

| # | Criterion | Pass / Fail |
|---|---|---|
| HS1 | URL structure decision documented and consistent (ccTLD, subdir, subdomain) | |
| HS2 | Implementation method decision documented (head, sitemap, header) | |
| HS3 | XML sitemap includes all locale URLs (if sitemap method) | |
| HS4 | No canonical-hreflang conflict across audited sample (1000+ URL crawl) | |
| HS5 | Screaming Frog or Sitebulb full crawl yields zero broken return tags | |
| HS6 | Zero hreflang to non-200 URLs across full crawl | |
| HS7 | Zero invalid language or country codes across full crawl | |
| HS8 | No JavaScript-injected hreflang (every page renders in first byte) | |
| HS9 | x-default policy applied consistently across cluster types | |
| HS10 | No IP-based auto-redirects between locales | |

Maximum score: 10. World-class: 10 of 10. Threshold for "shipped": 8 of 10, with HS4, HS5, and HS6 as mandatory passes.
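
The scoring rule can be sketched as a small function. The function name and the "not shipped" label are hypothetical; the thresholds and mandatory items come from the rule above.

```python
# Items that must pass regardless of the total score.
MANDATORY = {"HS4", "HS5", "HS6"}


def site_wide_verdict(results):
    """results maps 'HS1'..'HS10' to True (pass) or False (fail)."""
    score = sum(1 for passed in results.values() if passed)
    mandatory_ok = all(results.get(item, False) for item in MANDATORY)
    if score == 10:
        return "world-class"
    if score >= 8 and mandatory_ok:
        return "shipped"
    return "not shipped"
```

A site scoring 9 of 10 with HS4 failing is still "not shipped", because the mandatory items override the numeric threshold.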

13.3 First 90 Days Post-Deploy (8 Checkpoints)

| Day | Check |
|---|---|
| Day 1 | Screaming Frog crawl matches pre-deploy expectations; zero new errors |
| Day 3 | GSC URL Inspection on 5 representative pages per locale shows correct hreflang in rendered HTML |
| Day 7 | nginx access logs confirm Googlebot is crawling all locale URLs |
| Day 14 | GSC Performance > Country filter shows impressions in target countries on target URLs |
| Day 21 | Sitebulb scheduled audit confirms cluster is stable |
| Day 30 | Comparison of organic landing pages by country in GA4: locale-matched URLs are the primary landing pages per country |
| Day 60 | Rank tracker per country shows locale URLs ranking, not the wrong-locale URL |
| Day 90 | Full re-audit: any drift from baseline triggers cluster repair |

13.4 Audit Decision Tree

Is hreflang present anywhere on the site?
  NO  -> Decide: does site need hreflang?
         YES -> Full install (Sections 4-11).
         NO  -> Skip framework.
  YES -> Continue.

Does Screaming Frog or Sitebulb crawl show return-tag errors?
  YES -> Cluster is broken. Repair to zero return-tag errors before any other work.
  NO  -> Continue.

Are any URLs in "Duplicate, Google chose different canonical" in GSC?
  YES -> Canonical-hreflang conflict probable. Section 8 remediation.
  NO  -> Continue.

Do all locale URLs return HTTP 200?
  YES -> Continue.
  NO  -> Fix non-200 URLs in cluster.

Are all language and country codes valid?
  YES -> Continue.
  NO  -> Fix code errors. Common: en-uk -> en-GB.

Is x-default present where needed?
  YES -> Continue.
  NO  -> Add x-default per Section 6.

Are tags in first-byte HTML?
  YES -> Continue.
  NO  -> Server-render hreflang. See framework-contentfirst.md.

Site passes audit. Move to monitoring per Section 12.
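
The decision tree above is an ordered list of checks where the first failing check determines the next action. A minimal sketch of that ordering follows; the finding keys and the function name are hypothetical, and the action strings are taken from the tree.

```python
# Ordered checks: (finding key, answer that lets the audit continue, action otherwise).
STEPS = [
    ("hreflang_present", True,
     "Decide whether the site needs hreflang; if yes, full install (Sections 4-11)."),
    ("return_tag_errors", False,
     "Repair to zero return-tag errors before any other work."),
    ("duplicate_canonical_in_gsc", False,
     "Canonical-hreflang conflict probable. Section 8 remediation."),
    ("all_locale_urls_200", True, "Fix non-200 URLs in cluster."),
    ("codes_valid", True, "Fix code errors. Common: en-uk -> en-GB."),
    ("x_default_present", True, "Add x-default per Section 6."),
    ("tags_in_first_byte_html", True, "Server-render hreflang."),
]


def next_action(findings):
    """Walk the tree top to bottom and return the first action that applies."""
    for key, expected, action in STEPS:
        if findings[key] != expected:
            return action
    return "Site passes audit. Move to monitoring per Section 12."
```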

13.5 Failure Severity Classification

failure_severity:

  critical_blocks_indexing:
    - canonical_hreflang_conflict
    - hreflang_in_body_only
    - javascript_injected_only
    - all_peer_urls_404
    impact: "cluster is invisible to Google for affected URLs"
    timeline: "fix within 24 hours; rank loss within 2-4 weeks"

  high_partial_function:
    - missing_self_reference
    - broken_return_tags
    - invalid_iso_codes
    - hreflang_to_redirect
    impact: "cluster partially discarded; some locales not surfacing"
    timeline: "fix within 1 week"

  medium_suboptimal:
    - missing_x_default
    - inconsistent_method_across_pages
    - sitemap_out_of_sync_with_head
    impact: "edge cases not optimally handled"
    timeline: "fix within 4 weeks"

  low_cosmetic:
    - uppercase_lowercase_inconsistency_in_codes
    - extra_whitespace_in_tags
    impact: "no functional impact; tidiness issue"
    timeline: "fix during next template refactor"
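
The classification above can drive a triage queue. The sketch below maps each error type to its severity and a fix-by date; the function name is hypothetical, and the day counts are a direct reading of the timelines above (24 hours, 1 week, 4 weeks), with no deadline for cosmetic issues.

```python
from datetime import date, timedelta

SEVERITY = {
    "canonical_hreflang_conflict": "critical",
    "hreflang_in_body_only": "critical",
    "javascript_injected_only": "critical",
    "all_peer_urls_404": "critical",
    "missing_self_reference": "high",
    "broken_return_tags": "high",
    "invalid_iso_codes": "high",
    "hreflang_to_redirect": "high",
    "missing_x_default": "medium",
    "inconsistent_method_across_pages": "medium",
    "sitemap_out_of_sync_with_head": "medium",
    "uppercase_lowercase_inconsistency_in_codes": "low",
    "extra_whitespace_in_tags": "low",
}
FIX_WITHIN_DAYS = {"critical": 1, "high": 7, "medium": 28, "low": None}
ORDER = ["critical", "high", "medium", "low"]


def triage(error_types, found_on):
    """Return (error, severity, due_date) tuples, most severe first."""
    rows = []
    for error in error_types:
        severity = SEVERITY[error]
        days = FIX_WITHIN_DAYS[severity]
        due = found_on + timedelta(days=days) if days is not None else None
        rows.append((ORDER.index(severity), error, severity, due))
    # Sort on severity rank, then name; due dates may be None and are not compared.
    rows.sort(key=lambda row: (row[0], row[1]))
    return [(error, severity, due) for _, error, severity, due in rows]
```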

14. Maintenance Schedule and Report Templates

14.1 Maintenance Schedule

hreflang_maintenance_schedule:

  every_deploy:
    - "CI pipeline runs hreflang-check-bidirectional.sh on changed URLs"
    - "CI pipeline runs hreflang-check-canonical.sh on changed URLs"
    - "CI pipeline runs hreflang-check-codes.sh on changed templates"
    - "fail the deploy if any script returns errors"

  weekly:
    - "Screaming Frog scheduled crawl with hreflang export"
    - "compare to previous week's baseline"
    - "GSC Page indexing review for duplicate-canonical entries"
    - "URL Inspection spot check on 5 pages per locale"

  monthly:
    - "full Sitebulb or Screaming Frog crawl"
    - "GSC Performance by country review"
    - "nginx log review for crawl coverage per locale"
    - "competitor hreflang spot check (2 to 3 competitors)"
    - "report generation (Section 14.3)"

  quarterly:
    - "full audit per Section 13"
    - "policy review: are URL structure decisions still right"
    - "report generation (Section 14.4)"

  annually:
    - "review locale strategy: add markets, drop markets, deepen localization"
    - "review URL structure: is ccTLD vs subdir vs subdomain still right"
    - "verify ISO codes have not changed (rare but possible)"
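
The every_deploy gate can be sketched as a small runner that executes each check and fails the deploy if any check fails. The script names come from the schedule above; their arguments and their exit-code contract (non-zero on any error) are assumptions, as are the function names.

```python
import subprocess
import sys

# Script names from the schedule above; invocation details are assumed.
CHECKS = [
    ["./hreflang-check-bidirectional.sh"],
    ["./hreflang-check-canonical.sh"],
    ["./hreflang-check-codes.sh"],
]


def run_checks(commands):
    """Run every check, even after a failure, and return the commands that failed."""
    failed = []
    for command in commands:
        completed = subprocess.run(command)
        if completed.returncode != 0:
            failed.append(command)
    return failed


def gate(commands):
    """Return the exit code the CI step should use: 0 if all checks pass, else 1."""
    failures = run_checks(commands)
    for command in failures:
        print("hreflang check failed:", " ".join(command), file=sys.stderr)
    return 1 if failures else 0
```

A CI step would end with `sys.exit(gate(CHECKS))`. Running all checks before failing, rather than stopping at the first error, gives one complete report per deploy attempt.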

14.2 New Locale Launch Schedule

new_locale_launch_schedule:

  week_minus_4:
    - "decide locale (language plus country)"
    - "decide URL structure for new locale"
    - "translation and localization begins"

  week_minus_2:
    - "all new locale URLs built and live in staging"
    - "Screaming Frog crawl of staging confirms cluster well-formed"
    - "existing locale templates updated in staging to include new peer"

  week_minus_1:
    - "staging full validation"
    - "GSC property prepared for new locale (URL-prefix property for the new subdirectory or subdomain)"

  week_zero_launch:
    - "deploy"
    - "sitemap updated and submitted to GSC"
    - "URL Inspection on representative pages"
    - "IndexNow submission if Bing targeting matters"

  week_plus_1:
    - "Screaming Frog crawl of production"
    - "GSC Page indexing check for new URLs entering index"
    - "rank tracker baseline established for new country"

  week_plus_2:
    - "Performance by country check"
    - "verify Googlebot is crawling new locale URLs"

  week_plus_4:
    - "first month-end report"
    - "investigation of any unexpected behavior"

  week_plus_12:
    - "first quarter retrospective"
    - "decision: is the new locale tracking expected trajectory"

14.3 Monthly Report Template

# Hreflang Health Report for Month YYYY-MM

## Cluster Health

- Total locale URLs: N
- Total clusters: M
- Screaming Frog hreflang error count: X (previous month: Y, delta: Z)

## Error Breakdown

| Error Type | Count This Month | Count Previous | Delta | Severity |
|---|---|---|---|---|
| Missing self-reference | | | | High |
| Missing return links | | | | High |
| Non-200 hreflang URL | | | | High |
| Invalid language code | | | | High |
| Canonical-hreflang conflict | | | | Critical |
| Missing x-default | | | | Medium |

## Performance by Country

| Country | Impressions | Clicks | Locale-Matched URL Rate |
|---|---|---|---|
| US | | | percent |
| GB | | | percent |
| DE | | | percent |
| FR | | | percent |

## Coverage by Locale

| Locale | URLs in Index | Pages with Errors | Locale Health |
|---|---|---|---|
| en-US | | | |
| en-GB | | | |
| de-DE | | | |
| fr-FR | | | |

## Actions This Month

- Fixes deployed: list
- New errors discovered: list
- Open items for next month: list

## Recommendations

- short list
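
The "Locale-Matched URL Rate" column in the template is not defined in this section. One workable reading, assumed here, is the percentage of a country's impressions that landed on URLs under that country's expected locale prefix. The sketch below computes it from rows of (country, landing URL, impressions); the `EXPECTED_PREFIX` mapping and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from country to the locale path prefix expected to rank there.
EXPECTED_PREFIX = {"US": "/en-us/", "GB": "/en-gb/", "DE": "/de-de/", "FR": "/fr-fr/"}


def locale_matched_rate(rows):
    """
    rows: iterable of (country, landing_url, impressions).
    Returns {country: percent of impressions on the expected locale prefix}.
    """
    matched = {}
    total = {}
    for country, url, impressions in rows:
        prefix = EXPECTED_PREFIX.get(country)
        if prefix is None:
            continue  # Country is not a target market.
        total[country] = total.get(country, 0) + impressions
        if urlsplit(url).path.startswith(prefix):
            matched[country] = matched.get(country, 0) + impressions
    return {
        country: round(100 * matched.get(country, 0) / count, 1)
        for country, count in total.items()
        if count
    }
```

A rate well below 100 for a country suggests the wrong-locale URL is surfacing there, which is the condition the monthly performance_by_country check is meant to catch.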

14.4 Quarterly Audit Report Template

# Hreflang Quarterly Audit for QN YYYY

## Executive Summary

- One-paragraph summary of cluster health, key changes, and trajectory.

## Audit Rubric Scores

- Per-page audit: average X / 10 across sample of N pages
- Site-wide audit: X / 10
- 90-day post-deploy checkpoints: list pass/fail

## Strategic Questions

- Is the URL structure still right for the business in Quarter N?
- Are there markets to add or drop?
- Is the implementation method still right (head vs sitemap vs header)?
- Are there pagination changes that require hreflang updates?

## Issue Trend

- Critical errors: 13-week trend
- High errors: 13-week trend
- New issue types introduced this quarter: list

## Performance by Country

- Per-country impressions trend
- Per-country click trend
- Per-country locale-matched URL surface rate

## Recommendations

- Short list with owners and deadlines

14.5 Incident Response Template

When a cluster breaks unexpectedly (rank loss in a locale, sudden duplicate-canonical errors, etc.):

# Hreflang Incident YYYY-MM-DD

## Symptom

- What was observed (rank loss, GSC alert, third-party tool flag).
- When first noticed.
- What locale(s) affected.

## Diagnosis

- Screaming Frog or Sitebulb diagnosis result.
- Specific error type identified.
- Probable cause (recent deploy, template change, locale change).

## Impact Assessment

- Number of URLs affected.
- Estimated traffic loss (per locale).
- Estimated revenue loss (per locale).

## Remediation

- Fix applied.
- Deploy date.
- Validation result.

## Post-Mortem

- Root cause.
- Why this was not caught in pre-deploy validation.
- Process change to prevent recurrence.

End of Framework Document

Document version: 1.0

Companion documents:

Phase 2 siblings scheduled: framework-multilingual-content, framework-localization-process, framework-cross-border-ecommerce, framework-geo-targeting-without-hreflang.

Sources cited in this framework:

