If you have a multilingual website and duplicate content issues, you’re on the right page.
We’ll show you how hreflang can solve duplicate content problems, discussing:
- What hreflang is and how it works
- When to use hreflang and how to write it correctly [codes included]
- When Google can ignore your hreflang tags or even penalize you for it
- And how to solve that issue, too.
We also share plenty of insider tips from our experience as an SEO agency.
Let’s dive in.
TL;DR: Hreflang Duplicate Content
- Purpose: Addresses duplicate content issues in multilingual websites using hreflang tags.
- What is Hreflang?: An HTML attribute that helps search engines understand a page's language and regional targeting, crucial for international SEO.
- Function: Prevents duplicate content issues in multilingual sites by presenting the correct page version to the appropriate audience.
- When to Use Hreflang: Necessary for websites with content in multiple languages or tailored for different regions to guide search engines.
- Preventing Duplicate Content Penalties: Correct use of hreflang tags signals intentional duplication for different language-speaking or regional audiences, avoiding typical penalties.
- Writing Hreflang Tags: Includes rel="alternate", language and country codes, and a URL for the alternate version; use correct ISO codes and include self-referential tags.
- Where to Add Tags: Implement in sitemaps, HTTP headers, or directly in HTML; each method has specific use cases.
- Common Mistakes: Misusing hreflang for identical content across regions, technical errors in tag formatting, and inconsistency in hreflang tags can lead to penalties or being ignored by Google.
- Detecting International Duplicate Content: Use Google Search Console and SEO tools like Screaming Frog to identify issues.
- Fixing Duplicate Entries: Manually for small websites or automated solutions for large-scale sites; involves identifying and removing duplicate hreflang tags.
- Key Takeaway: Proper use of hreflang enhances user experience and SEO, but misuse or technical errors can lead to penalties. Seek professional help if needed.
What Is Hreflang?
Hreflang is simultaneously the cornerstone and bane of international SEO strategies.
Let’s explain.
Hreflang is an HTML attribute that tells search engines which language and region a web page targets.
Its primary function is to prevent the common issue of duplicate content in multilingual websites.
That way, the right page version is presented to the right audience.
Insider tip: Multilingual is the keyword here and should not be confused with international.
As an SEO company, we’ve seen many clients do it and fail.
But more on that in a second.
For now, let’s answer this question.
How Does Hreflang Work?
The hreflang attribute can be added in three places: the page’s HTML <head>, the HTTP response header, or the XML sitemap.
It includes language and regional codes, like en-CA for English, Canada, or es-ES for Spanish, Spain.
Implementing these tags correctly allows search engines to crawl and index multiple versions of similar content without penalizing the site for duplicate content.
Insider tip: Hreflang has a huge impact on user experience.
We’ve seen it in our SEO practice over and over again.
Hreflang enhances the browsing experience by directing users to the content in their preferred language and regional setting.
This localization:
- Improves user engagement.
- Boosts the relevance of the content.
And that’s how it can lead to better search engine rankings.
How Do I Know If I Need Hreflang?
You need hreflang when you have content in multiple languages or tailored for different regions.
In this case, hreflang guides search engines in understanding these variations.
For instance, if you have an English page for UK audiences and a Polish page for audiences in Poland, hreflang tags help search engines discern between these two.
That way, search engines like Google can serve the appropriate website version based on the user’s location and language preferences.
Does the Hreflang Tag Prevent Duplicate Content?
Using the hreflang tag can prevent duplicate content issues if you use it correctly.
In this case, you inform search engines like Google that certain pages are indeed duplicates in terms of their content, but they are not “duplicate content” in a negative sense.
These pages are simply tailored for various language-speaking or regional audiences.
Here, hreflang tags act as a signal to search engines.
They indicate that the seeming duplication is intentional and beneficial for users.
This understanding helps prevent the typical penalties associated with duplicate content, such as lower rankings or the omission of pages from search engine indexes.
Insider tip: Hreflang does not remove or hide duplicate content from search engines.
We’ve seen companies try to use hreflang like that, and it never worked.
Instead, hreflang just guides search engines to index and serve the correct language or regional version of the content to users based on their settings or location.
How to Write Hreflang
Now that you understand the importance of hreflang, let’s see how you can write it correctly.
First, you should know that each hreflang tag contains three essential elements, plus one optional value:
- The rel="alternate" attribute: This attribute specifies that the linked document is an alternate version of the page aimed at a different audience (based on language or region).
- Language and country codes: The language code identifies the target language, and the optional country code narrows it to a regional audience. For example, fr-ca indicates French language content tailored for a Canadian audience.
- A URL for the most appropriate alternative: This is the actual link to the page that best serves users in the specified language and region.
- The x-default value (optional): x-default is a reserved hreflang value, not a separate attribute. It specifies a default URL for users whose language or regional preferences do not match any of the other hreflang tags. For example:
<link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
Examples of Hreflang Tags
Let’s look at some practical examples to understand how hreflang tags are structured for different scenarios:
Specific Language and Country Combination
<link rel="alternate" hreflang="fr-fr" href="https://www.example.fr/" />
This tag directs French-speaking users in France to the French version of the website.
Generic Language Site
<link rel="alternate" hreflang="en" href="https://www.example.com/" />
Here, the tag targets all English-speaking users, regardless of their country.
And here’s another language example:
<link rel="alternate" hreflang="de" href="https://www.example.com/de/" />
This tag is for all German-speaking users without specifying a particular country.
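If you maintain several language versions, generating the tags from a single mapping helps keep them identical on every page. Here’s a minimal Python sketch of that idea. The helper name build_hreflang_tags is our own invention for illustration, and the URLs are the placeholder example.com addresses used above.

```python
from html import escape


def build_hreflang_tags(alternates: dict[str, str]) -> list[str]:
    """Return one <link> tag per (hreflang value, URL) pair.

    `alternates` is a hypothetical mapping such as {"fr-fr": "https://..."}.
    Values are HTML-escaped so stray quotes cannot break the attribute.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    ]


# Example usage: a French page, a generic English page, and an x-default fallback.
tags = build_hreflang_tags(
    {
        "fr-fr": "https://www.example.fr/",
        "en": "https://www.example.com/",
        "x-default": "https://www.example.com/",
    }
)
for tag in tags:
    print(tag)
```

Printing the same list into every language version of a page also gives each page its self-referential tag for free.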
Best Practices for Writing Hreflang Tags
- Use the correct ISO codes: ISO 639-1 for languages (such as fr) and ISO 3166-1 alpha-2 for countries (such as ca).
- Each page should include a hreflang tag for itself (a self-referential tag), indicating its own language and regional targeting.
- Keep your hreflang tags updated to reflect changes in your website’s content structure or target audience.
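A quick format check can catch malformed values before they go live. The sketch below is a simplification under stated assumptions: it only tests the shape of the value (two-letter language, optional two-letter country, or x-default). It does not confirm that the codes are actually assigned in the ISO lists, and it ignores script subtags. The function name is hypothetical.

```python
import re

# Shape check only: "xx", "xx-yy", or "x-default" (case-insensitive).
HREFLANG_PATTERN = re.compile(r"^(x-default|[a-z]{2}(-[a-z]{2})?)$", re.IGNORECASE)


def looks_like_valid_hreflang(value: str) -> bool:
    """Return True if `value` has the shape language[-country] or is x-default."""
    return bool(HREFLANG_PATTERN.match(value.strip()))


# Example usage: print the shape check result for a few candidate values.
for candidate in ["fr-ca", "en", "x-default", "en_GB", "french"]:
    print(candidate, looks_like_valid_hreflang(candidate))
```

Because this is a shape check, a swapped value still passes. You would need a lookup against the real ISO code lists to catch that kind of mistake.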
Insider tip: As an SEO agency, we have also implemented hreflang tags via XML sitemaps or HTTP headers. We advise you to do the same, especially if you have a large, complex website.
That brings us to the next point.
Where to Add Your Hreflang Tags
Implementing hreflang tags correctly ensures that search engines accurately serve your content to the intended audience.
We use three primary methods to add hreflang tags: in your sitemaps, in the HTTP response headers, or directly on the webpage itself.
Let’s see how that looks:
1. Hreflang in Your Sitemaps
Adding hreflang tags to your sitemap is often considered the best practice, as it avoids adding extra load to each page request. Here’s how you can implement it for Spanish and French versions of a website:
xml
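<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.example.com/espanol/</loc>
    <xhtml:link rel="alternate" hreflang="es" href="https://www.example.com/espanol/" />
    <xhtml:link rel="alternate" hreflang="fr" href="https://www.example.com/francais/" />
  </url>
</urlset>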
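In this example, the sitemap entry notifies Google about the Spanish version of the page and its French alternative. The xhtml: prefix only works if the sitemap’s root <urlset> element declares the xmlns:xhtml namespace. Repeat the entry for every page on your site and every international version, so the French URL gets its own <url> block listing the same alternates.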
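Writing one block per language version by hand gets tedious fast, so here’s a rough Python sketch that generates the whole sitemap from one mapping using the standard library. The function name build_sitemap is hypothetical and the URLs are the placeholders from the example. The two namespace URIs are the standard sitemap and XHTML namespaces.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(alternates: dict[str, str]) -> str:
    """Return sitemap XML with one <url> entry per language version.

    Every entry lists the full set of alternates, including itself.
    """
    # Serialize the sitemap namespace as the default and XHTML as "xhtml:".
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in alternates.values():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for code, alt_url in alternates.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": code, "href": alt_url},
            )
    return ET.tostring(urlset, encoding="unicode")


# Example usage with the Spanish and French pages from the example.
sitemap_xml = build_sitemap(
    {
        "es": "https://www.example.com/espanol/",
        "fr": "https://www.example.com/francais/",
    }
)
print(sitemap_xml)
```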
2. Hreflang in Your HTTP Headers
Hreflang tags can also be added to the HTTP headers of your web pages. This method is particularly useful for non-HTML files like PDFs. Here’s an example for Spanish and French pages:
http
Link: <https://www.example.com/espanol/>; rel="alternate"; hreflang="es"
Link: <https://www.example.com/francais/>; rel="alternate"; hreflang="fr"
Each link header specifies the alternate version of the content in a different language.
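If your server or framework lets you set response headers in code, the header value can be assembled from the same kind of mapping. This is only a sketch: build_link_header is a hypothetical helper, and it joins all alternates into a single comma-separated Link value, which HTTP treats the same as sending several separate Link headers.

```python
def build_link_header(alternates: dict[str, str]) -> str:
    """Return a single Link header value covering every alternate version."""
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{code}"'
        for code, url in alternates.items()
    ]
    return ", ".join(parts)


# Example usage for a PDF that exists in Spanish and French.
header_value = build_link_header(
    {
        "es": "https://www.example.com/espanol/",
        "fr": "https://www.example.com/francais/",
    }
)
print("Link:", header_value)
```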
3. Hreflang in Your HTML
Finally, you can place hreflang tags directly in the <head> section of your HTML. This method is straightforward and widely used. Here’s how it looks for Spanish and French pages:
html
<link rel="alternate" hreflang="es" href="https://www.example.com/espanol/" />
<link rel="alternate" hreflang="fr" href="https://www.example.com/francais/" />
Each <link> tag in the HTML head indicates an alternative version of the page in a different language.
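To check what a page actually declares, you can pull the hreflang tags back out of its HTML. Below is a small Python sketch using only the standard library parser. The class name HreflangCollector is our own, and the markup fed to it is a made-up sample.

```python
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collect (hreflang, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "alternate" in rel_values and attr.get("hreflang"):
            self.alternates.append((attr["hreflang"].lower(), attr.get("href")))


# Example usage on a sample <head>; the stylesheet link is skipped.
collector = HreflangCollector()
collector.feed(
    '<head>'
    '<link rel="stylesheet" href="/style.css">'
    '<link rel="alternate" hreflang="es" href="https://www.example.com/espanol/" />'
    '<link rel="alternate" hreflang="fr" href="https://www.example.com/francais/" />'
    '</head>'
)
print(collector.alternates)
```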
Can Google Take Down Your Hreflang?
We’ve seen Google disregard hreflang tags in different scenarios across our clients. So, ensure you’re not making the same mistakes.
1. Your Website Is International (But Not Multilingual)
Firstly, let’s clarify the often-confused concepts of multilingualism and internationalism.
- A multilingual website operates in more than one language.
- An international website operates across many countries.
Remember: Not all international websites are multilingual. For example, Australian users, British-based users, and Canadian users all speak the English language, even if your international business spans different continents.
Insider tip: If your content is in the same language variation and is identical across different region variants (like “en-US,” “en-GB,” and “en-AU” for English speakers in different countries), using hreflang is not a valid strategy because:
- Hreflang tags signal to Google that content variants have been specifically tailored for different native languages or regions.
- Misusing hreflang to geo-target identical content pieces can be seen as an attempt to manipulate search rankings.
Warning: If Google detects that hreflang tags are being misused to mask identical content across different regional versions, it can lead to penalties.
We’ve talked to many business owners who made this mistake and faced lower rankings or even the removal of duplicate pages from the search index.
Plus, sending confusing signals to Google is never a good idea.
Insider tip: You can have different content translations in separate languages.
Here’s Google explaining it themselves:
2. You’re Making Technical Mistakes
These errors include:
- Incorrect language and country code sequence: The proper format is the language code followed by the country code. Swapping them changes the meaning of the tag or causes it to be overlooked. For example, writing "ca-fr" is wrong if you want to target French speakers in Canada, because it reads as Catalan for France. The correct version is "fr-ca."
- Using only a country code without a language code: A language code must always precede a country code. Google does not derive a default language from a standalone country code. For example, using just "gb" to target the United Kingdom is a mistake; the correct value is "en-gb." A standalone language code such as "de" for German is fine.
- Using broad regional codes instead of specific country codes: Google prefers specific country codes over regional or continental codes. An incorrect version uses es-la for Latin America (LA is actually the country code for Laos) instead of a specific country code like es-mx for Mexico.
- Canonicalizing a page with hreflang tags: If a page with hreflang tags also has a canonical tag pointing to a different page, it may not be indexed correctly and can be excluded from search results.
- Inconsistent hreflang tags: Ensure consistency in your hreflang tags for each language version and country combination. For example, specifying two different URLs as the alternative for it-it (Italian for Italy) will cause both to be ignored.
- Broken or disallowed hreflang URLs: URLs specified in hreflang tags that are broken or blocked in robots.txt will be disregarded.
- Lack of self-referential hreflang tags: Every set of hreflang tags should include a reference to the URL and language-country code of the page itself. For example, a Spanish page targeting Spain (es-es) should include <link rel="alternate" hreflang="es-es" href="http://www.example.es/pagina" /> as part of its hreflang set.
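Two of the mistakes above, inconsistent tags and missing self-references, are easy to check with a script once you have each page’s declared tags. The following Python sketch assumes you already collected them into a dictionary. The function name audit_hreflang and the sample URLs are hypothetical, and the check covers only those two mistakes.

```python
def audit_hreflang(pages: dict[str, list[tuple[str, str]]]) -> list[str]:
    """Return a list of human-readable problems.

    `pages` maps each page URL to the (hreflang, href) pairs it declares.
    """
    problems = []
    for page_url, annotations in pages.items():
        # Group target URLs by hreflang value to spot conflicting alternates.
        targets_by_code: dict[str, set[str]] = {}
        for code, href in annotations:
            targets_by_code.setdefault(code.lower(), set()).add(href)
        for code, hrefs in targets_by_code.items():
            if len(hrefs) > 1:
                problems.append(f"{page_url}: conflicting URLs for {code}")

        # Every page should list itself among its alternates.
        declared_urls = {href for _, href in annotations}
        if page_url not in declared_urls:
            problems.append(f"{page_url}: missing self-referential tag")
    return problems


# Example usage with two sample pages.
sample_pages = {
    "https://www.example.es/pagina": [
        ("es-es", "https://www.example.es/pagina"),
        ("it-it", "https://www.example.it/a"),
        ("it-it", "https://www.example.it/b"),
    ],
    "https://www.example.it/a": [
        ("es-es", "https://www.example.es/pagina"),
    ],
}
for problem in audit_hreflang(sample_pages):
    print(problem)
```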
How to Find International Duplicate Content Issues?
If you don’t want to attract Google’s wrath, you need to sleuth out international duplicate content.
We’ll take you through two methods we’re also using at our SEO agency:
1. Use Google Search Console
Google Search Console (GSC) is a powerful tool for detecting international duplicate content. Here’s how you can use it:
- Access the GSC performance report for your website: This report shows how your pages perform in different countries.
- Filter by country and directory: If your website uses directories for regional content (like /fr/ for France or /jp/ for Japan), apply filters in GSC to analyze how these pages perform in different countries. For example, you can check if pages in the /fr/ directory are appearing in search results in Japan, which they shouldn’t.
- Review non-dominant country performance: For websites with multiple international domains, analyze the performance of each country domain in non-dominant countries. For instance, in your French domain, check which pages or queries are appearing in Japanese search results.
- Identify ranking anomalies: Look for instances where a page intended for one region is outranking the local page in another region. For example, a page in the /fr/ directory outranking the same page in the /jp/ directory in Japanese search results indicates a duplicate content issue.
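If you export the performance data, the directory-versus-country check above can be scripted. Treat the following Python sketch as an illustration under assumptions: the row format (page, country, clicks), the country labels, and the directory-to-country mapping are all hypothetical stand-ins, not the exact shape of a Search Console export.

```python
from urllib.parse import urlparse

# Hypothetical mapping from a site directory to the country it targets.
DIRECTORY_COUNTRY = {"fr": "France", "jp": "Japan"}


def find_misplaced_pages(rows: list[dict], directory_country: dict[str, str]) -> list[dict]:
    """Return rows where a regional page received traffic from another country."""
    flagged = []
    for row in rows:
        # The first path segment is treated as the regional directory.
        segments = urlparse(row["page"]).path.strip("/").split("/")
        expected = directory_country.get(segments[0])
        if expected is not None and row["country"] != expected:
            flagged.append(row)
    return flagged


# Example usage with made-up rows.
sample_rows = [
    {"page": "https://www.example.com/fr/produits/", "country": "Japan", "clicks": 12},
    {"page": "https://www.example.com/fr/produits/", "country": "France", "clicks": 300},
    {"page": "https://www.example.com/jp/seihin/", "country": "Japan", "clicks": 80},
]
for row in find_misplaced_pages(sample_rows, DIRECTORY_COUNTRY):
    print(row["page"], "appears in", row["country"])
```

Some cross-border traffic is normal, so in practice you would probably add a click threshold before treating a row as a problem.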
2. Use an SEO Tool for Duplicate Content Checks
Screaming Frog is a popular SEO tool with a duplicate content checker feature. Here’s how to use it:
- Configuring the tool: Go to Configuration > Content > Duplicates in Screaming Frog. You can adjust the duplication percentage threshold; a common setting is 90% to account for minor localized differences.
- Crawl your domain or directories: If your website is organized into directories for each region (like /it/ for Italy or /es/ for Spain), crawling your domain with Screaming Frog will identify duplicate content across these directories.
- Crawl multiple websites: If you operate multiple websites for different regions, you can crawl them either by listing all URLs across domains or by including the XML sitemap for each domain in your crawl configuration.
Example Scenario
Imagine you have a website with different regional sections for Italy, Spain, and Japan. By applying these methods, you might find:
- In Google Search Console, pages from the /it/ directory are appearing in search results in Spain, indicating a potential issue.
- Screaming Frog’s content duplication checker reveals a high percentage of similarity between content in the /es/ and /jp/ directories, suggesting a need for more localized content in these regions.
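To get a feel for what a similarity threshold means, you can compare two pieces of page text yourself. The Python sketch below uses the standard library’s difflib as a simple stand-in. It is not how Screaming Frog computes its score, and the function names are hypothetical. The 0.90 default mirrors the 90% setting mentioned above.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Return a 0..1 similarity ratio based on matching word sequences."""
    return SequenceMatcher(None, text_a.split(), text_b.split()).ratio()


def is_near_duplicate(text_a: str, text_b: str, threshold: float = 0.90) -> bool:
    """Return True when the two texts are at least `threshold` similar."""
    return similarity(text_a, text_b) >= threshold


# Example usage: identical copy on two regional pages versus a real translation.
english_copy = "free shipping on all orders"
print(is_near_duplicate(english_copy, english_copy))
print(is_near_duplicate(english_copy, "envio gratis en todos los pedidos"))
```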
How to Fix Duplicate Entries
Duplicate hreflang entries can be problematic for SEO, leading to confusion for search engines and potentially impacting your site’s performance.
Fortunately, fixing duplicate hreflang entries is straightforward, even at scale.
Here’s what you can consider:
Fixing Duplicate Entries Manually
If you have a smaller website or direct access to your site’s code through a CMS (Content Management System), you can manually remove duplicate hreflang tags. Here’s a simplified process:
- Identify duplicate hreflang entries: Review the source code of your web pages or use the SEO tools above that can identify hreflang issues.
- Remove unnecessary tags: You can work directly in your pages’ HTML code or in the sitemap, depending on where the tags are located.
- Update your CMS or sitemap: After removing the duplicates, update your CMS or sitemap to reflect these changes. This ensures that the corrected version is what search engines will crawl and index.
Automated Solutions for Large-Scale Websites
For larger websites, manually fixing hreflang tags can be time-consuming.
In such cases, automated solutions like SEO execution platforms can be invaluable. Here’s how most of these platforms work:
- In your automation platform, choose the element you need to optimize. In this case, it’s the hreflang tags in your site’s code.
- Select the action to be taken – in this case, "Delete" the duplicate hreflang entries.
- Specify the XPath location of the duplicated hreflang entries that need removal. XPath is a query language for selecting elements and attributes in an XML document (or in XML-like markup such as an HTML page).
Example Scenario
Imagine you have a website with duplicate hreflang tags for Spanish (Spain) and French (France) versions:
html
<link rel="alternate" hreflang="es-es" href="https://www.example.com/es/" />
<link rel="alternate" hreflang="es-es" href="https://www.example.com/es/" />
<link rel="alternate" hreflang="fr-fr" href="https://www.example.com/fr/" />
<link rel="alternate" hreflang="fr-fr" href="https://www.example.com/fr/" />
In this case, you would identify and remove the second occurrence of each hreflang tag to resolve the duplication:
html
<link rel="alternate" hreflang="es-es" href="https://www.example.com/es/" />
<link rel="alternate" hreflang="fr-fr" href="https://www.example.com/fr/" />
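The same cleanup can be sketched in Python with an XPath-style selector. This is an illustration, not how any particular SEO platform works. It assumes the <head> markup is well-formed enough to parse as XML, which real-world HTML often is not, and the function name remove_duplicate_hreflang is hypothetical.

```python
import xml.etree.ElementTree as ET


def remove_duplicate_hreflang(head_markup: str) -> str:
    """Drop repeated hreflang <link> tags, keeping the first occurrence."""
    head = ET.fromstring(head_markup)
    seen = set()
    # Select direct <link> children that are alternates with an hreflang value.
    for link in head.findall("./link[@rel='alternate'][@hreflang]"):
        key = (link.get("hreflang").lower(), link.get("href"))
        if key in seen:
            head.remove(link)
        else:
            seen.add(key)
    return ET.tostring(head, encoding="unicode")


# Example usage with the duplicated Spanish and French tags from the scenario.
duplicated_head = (
    '<head>'
    '<link rel="alternate" hreflang="es-es" href="https://www.example.com/es/" />'
    '<link rel="alternate" hreflang="es-es" href="https://www.example.com/es/" />'
    '<link rel="alternate" hreflang="fr-fr" href="https://www.example.com/fr/" />'
    '<link rel="alternate" hreflang="fr-fr" href="https://www.example.com/fr/" />'
    '</head>'
)
cleaned_head = remove_duplicate_hreflang(duplicated_head)
print(cleaned_head)
```

The key combines the hreflang value and the URL, so only exact repeats are removed. Two different URLs for the same hreflang value are left in place for you to resolve by hand.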
Wrapping Up
As you can see, hreflang tags can be your friends or your worst enemies.
The solution is, as always, to use SEO strategies that don’t manipulate search engine algorithms.
Our experience working in SEO has taught us that as long as you bring value and relevance to your clients, everything will work out great.
So, if you’re having difficulties and need a helping hand, let’s schedule a free strategy call.
We’ll discuss your SEO-related issues and find the most effective solutions together.