Canonicalization

Canonicalization, sometimes referred to as standardization or normalization, is a process for converting data that has more than one possible representation into a "standard", "normal", or canonical form.

 

Put simply, the rel=canonical tag is a way to tell Google that one URL is equivalent to another URL, for search purposes. Typically, a URL (B) is a duplicate of URL (A), and the canonical tag points to (A). The tag, a <link> element with rel="canonical" and an href set to URL (A), appears on the page that generates URL (B), in the <head></head>.

Google’s support document on rel=canonical is actually pretty good. The subject of duplicate content is complex.

 

For example, you shouldn’t include ‘duplicate’ pages within a sitemap. If a page can be reached by two different URLs, for example http://example.com and http://www.example.com (and they both resolve with a ‘200’ response), then only the single preferred canonical version should be included in the sitemap.
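The sitemap rule above can be sketched in a few lines of Python. This is only an illustration: the function name sitemap_urls and the canonical_of mapping (every known URL mapped to its preferred version) are hypothetical, and a real site would build that mapping from its own crawl or CMS data.

```python
def sitemap_urls(canonical_of):
    """Return the sorted, de-duplicated canonical URLs to list in a sitemap.

    canonical_of is a hypothetical mapping from every known URL to its
    preferred canonical URL; a canonical page maps to itself.
    """
    return sorted(set(canonical_of.values()))


# Illustrative data only: two hostnames serve the same homepage.
pages = {
    "http://example.com/": "http://www.example.com/",
    "http://www.example.com/": "http://www.example.com/",
    "http://www.example.com/about": "http://www.example.com/about",
}

for url in sitemap_urls(pages):
    print(url)
```

The duplicate http://example.com/ never reaches the sitemap because only the values of the mapping (the preferred versions) are kept.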

 

A canonical tag (aka "rel canonical") is a way of telling search engines that a specific URL represents the master copy of a page. Using the canonical tag prevents problems caused by identical or "duplicate" content appearing on multiple URLs. Practically speaking, the canonical tag tells search engines which version of a URL you want to appear in search results.

 

The canonical tag is included on pages that are duplicates of the specified URL.


Why does canonicalization matter?

Duplicate content is a complicated subject, but when search engines crawl many URLs with identical (or very similar) content, it can cause a number of SEO problems.

  1. First, if search crawlers have to wade through too much duplicate content, they may miss some of your unique content.

  2. Second, large-scale duplication may dilute your ranking ability.

  3. Finally, even if your content does rank, search engines may pick the wrong URL as the "original."

Using canonicalization helps you control your duplicate content.

 

In other words, the canonical tag on the duplicate page carries the URL of the master page.

 

For example, search crawlers might be able to reach your homepage in several ways, such as http://example.com and http://www.example.com.

To a human, all of these URLs represent a single page. To a search crawler, though, every single one of these URLs is a unique "page."
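One way to picture how a site collapses those variants is a small normalization function. The sketch below is an assumption-laden illustration, not a recommendation: it assumes the site prefers HTTPS and the www hostname, and the names normalize_homepage, PREFERRED_SCHEME, PREFERRED_HOST, and HOST_ALIASES are invented for the example.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_SCHEME = "https"            # assumption: the site prefers HTTPS
PREFERRED_HOST = "www.example.com"    # assumption: the www host is preferred
HOST_ALIASES = {"example.com", "www.example.com"}


def normalize_homepage(url):
    """Map any known host/protocol variant onto the single preferred form."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in HOST_ALIASES:
        return url  # not one of our hosts: leave it untouched
    path = parts.path or "/"  # treat an empty path as the root
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


variants = [
    "http://example.com",
    "http://www.example.com",
    "https://example.com/",
]
print({normalize_homepage(u) for u in variants})
```

All three variants normalize to the same string, which is the URL a canonical tag on the homepage would carry under these assumptions.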

 

 

Canonical tag best practices

Duplicate content issues can be extremely tricky, but here are a few important things to consider when using the canonical tag:

  1. Canonical tags can be self-referential

It’s ok if a canonical tag points to the current URL. In other words, if URLs X, Y, and Z are duplicates, and X is the canonical version, it’s ok to put the tag pointing to X on URL X. This may sound obvious, but it’s a common point of confusion.

 

Canonicals are typically used to link a non-canonical page to the canonical version, but they can also be used to link a page to itself. Self-referencing canonicals are beneficial because URLs may get linked to with parameters and UTM tags.

When that happens, Google may pick up the URL with parameters as the canonical version. So a self-referencing canonical lets you specify which URL you want to have recognized as the canonical URL.

Google recommends using self-referencing canonicals as a best practice, but they’re not required in order for Google to pick up on the correct version of a URL.
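A self-referencing canonical for a URL that arrives with tracking parameters could be generated along these lines. This is a minimal sketch: the helper names self_canonical and canonical_tag are hypothetical, and treating every parameter that starts with utm_ as tracking-only is an assumption that should be checked against the site's real parameters.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)  # assumption: only UTM parameters are tracking-only


def self_canonical(url):
    """Return url without tracking parameters and without its fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the <link> element that belongs in the page's <head>."""
    href = html.escape(self_canonical(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


print(canonical_tag("https://www.example.com/shoes?utm_source=news&utm_medium=email"))
```

Parameters that change the content (a color filter, for instance) are kept, so only the tracking noise is removed from the canonical URL.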

 

  2. Proactively canonicalize your homepage

Given that homepage duplicates are very common and that people may link to your homepage in many ways (which you can’t control), it’s usually a good idea to put a canonical tag on your homepage template to prevent unforeseen problems.

 

 

 

  3. Spot-check your dynamic canonical tags

Sometimes bad code causes a site to write a different canonical tag for every version of the URL (completely missing the entire point of the canonical tag). Make sure to spot-check your URLs, especially on e-commerce and CMS-driven sites.
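A spot-check like this can be reduced to one comparison: every variant of the same page should emit the same canonical. The sketch below assumes you have already collected, for one page, the canonical href that each URL variant outputs; the function name inconsistent_canonicals and the sample data are hypothetical.

```python
def inconsistent_canonicals(observed):
    """Return the distinct canonical hrefs if the variants disagree, else [].

    observed maps each URL variant of ONE page to the canonical href
    found in that variant's HTML (hypothetical, pre-collected data).
    """
    distinct = set(observed.values())
    return sorted(distinct) if len(distinct) > 1 else []


# A buggy template that echoes the requested URL into the canonical tag.
buggy = {
    "https://www.example.com/shoes": "https://www.example.com/shoes",
    "https://www.example.com/shoes?sort=price": "https://www.example.com/shoes?sort=price",
}
# A healthy template that always emits the same canonical.
healthy = {
    "https://www.example.com/shoes": "https://www.example.com/shoes",
    "https://www.example.com/shoes?sort=price": "https://www.example.com/shoes",
}

print(inconsistent_canonicals(buggy))
print(inconsistent_canonicals(healthy))
```

An empty result means the variants agree; anything else lists the competing canonicals worth investigating.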

 

  4. Avoid mixed signals

Search engines may ignore a canonical tag or interpret it incorrectly if you send mixed signals. In other words, don’t canonicalize page A --> page B and then page B --> page A. Likewise, don’t canonicalize page A --> page B and then 301 redirect page B --> page A. It’s also generally not a good idea to chain canonical tags (A --> B, B --> C, C --> D), if you can avoid it. Send clear signals, or you force search engines to make bad choices.
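Loops and chains of this kind are easy to detect once the canonical relationships are in a lookup table. The following is a rough sketch under stated assumptions: canonical_of is a hypothetical mapping from each URL to the URL its canonical tag points at, a URL missing from the mapping is treated as self-referencing, and the name trace_canonical is invented for the example.

```python
def trace_canonical(url, canonical_of):
    """Follow canonical pointers from url and classify the result.

    Returns (path, status), where status is "ok" (zero or one hop),
    "chain" (more than one hop), or "loop" (the pointers cycle).
    """
    path = [url]
    while True:
        current = path[-1]
        # A URL with no entry is assumed to be canonical for itself.
        target = canonical_of.get(current, current)
        if target == current:
            break
        if target in path:
            return path + [target], "loop"
        path.append(target)
    hops = len(path) - 1
    return path, ("chain" if hops > 1 else "ok")


loop = {"A": "B", "B": "A"}
chain = {"A": "B", "B": "C", "C": "D"}
clean = {"A": "B"}

print(trace_canonical("A", loop))
print(trace_canonical("A", chain))
print(trace_canonical("A", clean))
```

Any URL that comes back as "loop" or "chain" is a candidate for pointing directly at the final canonical instead.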

 

  5. Be careful canonicalizing near-duplicates

When most people think of canonicalization, they think of exact duplicates. It is possible to use the canonical tag on near-duplicates (pages with very similar content), but proceed with caution. There’s a lot of debate on this topic, but it’s generally ok to use canonical tags for very similar pages, such as a product page that only differs by currency, location, or some small product attribute. Keep in mind that the non-canonical versions of that page may not be eligible for ranking, and if the pages are too different, search engines may ignore the tag.

 

Can the noncanonical versions of a page still rank in search results?

 

  6. Canonicalize cross-domain duplicates

If you control both sites, you can use the canonical tag across domains. Let’s say you’re a publishing company that often publishes the same article across half a dozen sites. Using the canonical tag will focus your ranking power on just one site. Keep in mind that canonicalization will prevent the non-canonical sites from ranking, so make sure this use matches your business case.

 

Canonical tags vs. 301 redirects

One common SEO question is whether canonical tags pass link equity (PageRank, Authority, etc.) like 301 redirects. In most cases, they seem to, but this can be a dangerous question. Keep in mind that these two solutions create two very different results for search crawlers and site visitors.

If you 301 redirect Page A-->Page B, then human visitors will be taken to Page B automatically and never see Page A. If you rel-canonical Page A-->Page B, then search engines will know that Page B is canonical, but people will be able to visit both URLs. Make sure your solution matches the desired outcome.
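The difference is easiest to see in what the server actually sends back. The two functions below are a simplified sketch, not a real web framework API: redirect_response and canonical_response are hypothetical names, and each returns a plain (status, headers, body) tuple for illustration.

```python
import html


def redirect_response(target_url):
    """301: the visitor is sent on to target_url and never sees this page."""
    return 301, {"Location": target_url}, ""


def canonical_response(target_url, body_html):
    """200: the visitor sees this page; the tag tells crawlers which URL is canonical."""
    tag = '<link rel="canonical" href="%s" />' % html.escape(target_url, quote=True)
    page = "<html><head>%s</head><body>%s</body></html>" % (tag, body_html)
    return 200, {"Content-Type": "text/html; charset=utf-8"}, page


print(redirect_response("https://www.example.com/page-b"))
print(canonical_response("https://www.example.com/page-b", "<p>Page A content</p>"))
```

With the redirect, Page A has no body at all; with the canonical tag, Page A is still served in full and only carries a hint for search engines.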

 

How to Audit Your Canonical Tags for SEO

When auditing your canonical tags, there are a number of things worth checking for optimal SEO performance. Here's a checklist:

  • Does the page have a canonical tag?

  • Does the canonical point to the right page?

  • Are the pages crawlable and indexable?

A common mistake is to point the canonical at a URL that is either blocked by robots.txt or set to "noindex". This can send mixed and confusing signals to search engines. A few common ways to inspect and audit your canonical tags are below.

  1. View-source

In most browsers, you can right-click and choose to view the page source, or simply type view-source: followed by the URL into the address bar, like this: view-source:https://moz.com/learn/seo/canonicalization

In the source code, search for the canonical tag in the <head>. If present, it is a <link> element with rel="canonical" and an href set to the canonical URL.
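The same lookup can be automated with Python's standard-library HTML parser. This is a minimal sketch that assumes you already have the page source as a string; the class name CanonicalFinder, the helper find_canonicals, and the sample markup are all made up for the example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            # rel may hold several space-separated tokens, in any letter case.
            rels = (attr.get("rel") or "").lower().split()
            if "canonical" in rels and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html_text):
    finder = CanonicalFinder()
    finder.feed(html_text)
    finder.close()
    return finder.canonicals


sample = """<html><head>
<title>Shoes</title>
<link rel="canonical" href="https://www.example.com/shoes" />
</head><body><p>Shoes</p></body></html>"""

print(find_canonicals(sample))
```

An empty list means the page has no canonical tag, and more than one entry means the page declares conflicting canonicals; both are worth flagging.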


  2. Use the MozBar

The MozBar is a free SEO toolbar that will easily show you the canonical tag on any given page. After installation, simply hit the Page Analysis tab, then click on "General Attributes" to view any canonical information.

 

  3. Audit in Bulk with Software Solutions

Most SEO site audit software allows you to audit canonical tags in bulk. Moz Pro checks for missing canonical tags, and can do so for hundreds of thousands of pages at a time.
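For a sense of what such a check involves, the audit checklist earlier in this section can be sketched with Python's standard library. This is not how Moz Pro or any other tool works internally; audit_canonical is a hypothetical function, and it assumes you have already gathered the page's canonical href, the URL you expect it to be, the meta robots value of the canonical target, and the lines of the site's robots.txt.

```python
from urllib.robotparser import RobotFileParser


def audit_canonical(canonical, expected, target_meta_robots,
                    robots_txt_lines, user_agent="*"):
    """Return a list of problems found for one page (empty list = clean)."""
    if not canonical:
        return ["page has no canonical tag"]

    problems = []
    if canonical != expected:
        problems.append("canonical points to an unexpected URL")

    # Parse robots.txt rules from text already in hand (no network request).
    robots = RobotFileParser()
    robots.parse(robots_txt_lines)
    if not robots.can_fetch(user_agent, canonical):
        problems.append("canonical target is blocked by robots.txt")

    if "noindex" in target_meta_robots.lower():
        problems.append("canonical target is set to noindex")
    return problems


robots_txt = ["User-agent: *", "Disallow: /private/"]

print(audit_canonical(
    "https://www.example.com/private/shoes",
    "https://www.example.com/private/shoes",
    "noindex, follow",
    robots_txt,
))
```

Running the same function over every crawled page gives a bulk report, with an empty list for each page that passes all three checks.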

 

Common Questions Around Rel=Canonical:

(1) Should I Use Rel=Canonical for Pagination?

I’m not going to repeat all of Google’s answers, but this one is so frequently asked that it deserves more detail. Let’s say you have a series of paginated search results (1, 2, 3… n). These can be considered “thin”, from a search standpoint, so should you rel=canonical page n back to page 1?

Officially, the answer is “no”: Google does not recommend this. They recommend that you either rel=canonical to a “View All” page (if having all results on one page is viable) or that you use rel=prev/next. Rel=canonical can be used in conjunction with rel=prev/next to handle search sorts, filters, etc., but that gets complicated fast. Note that Google announced in 2019 that it no longer uses rel=prev/next as an indexing signal, so treat that part of the advice as historical.

Pagination for SEO is a very tricky subject.

 

 

 

 
