Duplicate content: what it is, why it happens, and how to fix it

Duplicate content is the same (or very similar) content being accessible at more than one URL.

It’s common. It’s not always “bad.” But it often creates an indexing mess: crawlers have to choose which URL is the canonical version, and your ranking signals get split.

Common duplicate content patterns

Here are the ones I see most:

www vs non-www

Both versions resolve and return 200:

https://example.com/page
https://www.example.com/page

Fix it with consistent redirects and a single canonical host.
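
In practice the redirect lives in your web server or CDN config, but the rule itself is easy to sketch. The snippet below is a minimal illustration, not a drop-in implementation: CANONICAL_HOST, HOST_VARIANTS, and redirect_target are hypothetical names, and picking the www host as canonical is only an assumption for the example.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the www host is the one we keep.
CANONICAL_HOST = "www.example.com"
HOST_VARIANTS = {"example.com", "www.example.com"}


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301 to, or None if the host is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in HOST_VARIANTS and host != CANONICAL_HOST:
        # Keep an explicit port if one was given.
        netloc = CANONICAL_HOST if parts.port is None else f"{CANONICAL_HOST}:{parts.port}"
        return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return None


if __name__ == "__main__":
    print(redirect_target("https://example.com/page"))
    print(redirect_target("https://www.example.com/page"))
```

Whichever host you pick, the point is that only one of the two ever answers with a 200.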

HTTP vs HTTPS

If HTTP pages are crawlable, you’re asking for trouble. Force HTTPS with redirects, and make sure canonicals point to HTTPS.

Trailing slash and index pages

These variants can duplicate:

/about
/about/
/about/index.html

Pick one, redirect the rest.
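
Here is a rough sketch of what "pick one" can look like as a normalization rule. It assumes the no-trailing-slash form is the one you keep and that the index file names in INDEX_FILES are the ones your server exposes; both are assumptions for illustration, and normalize_path is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the index file names the server resolves for a directory.
INDEX_FILES = ("index.html", "index.htm", "index.php")


def normalize_path(url: str) -> str:
    """Map /about, /about/ and /about/index.html to a single form (/about)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Drop a trailing index file: /about/index.html -> /about/
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment in INDEX_FILES:
        path = path[: -len(last_segment)]
    # Drop trailing slashes everywhere except the site root.
    if len(path) > 1:
        path = path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for variant in ("/about", "/about/", "/about/index.html"):
        print(normalize_path("https://example.com" + variant))
```

Any request whose URL differs from its normalized form is a candidate for a redirect to that form.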

URL parameters (filters, sessions, tracking)

This is a big one for ecommerce and large sites:

?utm_source=...
?session=...
?sort=price

Some parameters are harmless; others generate a practically infinite URL space. Canonical tags, consistent parameter handling, and careful internal linking help here.
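
To compute the canonical form of a parameterized URL, one option is to strip the parameters that never change the content and sort the rest. This is a minimal sketch: the IGNORED_PARAMS and IGNORED_PREFIXES lists are assumptions based on the examples above (whether something like sort deserves its own indexable URL is a judgment call for your site), and strip_params is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions: parameters treated as not changing the page's content.
IGNORED_PARAMS = {"session", "sort"}
IGNORED_PREFIXES = ("utm_",)


def strip_params(url: str) -> str:
    """Drop ignored parameters, sort the rest, and discard the fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS and not key.startswith(IGNORED_PREFIXES)
    ]
    kept.sort()  # ?b=2&a=1 and ?a=1&b=2 become the same URL
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(strip_params("https://example.com/shoes?utm_source=news&sort=price&color=red"))
```

The output of a function like this is what the canonical tag and your internal links should point to.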

Printer pages and copy pages

Printable versions, AMP remnants, and duplicated templates can create duplicates if they are indexable.

How Google handles duplicates

Most of the time, there is no manual “duplicate content penalty.”

Instead, Google tries to cluster duplicates and pick a representative URL (canonical). If you don’t help, Google will guess. Sometimes it guesses wrong.
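
To get a feel for what clustering means, here is a deliberately crude sketch that groups URLs by a hash of their normalized text. It is not how Google does it (real systems also catch near-duplicates, which an exact hash cannot); fingerprint and cluster_duplicates are hypothetical names, and the page texts are made up.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_duplicates(pages: Dict[str, str]) -> List[List[str]]:
    """Group URLs whose text has the same fingerprint; keep groups of 2+."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    pages = {
        "https://example.com/page": "Hello  World",
        "https://www.example.com/page": "hello world\n",
        "https://example.com/other": "Something else",
    }
    print(cluster_duplicates(pages))
```

Each group is a set of URLs competing to be the representative. Your job is to make the winner obvious.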

Signs you’re dealing with duplicates:

wrong URL ranking for a query
many “Duplicate, Google chose different canonical” statuses in Search Console
index bloat (too many URLs indexed that look the same)

Practical fixes (choose based on root cause)

Use canonical tags

A canonical tag is a hint, not a guarantee. It works best when:

content is truly similar
internal links consistently point to the canonical URL
you don’t block crawling of the canonical target
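
Before trusting a canonical, check what the page actually declares. The sketch below pulls the first link rel="canonical" out of an HTML string using only the standard library. CanonicalFinder and find_canonical are hypothetical names, and the sketch ignores canonicals sent via HTTP headers.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical = attributes["href"]


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    sample = '<html><head><link rel="canonical" href="https://example.com/page"></head></html>'
    print(find_canonical(sample))
```

Comparing this value against the URL your internal links use is a quick way to spot mixed signals.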

Use 301 redirects for true duplicates

If URL A should never exist, redirect it. Redirects are stronger than canonicals when you want consolidation.
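
Once you have a redirect list, it is worth making every duplicate point straight at its final destination instead of hopping through intermediate URLs. A minimal sketch, assuming the redirects are held in a plain source-to-target mapping (flatten_redirects is a hypothetical helper, and the paths are made up):

```python
from typing import Dict


def flatten_redirects(redirects: Dict[str, str]) -> Dict[str, str]:
    """Rewrite each redirect to its final target; raise on loops."""
    flat = {}
    for source, target in redirects.items():
        seen = {source}
        # Follow the chain until the target is not itself redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


if __name__ == "__main__":
    print(flatten_redirects({"/old": "/newer", "/newer": "/newest"}))
```

Here /old ends up pointing directly at /newest, so a crawler needs one hop instead of two.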

Fix internal linking

If your site internally links to duplicates, you reinforce the problem.

This is often the hidden culprit: menus, category filters, and pagination templates generate multiple forms of the same URL.

Clean up low-value pages

If you have many near-identical pages targeting the same intent, consider consolidating. One strong page usually beats five weak variations.

How to find duplicates fast

Start with a crawl that flags canonical inconsistencies, redirect chains, and indexability issues. The SEO Audit Tool is designed for that kind of sweep.
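
If you want to see what such checks amount to, here is a small sketch of canonical consistency checks over crawl output. The CrawlRow shape and the audit function are assumptions for illustration and are not the format of any particular crawler or tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CrawlRow:
    # Hypothetical crawl record: one row per fetched URL.
    url: str
    status: int
    canonical: Optional[str]
    noindex: bool = False


def audit(rows: List[CrawlRow]) -> List[str]:
    """Flag pages whose canonical points at a target that cannot be indexed."""
    by_url = {row.url: row for row in rows}
    issues = []
    for row in rows:
        # Only check live pages that canonicalize to a different URL.
        if row.status != 200 or row.canonical is None or row.canonical == row.url:
            continue
        target = by_url.get(row.canonical)
        if target is None:
            issues.append(f"{row.url}: canonical target not crawled")
        elif target.status != 200:
            issues.append(f"{row.url}: canonical target returns {target.status}")
        elif target.noindex:
            issues.append(f"{row.url}: canonical target is noindex")
        elif target.canonical not in (None, target.url):
            issues.append(f"{row.url}: canonical target points elsewhere")
    return issues


if __name__ == "__main__":
    rows = [
        CrawlRow("https://example.com/a", 200, "https://example.com/b"),
        CrawlRow("https://example.com/b", 301, None),
    ]
    print(audit(rows))
```

Each flagged line is a page whose canonical hint is likely to be ignored, which is where the guessing starts.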

Link back to the glossary

One-line definition: Duplicate Content in the Glossary.

Q&A

Does Google penalize duplicate content?

Usually it's not a penalty; it's a selection problem. Google chooses one version to index and rank, which can reduce visibility for other versions.

What is the best fix for duplicates?

It depends on the cause. Common fixes are canonical tags, 301 redirects, parameter handling, and consolidating pages with overlapping intent.

Can hreflang pages be considered duplicate?

Different language versions are expected, but you should implement hreflang correctly and avoid serving identical content across locales.
