
We’re excited to hear your project.
Let’s collaborate!
Accidentally creating duplicate content in Drupal is like... a cold:
Catching it is as easy as falling off a log.
All it takes is to:
So, what are the “lifebelts” or prevention tools that Drupal “arms” you with for handling this thorny issue?
Here are the 4 modules to use for boosting your site's immune system against duplicate content.
And for getting it fixed, once the harm has already been done:
Let's get down to the nitty-gritty of how Drupal 8 duplicate content “infiltrates” your website.
But first, here are the 2 major categories that these sources fall into:
The malicious ones include all those scenarios where scrapers post content from your website without your consent.
The non-malicious duplicate content can come from:
Also, duplicate content in Drupal can be either:
And since it comes in “many stripes and colors”, here are the 7 most common types of duplicate content:
Has someone copied content from your website and republished it? Do not expect Google to distinguish the copy from its source.
That said, it's your job and yours only to stay diligent and protect the content on your Drupal site from scrapers.
Are there 2 identical versions of your Drupal website available? A www and a non-www one?
Now, that's enough to ring Google's “duplicate content in Drupal” alarm.
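To make the www vs. non-www issue concrete, here is a minimal sketch of the idea: pick one preferred host and rewrite every URL onto it. The function name and the `www.example.com` default are hypothetical, chosen for illustration only; on a live site you would typically enforce this with a permanent redirect at the server or module level instead.

```python
from urllib.parse import urlsplit, urlunsplit


def _strip_www(host: str) -> str:
    """Drop a leading 'www.' so both host variants compare equal."""
    return host[4:] if host.startswith("www.") else host


def canonical_host_url(url: str, preferred_host: str = "www.example.com") -> str:
    """Rewrite url onto preferred_host when it is the same site's other variant.

    preferred_host is a hypothetical placeholder for your own domain.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if _strip_www(host) != _strip_www(preferred_host):
        return url  # a different site altogether: leave it untouched
    # Keep an explicit port if the original URL carried one.
    netloc = preferred_host if parts.port is None else f"{preferred_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(canonical_host_url("https://example.com/blog?page=2"))
```

Both variants now collapse into a single address, so only one version of each page is left for search engines to index.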
So, you've painstakingly put together a list of article submission sites to give your valuable content (blog post, video, article etc.) more exposure.
And now what? Should you just cancel promoting it?
Not at all! Widely syndicated content risks landing on Google's “Drupal 8 duplicate content” radar only if you set no guidelines for those third-party websites.
That is, when these publishers don't place any canonical tags on your submitted content pointing to its original source.
What happens when you overlook such a content syndication agreement? You leave it entirely to Google to track down the source, scanning through all the websites and blogs that your piece of content gets republished on.
And oftentimes it fails to tell the original from its copy.
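A canonical tag is a `<link rel="canonical" href="...">` element in the page's head. As a rough sketch of how you could verify that a syndicated copy really points back to your original, the snippet below parses a page and compares its canonical URL to the source. The class and function names, as well as the example URLs, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def points_to_original(html: str, original_url: str) -> bool:
    """True when the page declares original_url as its canonical source."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical == original_url


# Hypothetical syndicated copy that credits the original article.
copy_html = (
    '<html><head><link rel="canonical" '
    'href="https://www.example.com/blog/my-post"></head><body>...</body></html>'
)
print(points_to_original(copy_html, "https://www.example.com/blog/my-post"))
```

Running a check like this over the list of sites that republish your content is one way to spot the publishers who skipped the tag.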
This is probably one of the sources of duplicate content in Drupal that seems most... harmless to you, right?
And yet, for search engines, multiple printer-friendly versions of the same content translate as: duplicate pages.
Have you made the switch from HTTP to HTTPS?
Entirely?
Or are there:
Make sure you detect all these less obvious sources of identical URLs on your Drupal website.
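One way to hunt for such leftovers, assuming that hard-coded `http://` links in your markup are among them, is to scan a page's HTML for `href` and `src` attributes that still use the old scheme. This is only a sketch; the names below are hypothetical and the sample markup is made up.

```python
from html.parser import HTMLParser


class InsecureLinkFinder(HTMLParser):
    """Collect (tag, url) pairs whose href/src still starts with http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and value.startswith("http://"):
                self.insecure.append((tag, value))


def find_insecure_urls(html: str) -> list:
    """Return every plain-HTTP URL referenced by the given markup."""
    finder = InsecureLinkFinder()
    finder.feed(html)
    return finder.insecure


sample = (
    '<a href="http://example.com/a">a</a>'
    '<img src="https://example.com/i.png">'
    '<link rel="canonical" href="http://example.com/">'
)
for tag, url in find_insecure_urls(sample):
    print(tag, url)
```

Each hit is a candidate for an update to HTTPS or for a redirect.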
Your site is vulnerable to this type of duplicate content “threat” particularly if it's an e-commerce one.
Just think of all those all-too-common scenarios where you display highly similar product descriptions on several different pages of your eStore.
Users themselves can unintentionally generate duplicate content on your Drupal site.
How? They might have different session IDs that keep generating new URLs for the same page.
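To illustrate, here is a small sketch that strips session parameters from a URL so that every visitor's address collapses back into one. The parameter names in `SESSION_PARAMS` are assumptions; check which ones your own site actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of session-tracking query parameters (compared lowercase).
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_params(url: str) -> str:
    """Return url without any session-ID query parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_session_params("https://example.com/shop?item=42&PHPSESSID=abc123"))
```

The cleaned URL is the one you would want to expose as canonical, whatever session the visitor happens to be in.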
What are the tools that Drupal puts at your disposal to detect and eliminate all duplicate content?
Imagine all the functionality of the former Global Redirect module (Drupal 7) “injected” into this Drupal 8 module!
In fact, you can still define your Global Redirect features by just:
Image Source: WEBWASH.net
What this SEO-friendly module does is provide you with a user-friendly interface for managing your URL path redirects:
Summing up: when it comes to handling duplicate content in Drupal, this module helps you redirect all your old URLs to the new paths that you have set up.
This way, you avoid the risk of having the very same content displayed on multiple URL paths.
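Conceptually, a redirect setup boils down to a lookup table from old paths to new ones. The sketch below is not how the Drupal module is implemented; it is a plain illustration with made-up paths, showing how every old address ends up at a single final path and how a redirect loop can be caught early.

```python
# Hypothetical redirect table: old path -> (new path, HTTP status code).
REDIRECTS = {
    "/node/125": ("/blog/my-post", 301),
    "/old-blog/my-post": ("/articles/my-post", 301),
    "/articles/my-post": ("/blog/my-post", 301),
}


def resolve(path: str, redirects: dict, max_hops: int = 5) -> str:
    """Follow redirects until a path with no further redirect is reached.

    Raises ValueError on a loop or on a chain longer than max_hops hops.
    """
    seen = [path]
    while path in redirects:
        path = redirects[path][0]
        if path in seen or len(seen) > max_hops:
            raise ValueError("bad redirect chain: " + " -> ".join(seen + [path]))
        seen.append(path)
    return path


for old_path in ("/node/125", "/old-blog/my-post"):
    print(old_path, "->", resolve(old_path, REDIRECTS))
```

All old addresses resolve to the same final path, which is exactly the "one content, one URL" outcome you are after.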
How about “fighting” duplicate content on your website at a vocabulary level?
In this respect, this Drupal 8 module:
Just admit it now:
How much do you hate the /node/125 type of URL paths?
They're anything but user-friendly.
And this is precisely the role that Pathauto's been invested with:
To automatically generate user-friendly path aliases (e.g. /blog/my-node-title) for a whole variety of content.
Let's say that you want to modify the current “path scheme” on your website with no impact on the existing URLs (you don't want the change to affect users' bookmarks or to confuse the search engines).
Paired with the Redirect module, Pathauto will automatically redirect the old URLs to the new paths, using the HTTP redirect status of your choice.
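To give a feel for what turning a title into an alias like /blog/my-node-title involves, here is a simplified sketch. It is not Pathauto's actual code (the real module offers configurable patterns and more cleanup options); `slugify` and `build_alias` are hypothetical helper names.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lowercase the title, drop accents, and join the words with hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def build_alias(prefix: str, title: str) -> str:
    """Combine a section prefix and a slugified title into a path alias."""
    return f"/{prefix.strip('/')}/{slugify(title)}"


print(build_alias("blog", "My Node Title"))
```

Because the alias is derived from the title by a fixed rule, the same piece of content always lands on the same readable path.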
Personalization is key when you strive to prevent duplicate content in Drupal, right?
And this is precisely what this module here does: it helps you personalize content on your website.
How? Through its 3 main functionalities delivered to you as sub-modules:
Leveraging Natural Language Processing, this last sub-module scans content on your website and alerts you to any signs of duplication detected.
Word of caution: keep in mind that the module is not yet covered by Drupal's security advisory policy!
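How the sub-module scores similarity internally is not covered here. As a stand-in, the sketch below uses a much simpler and well-known technique, word shingles compared with Jaccard similarity, to show how near-duplicate texts can be flagged. The function names and the sample product descriptions are made up for illustration.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


red = "Red cotton shirt with short sleeves and a round neck"
blue = "Blue cotton shirt with short sleeves and a round neck"
print(round(jaccard(red, blue), 2))
```

A score close to 1.0 signals two pages that say almost the same thing; where to set the alert threshold is a judgment call for your own content.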
Setting a goal to ensure 100% unique content on your website is as realistic as... learning a new language in a week.
Instead, you should consider setting up a solid strategy “fueled” by (at least) the 4 modules presented here. One that would help you avoid specific scenarios where entire pages or clusters of pages get duplicated.
Now, that's a far less utopian goal to set, don't you think?