Duplicate content is content on your site that is similar to content on another site. This can happen naturally sometimes, but if you see an unusual number of pages with duplicate content, or too much copied directly from another site, you will probably want to get rid of it.
We all know that Google hates duplicate content, and it is one of the factors that determine how high you rank. You might have been using a duplicate content checker for years.
What is duplicate content?
Duplicate content is the pinnacle of all evil to us search marketers. A website owner will come to us full of enthusiasm about their future online SEO campaign only to discover that their website is plagued with duplicate copy. You usually find that this is more common with e-commerce stores on a low budget where they do not have the time or staff to write descriptions for 200+ products.
So what usually happens? The first thing store owners do is go to the manufacturer’s or a competitor’s website, copy the content from there, and use it on their own website (does it sound familiar?). I always advise clients to write unique copy for their product descriptions, and I know it is a pain in the rear; we have been there ourselves. But if people want their website to succeed, they need to get creative with their writing. I promise you that your hard work will pay off, but you need to be patient.
Denial is not just a river in Egypt…
I have had clients who have denied it, but after we show them the evidence of their sneaky deeds, they eventually and shamefully admit to it. This is followed up with a ‘don’t worry, it is more common than you think’ comment from myself, which usually results in an enormous sigh of relief. The duplicate content filter is easy to get out of because it IS NOT a penalty. It is simply a filter. Once Google revisits the page, its bot will notice that the content has changed and the page should rank in its true position depending on how good that piece of writing is.
TIP – always revisit landing pages that don’t get very many hits by viewing under-performing pages in Google Analytics and Google Webmaster Tools – these are going to be two major tools in your SEO services arsenal. View the past two months’ Analytics data for your landing pages in reverse order of sessions and compare those with your average rank in Webmaster Tools. If a page is under-performing, you have to ask yourself ‘why?’. You can then try different techniques with your content and meta tags, or even try interlinking similar products within the content and introducing semantically related terms.
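The comparison in this tip can be sketched in a few lines of Python. This is only an illustration: it assumes you have already exported sessions per landing page from Analytics and average rank per page from Webmaster Tools. The page paths, figures, and the `max_sessions` cut-off are all hypothetical.

```python
# Hypothetical sample data; in practice these come from your own exports.
analytics_sessions = {
    "/mickey-mouse-bedding": 12,
    "/bedding": 340,
    "/duvet-covers": 8,
}
average_rank = {
    "/mickey-mouse-bedding": 4.2,
    "/bedding": 18.5,
    "/duvet-covers": 35.0,
}


def underperforming_pages(sessions, ranks, max_sessions=20):
    """Return (page, sessions, rank) for low-traffic pages, fewest sessions first.

    max_sessions is an arbitrary illustrative cut-off, not a recommended value.
    """
    rows = [
        (page, count, ranks.get(page))
        for page, count in sessions.items()
        if count <= max_sessions
    ]
    # 'Reverse order of sessions' is read here as fewest sessions first.
    return sorted(rows, key=lambda row: row[1])


for page, count, rank in underperforming_pages(analytics_sessions, average_rank):
    print(f"{page}: {count} sessions, average rank {rank}")
```

Each page this prints is one to ask ‘why?’ about before you start reworking its content and meta tags.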
Few people have even heard of duplicate content, so how do they know how badly it is affecting their website? Do they even know that search engines treat duplicate content differently from unique content? Duplicate copy is usually filtered from the results if your page is not the original cached version. Therefore, if you have an e-commerce store and every product description is duplicated from elsewhere, your website or landing page will be underperforming.
It’s all about the long-tail keywords – honestly.
To run a successful e-commerce store, you must squeeze every bit out of your product and category landing pages. Detailed descriptions are a start, and you may even want to use a WordPress review plugin so your customers can write their reviews (thus doing your content writing for you). Either way, it has to be unique.
If you look to Latent Semantic Indexing and go to work on your SEO content, then you will be rewarded with long-tail keywords. In my experience, these convert much better than the bigger and broader keywords as they are far more descriptive of what the person is looking for. Somebody looking for ‘bedding’ will be undecided, whereas somebody looking for ‘Mickey Mouse Bedding’ knows what they are looking for because their four-year-old child, niece or nephew is a big Mickey Mouse fan. So you should then take your Mickey Mouse bedding category or landing page and turn it into a quality landing page full of great content.
Little by little…
So, you may be here because you have jumped onto the copy & paste bandwagon and now you don’t know which pages are unique and which are duplicates. Well, there is an easy way of dealing with that. It is called Copyscape. Copyscape is an online tool which has been around for quite a while and will be familiar to anyone who has been in Internet marketing for a while. It is a tool we use to check for plagiarism (the dark art of copying others’ written work for your website’s benefit – naughty, naughty).
Copyscape is free, but you can only check one page at a time. For an e-commerce website with 200+ products, checking each description one by one becomes a tedious task that will test the mental stamina of even the greatest of SEO-friendly content writers. However, if you sign up for the pro version, you will be paying just 5 pence per search, so 200 product pages will cost you just £10, a small fee worth paying so you can get your online business up and running.
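If you want to budget for a bigger catalogue, the sum is easy to sketch. This assumes the 5 pence per search quoted above still applies, which you should check against Copyscape’s current pricing; the function name is made up for illustration.

```python
from decimal import Decimal

# £0.05 (5 pence) per search, the figure quoted in this article.
# Assumption: current pricing may differ.
PRICE_PER_SEARCH = Decimal("0.05")


def batch_cost(page_count, price=PRICE_PER_SEARCH):
    """Return the total cost in pounds of checking page_count pages."""
    return page_count * price


print(f"£{batch_cost(200):.2f}")
```

Using `Decimal` rather than floats keeps the pence exact, which matters once you start multiplying by thousands of pages.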
But how does it all work?
Once logged in to your Copyscape account, you can add some credits via PayPal. Once you have topped up your credits, you can run the batch analysis tool. This will allow you to put in all your product page URLs and check them all at once rather than one at a time. You can gather those URLs in many ways, but here is one.
First, you need to change your Google search settings. Google currently loads ten results per search query by default, but you can change that to 100 by clicking on the little cog icon in the top right-hand corner and clicking on ‘Search Settings’.
Once you are on this page, you will see the ‘Results Per Page’ section further down. Check the box that says ‘Never show instant results’ and the scroll bar underneath should change colour from greyed out to black. Drag the bar over to 100 and click on save. You will now see 100 results per page instead of just ten, and you can change this back later.
I am assuming you are using Google Chrome for this (as I am an IE hater). You can install a plugin called SEOQuake. Alternatively, you can use a scraping tool such as Scrapebox, but I will move on to that another time. This is just one way of many that you can use. Once you have SEOQuake installed, you need to change many of its parameters. We turn off everything except Google PageRank, and there is a reason for this which I will explain further down. Here is what my parameter setup looks like.
Also, make sure that you disable automatic querying of all parameters. Otherwise, every time you make a Google search, the toolbar will query the PageRank of each page that loads. We will be setting our results to 100 per page, so if Google sees you automatically querying the PageRank of 100 pages, you will likely get your IP address banned for a few hours (you will know you have done this if you see captchas popping up).
Now, with SEOQuake enabled, you can run a search for all of your pages cached in Google’s search index. How do you do it, you may ask? Type the search operator site:yourwebsite.com into Google and click on search. Do you now see all of your website’s pages? Now you have to export them…
Another excellent addition to the SEOQuake plugin is a CSV exporter that will export all 100 results into a CSV file. Still, with how we have set up the parameters there is an easier way to export results. So, type your site:www.yourwebsite.com query into Google and then click on the little icon at the top of the page that says ‘Show as CSV’.
A new window will open with each of your pages that may look something like this:
Copy and paste all of them into Notepad, then find and replace ";"?" with absolutely nothing, find and replace the opening inverted comma (") at the start of each line with absolutely nothing, and also find and replace Url;Google pagerank with nothing. This should leave you with a long list of URLs from your website. There may be pages that are not worth running through Copyscape, so go through your list and remove the ones you do not need to check.
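If you would rather not do the find-and-replace by hand, the same clean-up can be sketched in Python. Treat this as a rough illustration: the layout of the export (a `Url;Google pagerank` header followed by quoted, semicolon-separated rows) is inferred from the find-and-replace steps above and may not match your own output. The sample URLs and the `skip_substrings` filter are hypothetical.

```python
import csv
import io

# Hypothetical sample of the 'Show as CSV' output. The exact layout is an
# assumption based on the find-and-replace steps described above.
raw_export = '''Url;Google pagerank
"http://www.yourwebsite.com/";"?"
"http://www.yourwebsite.com/mickey-mouse-bedding";"?"
"http://www.yourwebsite.com/basket";"?"
'''


def extract_urls(export_text, skip_substrings=()):
    """Return the URL column, dropping the header row and unwanted pages."""
    reader = csv.reader(io.StringIO(export_text), delimiter=";")
    urls = []
    for row in reader:
        if not row or row[0].strip().lower() == "url":
            continue  # skip blank lines and the header row
        url = row[0].strip()
        if any(part in url for part in skip_substrings):
            continue  # pages not worth running through Copyscape
        urls.append(url)
    return urls


for url in extract_urls(raw_export, skip_substrings=("/basket",)):
    print(url)
```

The printed list has one URL per line, ready to paste into the batch tool. Anything you pass in `skip_substrings` stands in for the manual pass where you remove pages you do not need to check.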
It is time to log in to your Copyscape account and run the batch analysis tool. The more URLs you have, the longer it will take to run, but it rarely takes too long. Once the batch checker is complete, you will receive an email informing you.
The great thing about Copyscape is that it keeps an archive of all your batch runs, so you don’t lose your data if you run another batch of URLs for another website.
Once you receive your email, click on the link to take you to your results.