What is duplicate content?
Duplicate content is the pinnacle of all evil to us search marketers. A website owner will come to us full of enthusiasm about their future SEO campaign, only to find out that their website is plagued with duplicate copy. This is most common with eCommerce stores on a low budget, where there is no time or staff to write descriptions for 200+ products.
So what usually happens? The first thing store owners do is go to the manufacturer’s or a competitor’s website, copy the content from there and use it on their own website (does it sound familiar?). I always advise clients to write unique copy for their product descriptions. I know it is a pain in the rear, and I have been there myself, but if people really want their website to succeed, they need to get creative with their writing. I promise you that your hard work will pay off, but you need to be patient.
Denial is not just a river in Egypt…
I have had clients in the past who have denied it, but after we show them the evidence of their sneaky deeds, they eventually and shamefully admit to it. This is followed up with a ‘don’t worry, it is more common than you think’ comment from me, which usually results in an enormous sigh of relief. The duplicate content filter is easy to get out of because it IS NOT a penalty; it is simply a filter. Once Google revisits the page, its bot will notice that the content has changed, and the page should rank in its true position depending on how good that piece of writing is.
TIP – always revisit landing pages that don’t get very many hits. You can find under-performing pages in Google Analytics and Google Webmaster Tools, which are going to be two major tools in your SEO arsenal. View the past two months’ Analytics data for your landing pages in reverse order of sessions and compare those with your average rank in Webmaster Tools. If a page is under-performing, you have to ask yourself ‘why?’. You can then try different techniques with your content and meta tags, or even try interlinking similar products within the content and introducing semantically related terms.
Few people out there have even heard of duplicate content, so how would they know how badly it is affecting their website? Do they even know that search engines treat duplicate content differently from unique content? Duplicate copy is usually filtered from the results if your page is not the original cached version. Therefore, if you have an eCommerce store and every product description is duplicated from elsewhere, your website or landing page will under-perform.
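As a rough illustration of that comparison, here is a minimal Python sketch. It assumes you have already exported sessions per landing page from Analytics and average rank per page from Webmaster Tools into two simple dictionaries; the function name, the dictionary shapes and the sample figures are all hypothetical and not part of either tool.

```python
def lowest_traffic_pages(sessions_by_url, rank_by_url, limit=10):
    """List the landing pages with the fewest sessions, alongside their average rank.

    sessions_by_url: {url: sessions over the period}   (hypothetical Analytics export)
    rank_by_url:     {url: average rank}               (hypothetical Webmaster Tools export)
    """
    rows = [
        (url, sessions, rank_by_url.get(url))  # rank is None if the page has no rank data
        for url, sessions in sessions_by_url.items()
    ]
    rows.sort(key=lambda row: row[1])  # fewest sessions first
    return rows[:limit]


# Made-up sample data, purely for illustration.
sessions = {"/bedding": 50, "/mickey-mouse-bedding": 3, "/pillows": 12}
ranks = {"/bedding": 4.2, "/mickey-mouse-bedding": 8.0}

for url, page_sessions, rank in lowest_traffic_pages(sessions, ranks, limit=2):
    print(url, page_sessions, rank)
```

A page that ranks reasonably well but still sits at the top of this list is a good candidate for new content or better meta tags.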
It’s all about the long-tail keywords – honestly
To run a successful eCommerce store, you really need to squeeze every bit out of your product and category landing pages. Detailed descriptions are a start, and you may even want to use a review plugin so your customers can write their own reviews (thus doing your content writing for you). Either way, it has to be unique.
If you look to Latent Semantic Indexing and really go to work on your content, then you will be rewarded with long-tail keywords. Now, in my experience, these convert much better than the bigger and broader keywords, as they are far more descriptive of what the person is looking for. Somebody looking for bedding will be undecided, whereas somebody looking for ‘Mickey Mouse Bedding’ knows what they are looking for because their four-year-old child, niece or nephew is a big Mickey Mouse fan. So you should take your Mickey Mouse bedding category or landing page and turn it into a quality landing page full of great content.
Little by little…
So, you may be here because you have jumped onto the copy & paste bandwagon and now you don’t know which pages are unique and which are duplicates. Well, there is an easy way of dealing with that, and it is called Copyscape. Copyscape is an online tool that has been around for quite a while and will be familiar to anyone who has spent some time in Internet marketing. It is a tool that we use to check for plagiarism (the dark art of copying others’ written work for the benefit of your own website – naughty, naughty).
Copyscape is actually free, but you can only check one page at a time. For an eCommerce website that has 200+ products, checking each description one by one can become a tedious task that will test the mental stamina of even the greatest of content writers. However, if you sign up for the pro version, you will be paying just 5 pence per search, so 200 product pages will cost you just £10, a small fee that is worth paying so you can get your online business up and running.
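If you want to budget for a bigger store, the sum is easy to sketch in Python. This assumes the 5 pence per search quoted above still applies; the function name is just for illustration, so check the current pricing before relying on it.

```python
from decimal import Decimal

# 5 pence per search, the figure quoted above (an assumption; pricing may change).
PRICE_PER_SEARCH_GBP = Decimal("0.05")


def batch_cost(page_count, price_per_search=PRICE_PER_SEARCH_GBP):
    """Return the cost in pounds of checking page_count pages, one search each."""
    return page_count * price_per_search


print(batch_cost(200))  # 10.00
```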
But how does it all really work?
Once you are logged in to your Copyscape account, you can add some credits via PayPal. Once you have topped up your credits, you can run the batch analysis tool. This will allow you to put in all of your product page URLs and check them all at once, rather than one at a time. There are many ways you can gather those URLs, but here is one.
First, you need to change your Google search settings. Google currently loads ten results per search query by default, but you can change that to 100 by clicking on the little cog icon in the top right-hand corner and clicking on ‘Search Settings’.
Now that you are on this page, you will see the ‘Results Per Page’ section a little further down. Check the box that says ‘Never show instant results’ and the scroll bar underneath should change colour from greyed out to black. Drag the bar over to 100 and click on save. You can now see 100 results per page instead of just ten, and you can change this back later.
I am assuming you are using Google Chrome for this (as I am an IE hater), so you can install a plugin called SEOQuake. Alternatively, you can use a scraping tool such as Scrapebox, but I will move on to that another time; this is just one way of many that you can use. Once you have SEOQuake installed, you need to change a lot of parameters. We turn off everything except Google PageRank, and there is a reason for this which I will explain further down. Here is what my parameter setup looks like.
Also, make sure that you disable automatic querying of all parameters. Otherwise, every time you make a Google search, the toolbar will query the PageRank of each page that loads. We are going to be setting our results to 100 per page, so if Google sees you automatically querying the PageRank of 100 pages, you are likely to get your IP address banned for a few hours (you will know you have done this if you see captchas popping up).
Now, with SEOQuake enabled, you can run a search for all of your pages that are in Google’s search index. How do you do it, you may ask? Type the search operator site:www.yourwebsite.com into Google and click on search. Do you now see all of your website’s pages? Now you have to export them…
Another excellent addition to the SEOQuake plugin is a CSV exporter that will export all 100 results into a CSV file, but with the way we have set up the parameters, there is an easier way to export results. So, run your site:www.yourwebsite.com search in Google and then click on the little icon at the top of the page that says ‘Show as CSV’.
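If you would rather open that search straight from a bookmark or a script, the same site: query can be written as a URL. The Python sketch below only builds the address; it does not fetch or scrape anything. The function name is hypothetical and www.yourwebsite.com is the same placeholder domain used above.

```python
from urllib.parse import urlencode


def site_search_url(domain):
    """Build the Google search URL for a site: query on the given domain."""
    query = f"site:{domain}"
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("www.yourwebsite.com"))
# https://www.google.com/search?q=site%3Awww.yourwebsite.com
```

Paste the printed address into your browser and you should land on the same results page as typing the query by hand.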
A new window will open with each of your pages that may look something like this:
Copy and paste all of them into Notepad, then use find and replace three times: replace “;”?” with absolutely nothing, replace the beginning comma with absolutely nothing, and replace Url; Google PageRank with nothing. This should leave you with a long list of URLs from your website. There may be pages there that are not worth running through Copyscape, so just slowly go through your list and remove the ones you do not need to check.
Now it is time to log in to your Copyscape account and run the batch analysis tool. The more URLs you have, the longer it will take to run, but it rarely takes long. Once the batch checker is complete, you will receive an email letting you know.
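If Notepad find and replace feels fiddly, the same clean-up can be sketched in a few lines of Python. This is only a sketch under one big assumption: that the ‘Show as CSV’ window gives semicolon-separated, double-quoted rows with a Url; Google PageRank header line and a ? where the PageRank was not queried. Check your own export first, as the real format may differ. The function name and the sample rows are made up for illustration.

```python
import csv
import io


def extract_urls(export_text, skip_if_contains=()):
    """Pull the URL column out of the pasted export text.

    Assumes semicolon-separated, double-quoted rows (see the note above).
    skip_if_contains: URL fragments for pages you do not want to check.
    """
    reader = csv.reader(io.StringIO(export_text), delimiter=";")
    urls = []
    for row in reader:
        if not row:
            continue  # blank line
        first = row[0].strip()
        if not first.lower().startswith(("http://", "https://")):
            continue  # header row or stray text
        if any(fragment in first for fragment in skip_if_contains):
            continue  # page not worth running through the checker
        urls.append(first)
    return urls


# Made-up sample in the assumed export format.
sample = '''"Url";"Google PageRank"
"http://www.yourwebsite.com/";"?"
"http://www.yourwebsite.com/contact";"?"
"http://www.yourwebsite.com/mickey-mouse-bedding";"?"
'''

for url in extract_urls(sample, skip_if_contains=("/contact",)):
    print(url)
```

The skip_if_contains argument stands in for the manual step of removing pages that are not worth checking, such as contact or basket pages.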
The really significant thing about Copyscape is that it keeps an archive of all of your batch runs, so you don’t lose your data if you run another batch of URLs on another website.
Once you receive your email, then simply click on the link and it will take you to your results.