Do blogs suffer from duplicate content penalties in major search engines? A few days ago, this thought struck me as I was beginning to rework the look and feel of this blog - with posts showing up in as many as four different locations on the blog, was there any reason to think Google, Yahoo!, MSN, and other search engines might actually be penalizing my blog? The question came to the fore again when Barry Schwartz posted on Search Engine Roundtable, pointing to a thread on WebmasterWorld.com.
Let’s consider the WordPress blogging platform for now, although this should hold true for other blogging platforms and, indeed, for a variety of Content Management Systems. Looking at one of my previous posts, a list of SEO resources, you will find it at a variety of locations:
- Its permalink page: http://www.infohatter.com/blog/creating-a-list-of-seo-resources/
- Second page of the “Front Page”: http://www.infohatter.com/blog/page/2/
- Category for Advertising: http://www.infohatter.com/blog/category/advertising/
- Category for Link Building: http://www.infohatter.com/blog/category/link-building/
- Category for SEO: http://www.infohatter.com/blog/category/seo/
- More categories…
- Archives for August 2006: http://www.infohatter.com/blog/2006/08/
So as you can see, this text is replicated in full in as many as 10 different locations on my blog. What we are looking at here is a conflict between user-friendliness and search-friendliness. This is an ideal setup by accessibility standards - the more ways you provide to access a piece of information, the more user-friendly it is. But does this affect how the search engines index and rank this post?
When I perform a site: search in Google for the term ‘SEO resources’, the single-post page is the first result shown. So this is good - it means that when limited only to my site, Google is ranking the single-post page at the top, which is what I want to see. But are the rankings affected when somebody searches Google's entire index? Do the other copies of this post on this blog perhaps cause it to rank lower than it would if it were available in only one place?
I think they do. In the past, Google has typically penalized duplicate content hard, often sending sites to the supplemental index for such an offense. So, are all bloggers getting the same type of penalty?
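What can be done about this? One of the first solutions to jump out at me would be to include <meta content="noindex,follow" name="robots" /> on the category and archive pages. The noindex part asks search engines to leave those pages out of the index, while follow still lets them follow the links through to the posts themselves; the result should be that only the front page and the single-post pages are indexed.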
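To make the rule concrete, here is a rough sketch in Python of how a page's URL could decide which robots tag it gets. It is only an illustration: the function name `robots_meta`, the `NOINDEX_PATTERNS` list, and the URL patterns themselves are my own assumptions based on the example URLs listed earlier, not anything WordPress provides, and a real blog would put the equivalent check in its theme templates.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns, modeled on the example URLs in this post.
# They describe the listing pages that repeat post text.
NOINDEX_PATTERNS = [
    re.compile(r"^/blog/category/[^/]+/(page/\d+/)?$"),  # category listings
    re.compile(r"^/blog/\d{4}/(\d{2}/)?(page/\d+/)?$"),  # yearly/monthly archives
]

NOINDEX_TAG = '<meta content="noindex,follow" name="robots" />'
INDEX_TAG = '<meta content="index,follow" name="robots" />'


def robots_meta(url: str) -> str:
    """Return the robots meta tag a page at this URL should carry.

    Category and archive pages get noindex,follow; everything else
    (front page, paged front page, single posts) stays indexable.
    """
    path = urlparse(url).path
    if any(pattern.match(path) for pattern in NOINDEX_PATTERNS):
        return NOINDEX_TAG
    return INDEX_TAG
```

With these assumed patterns, the category and August 2006 archive URLs above would get the noindex,follow tag, while the permalink page would stay indexable. Note that the second page of the front page also stays indexable under this rule, since it only targets category and archive pages.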
I would appreciate any thoughts or comments on this matter - this is something that should concern all bloggers. Could it be having a noticeable impact on readership levels? Either way, it is something that could bear some serious thought.