Electronics reviews = spam

Google, Spam - No Comments » - Posted on November, 19 at 3:16 pm

Google’s search results for product reviews truly suck hard, particularly in the electronics vertical. To be fair, so do Yahoo’s.

Recently I’ve bought a laptop, a new PC (custom built), and a new TV, and like many customers I like to read impartial expert reviews of products before I buy them. For cheaper products I don’t mind using the customer reviews on places like Amazon, but when I’m spending upwards of £100, I usually like to get an expert’s opinion.

For a long time now this has been all but impossible to do, because both Google & Yahoo’s search results have been flooded with crap from sites such as reviewcentre.com. Often these results are simply holding pages “awaiting” a review - an example of this is shown below:

These pages rank because they’re on domains with a high level of authority, and there is often some unique content such as a price comparison feature - of little value to a user looking for reviews.

Then there are the retailers/affiliates who target the keyword [reviews] in order to benefit from the (likely highly converting) search traffic in this area. Often there are no reviews on these sites either.

Increasingly it seems that the only way to get legitimate reviews is to type in the model number, followed by some random feature of the product or a superlative such as ‘good’, ‘great’, ‘superb’, etc.

Will someone please build a site that aggregates only expert reviews of a range of products? Please? :)

Yet more duplicate content fuzz

Google, SEO - No Comments » - Posted on September, 16 at 2:17 pm

Read this post from Google on duplicate content, then read (and pay more attention to) Graywolf’s post.

I was going to blog about this whole thing, but Graywolf has summarized it well.

All I’d add is that Google are playing a dangerous game by pretending that they can handle duplicate content. What Google are effectively saying here is that people with good intent don’t need to worry about duplication, because Google’s algorithm can handle it. Maybe on a macro scale it can, but on a per-website level, and for the average webmaster, it really can’t.

I don’t buy their line that duplicate content penalties are a “myth” (I’ve seen a direct three-month duplicate content penalty on a large site), and I also think that there are filters applied that can effectively kill a website’s rankings for competitive terms, which can amount to the same thing.

Google’s post does include this advice, though:

“Don’t create multiple pages, subdomains, or domains with substantially duplicate content.”

That point is largely ignored for the rest of the post. But it is the most important point - the best way to avoid duplicate content penalties, filters, or whatever you want to call them, is to create a good information architecture from the outset. This doesn’t just help the search engines (even if Google appear to downplay it in this post), it helps your users.

Has Web 2.0 finally jumped the shark?

web 2.0 - No Comments » - Posted on September, 11 at 10:04 am

A social network for dead people in the Techcrunch 50?!??!!

Is your web analytics solution causing duplicate content?

Analytics, SEO - 2 Comments » - Posted on September, 10 at 6:39 pm

Duplicate content is a pervasive problem for a lot of websites, and can be difficult to understand for the average webmaster. Recently, I’ve come across problems caused by a couple of big name web analytics packages.

The issue stems from the use of various URL parameters to track a visitor’s session on a website. This is slightly different to the old session ID parameter problem, in that it doesn’t create quite as many duplicate pages; but it is important nonetheless.

I’ll work through a couple of examples of this; firstly I’ll look at Omniture’s SiteCatalyst product, using Channel 4’s website as a guinea pig. SiteCatalyst’s tracking code appears in a URL as intcmp, which stands for ‘Internal Campaign’, and can be identified quickly in the SERPs using a search like this:

Search results page for Channel 4

As you can see from the snapshot above, there are nearly 10,000 results with this parameter added to the URL. If you visit any of these URLs you can delete the ‘?intcmp=blah’ section and get exactly the same page. Internal campaigns come and go: when a campaign ends, the link to it disappears from the site, but the tracked URL stays in the index as an extra address for the same content.

A nice example for Channel 4 is this Internal campaign URL:

  • http://www.channel4.com/bigbrother/?intcmp=homepage_flash (PageRank 3)

Clearly this URL was indexed through a Flash-based link on the home page. When that link disappears from the home page, it leaves this URL behind, sapping authority from, and duplicating the content of, the original page:

  • http://www.channel4.com/bigbrother/ (PageRank 6)

The second culprit, WebTrends, has a very similar issue. This time the nasty campaign parameter is WT.[something] - this is broken down in the following image:

This time the chosen victim is a site called LuxuryLink - let’s take a look at the SERP:

LuxuryLink search results

Again, a couple of examples from this - the tracked version:

  • http://www.luxurylink.com/LL/home_win_trip.php?WT.ac=MOENON03 (PageRank 5)

And the untracked:

  • http://www.luxurylink.com/LL/home_win_trip.php (PageRank 4)

So this time the tracked URL has actually gathered more authority than the non-tracked version, but still the page’s authority is being split and duplicate content is created again.
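To make the duplication concrete, here is a minimal Python sketch that strips the tracking parameters from a URL so that tracked and untracked versions collapse to a single address. The helper names (is_tracking_param, strip_tracking) are hypothetical, and the parameter list is an assumption based only on the two examples in this post (intcmp for SiteCatalyst, anything starting with WT. for WebTrends); a real site may well use others.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def is_tracking_param(name):
    """Return True for the campaign parameters discussed in this post.

    Assumption: only 'intcmp' and the 'WT.' prefix are treated as tracking.
    """
    return name.lower() == "intcmp" or name.upper().startswith("WT.")


def strip_tracking(url):
    """Rebuild the URL without tracking parameters, keeping everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not is_tracking_param(key)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    for tracked in (
        "http://www.channel4.com/bigbrother/?intcmp=homepage_flash",
        "http://www.luxurylink.com/LL/home_win_trip.php?WT.ac=MOENON03",
    ):
        print(tracked, "->", strip_tracking(tracked))
```

Running a list of indexed URLs through a function like this and grouping by the result is one rough way to see how many addresses each page is being split across.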

How to fix it

Now this is not necessarily the fault of the web analytics companies, and definitely not that of the webmasters. There are many reasons why URL parameter tracking needs to be employed, especially in large companies where the split between the marketing team and the web development team is too pronounced for HTML code changes to be implemented with ease.

However the ideal solution is generally to use a combination of on-page JavaScript and browser session cookies in order to track visitors to your site. This doesn’t interfere with your URLs - remember, cool URIs don’t change!

If that is not possible for whatever reason (or if it is and you have URLs with the parameters above indexed in Google), use your robots.txt file, and use it wisely. I’d always recommend using Google Webmaster Central’s robots.txt syntax checking tool before implementing any major changes to your robots.txt file, but these rules should clear out the duplicate URLs:

Omniture:

User-agent: *
Disallow: /*intcmp=

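WebTrends (robots.txt paths are matched case-sensitively, so the rule uses the upper-case WT. prefix that appears in the URLs):

User-agent: *
Disallow: /*WT.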
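Wildcard rules like these are easy to get subtly wrong, so it can help to test them against real URLs before deploying. The sketch below is a simplified Python approximation of wildcard matching as Google documents it (* matches any run of characters, a trailing $ anchors the end, and matching is case-sensitive); it is not Google's actual matcher, and the function names are hypothetical. Python's built-in urllib.robotparser is not used here because it does not interpret * wildcards.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule):
    """Translate a Disallow value using * and a trailing $ into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces, then join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url, disallow_rules):
    """Return True if the URL's path plus query matches any Disallow rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules)


if __name__ == "__main__":
    rules = ["/*intcmp=", "/*WT."]
    for url in (
        "http://www.channel4.com/bigbrother/?intcmp=homepage_flash",
        "http://www.channel4.com/bigbrother/",
        "http://www.luxurylink.com/LL/home_win_trip.php?WT.ac=MOENON03",
        "http://www.luxurylink.com/LL/home_win_trip.php",
    ):
        print("blocked" if is_blocked(url, rules) else "allowed", url)
```

Because matching is case-sensitive, a lower-case wt. pattern would not match the WT.ac parameter in the LuxuryLink example, which is why the sketch tests the upper-case form.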

Over-powerful domain authority in Google

Google, Spam - No Comments » - Posted on September, 10 at 2:11 pm

Some interesting results for the query [airline tickets] in Google today - at numbers 3 & 4 there are 2 results from Bebo, the well-known airline ticket supplier.

Hang on, Bebo - isn’t that a social networking site rather than an airline ticketing company? As outlined by Trend Labs recently in a post entitled Spam Early Spam Often, fake Bebo profiles are being set up and used as a way to lure unsuspecting users into clicking on nefarious links within a profile.

This now appears to be affecting Google’s results too, at least for this query. This profile appears at #3 for the query:

Although these fake profiles have very low quality content and just link to an incredibly spammy domain, they have around 1,000 (spammy) backlinks according to Yahoo, and apparently that’s enough for Google to put them above the fold for a competitive query such as this.

Bebo is a well-known social networking site and has nothing to do with airline tickets, yet simply being hosted on Bebo is enough to rank for this term.

Bring on the semantic web…

Google tinfoil hat theory #1

Google - No Comments » - Posted on September, 4 at 8:29 pm

I’m making a new resolution to blog less about Google from now on - they already dominate the online conversation about just about everything apart from hardware, and really don’t need the extra exposure.

However I think there aren’t enough conspiracy theories about Google’s world domination (rofl) - so here’s another one:

Google releases facial recognition technology and its own browser in the same week in an extraordinary attempt not only to discover everyone’s various IP locations (home, work, mobile - erm, Android) and track their browsing habits, but also to know everyone’s faces! What a huge amount of granular marketing info!

Scary.

Firefox shortcuts with Ubiquity

Firefox, Mozilla - No Comments » - Posted on September, 1 at 2:41 pm

As Robert Scoble writes, it’s only for passionate internet users (at the moment anyway), but the Firefox plugin Ubiquity has awesome potential for changing the way we use the internet - watch this video to learn more.

Ubiquity is one of a number of tools currently knocking around that look into changing the way we interface and interact with the web. It aims to break down the barriers to mashups and allow the average user to, say, embed a map in an email simply by selecting some text, pressing CTRL-SPACE, and typing ‘map’:

This kind of tool has implications for all areas of web use and is well worth keeping an eye on. One day soon a tool similar to Ubiquity (if not Ubiquity itself) will emerge as a game-changer, and as web professionals we need to be considering where web use is headed; tools such as this (and the excellent work of Mozilla Labs in general) provide some good clues.
