Here’s the seventh in the series of videos posted by Google’s Matt Cutts to Google Video over the past year. These are important for every web developer to see. If you haven’t yet, start with Matt’s first video!

See the rest of the videos!

Transcription

Ah. Well, hello there!
I was just enjoying some delicious ‘Diet Sprite Zero’ while reading my new issue of ‘Wired’ magazine. Oh, they really captured the asymmetry in Stephen Colbert’s ears, didn’t they?

I don’t know. I think it would be really fun to do fake commercials. Diet Sprite has not paid me anything for endorsing them.

Alright. Shawn Stinez (??) writes in.

“Does Google Analytics play a part in SERPs?” SERPs meaning Search Engine Results Pages.

To the best of my knowledge, it does not. I am not going to categorically say we don’t use it anywhere in Google. But I was asked this question at WebmasterWorld in Las Vegas last year, and I pledged that the webspam team would not use Google Analytics data at all. Now, webspam is just a part of quality, and quality is just a part of Google, but webspam definitely has not used Analytics data, to the best of my knowledge. Other places in Google don’t either, because we want people to just feel comfortable using it and… use it.

Alright. Gwen writes in. She or he says:

“Dear Mr. Cutts, it’s going to be a long weekend, you get a lot of questions asked.” Thank you, ma’am, very sympathetic of you! “But I have to ask: when does Google detect duplicate content, and within which range will a duplicate be a duplicate?”

Good question.

So, that’s not a simple answer. The short answer is, we do a lot of duplicate content detection. It’s not like there is one stage where we say, right here is where we detect the duplicates. Rather, it happens all the way from the crawl, through indexing, through scoring, all the way down until just milliseconds before we answer a query. And there are different types of duplicate content. There is certainly exact duplicate detection, so if one page looks exactly the same as another page, that can be quite helpful. But at the same time, duplicate pages are not always exactly the same, so we also detect near-duplicates, and we use a lot of sophisticated logic to do that. So in general, if you think you might be having problems, your best bet is probably to make sure that your pages are quite different from each other, because we really do a lot of duplicate detection, to crawl less and to provide better results and more diversity.
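(Matt doesn’t say how Google’s near-duplicate logic actually works, and nothing below is Google’s code. Purely as an illustration, one classic technique from the research literature is shingling: split each page into overlapping word n-grams and measure how much the two sets overlap; a real system would then apply some threshold. The 5-word shingle size and all names here are arbitrary choices for the sketch.)

```python
def shingles(text, n=5):
    """Break text into the set of overlapping n-word 'shingles' (word n-grams)."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(text_a, text_b, n=5):
    """Jaccard overlap of the two pages' shingle sets: 1.0 means identical wording."""
    a, b = shingles(text_a, n), shingles(text_b, n)
    return len(a & b) / len(a | b)

# Pages that share most of their wording score high; identical pages score 1.0.
page_a = "Acme blue widget, free shipping, order today from our online store"
page_b = "Acme red widget, free shipping, order today from our online store"
print(round(similarity(page_a, page_b), 2))  # substantial overlap, below 1.0
print(similarity(page_a, page_a))            # 1.0: an exact duplicate
```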

OK. Jeff Jones (??) writes in. This is my favorite question. Well, there have been a lot of good questions, but I really like this one.

“I would like to explicitly exclude a few of my sites from the default moderate SafeSearch filtering. Google seems to be less of a prude than I would prefer. Is there any hope of a tag, attribute, or other snippet to limit a page to unfiltered results, or should I just start putting a few nasty words in the alt tags of blank images?”

Well, don’t do them in blank images. You know, put them in meta tags. When I was writing the very first version of SafeSearch, I noticed that there were a lot of pages which did not tag their sites or their pages at all in terms of saying “we are adult in content.” So there are a lot of industry groups and a lot of industry standards, but at that time the vast majority of porn pages just sort of ignored those tags. So it’s not that big a deal; go ahead and include that.

So the short answer to your question is, to the best of my knowledge there is no tag that can just say, “I am porn, please exclude me from your SafeSearch.” It’s wonderful that you are asking about that.

Your best bet: I would go with meta tags. Because SafeSearch, unlike a lot of other stuff, actually does look at the raw content of a page, or at least the version that I last saw looks at the raw content of the page. And so, if you put it in your meta tags, or even in comments, which usually are not indexed by Google at all, we should be able to detect that it is porn that way. Don’t use blank images. Don’t use images that people can’t see, though.
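(Matt doesn’t name a specific tag in the video. For what it’s worth, the convention Google has since documented for self-labeling adult pages is a rating meta tag such as &lt;meta name="rating" content="adult"&gt; in the page’s head. Here’s a small, hypothetical sketch, using only Python’s standard library, of that markup and the kind of check a filter could perform; the class and variable names are invented for the example and are not anything Google uses.)

```python
from html.parser import HTMLParser

ADULT_PAGE = """<html><head>
  <meta name="rating" content="adult">
  <title>Example</title>
</head><body>...</body></html>"""

class RatingMetaFinder(HTMLParser):
    """Collect the content of any <meta name="rating"> tags in a page's HTML."""
    def __init__(self):
        super().__init__()
        self.ratings = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "rating":
            self.ratings.append((attrs.get("content") or "").lower())

finder = RatingMetaFinder()
finder.feed(ADULT_PAGE)
# A SafeSearch-style filter could exclude any page that labels itself this way.
print("adult" in finder.ratings)  # True
```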

And then let’s finish off with a question from Andre Shogan (??). He says:

“Sometimes I make a select box spiderable by just putting links in the option elements; normal browsers ignore them, and spiders ignore the option. But since Google is using the Mozilla bot, and the bot renders the page before it crawls it, I worry that when the Mozilla engine renders the page it will remove those links from the document object model tree.”

So in essence he is asking: can I put links in an option box? You can, but I wouldn’t recommend it. It is pretty non-standard behavior; it’s very rare. It would definitely make my eyebrows go up if I were to see it. So it’s better for your users and better for search engines if you just take those links out and put them somewhere at the bottom of the page, or in a sitemap. That way we will be able to crawl right through them, and we don’t have to have hyperlinks inside option elements or anything like that.
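(To make Matt’s suggestion concrete: keep the drop-down for users if you like, but also expose the same destinations as ordinary anchor links, for example in a footer list or on a sitemap page, so a crawler can follow them. A minimal, hypothetical sketch that generates that fallback list from the same data that would feed the &lt;select&gt;; the URLs and function name are invented for illustration.)

```python
# The same destinations that would populate the <select> jump menu
# (hypothetical example data).
destinations = [
    ("/products/widgets", "Widgets"),
    ("/products/gadgets", "Gadgets"),
    ("/about", "About us"),
]

def footer_links(items):
    """Render the drop-down's destinations as plain, crawlable anchor links."""
    lis = "\n".join(f'  <li><a href="{url}">{label}</a></li>' for url, label in items)
    return f"<ul>\n{lis}\n</ul>"

print(footer_links(destinations))
```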

Alright! That’s enough questions for now. It’s getting toward eleven o’clock; I’m going to call it a night.
It’s Sunday, July 30th, so we will see if we can knock a few more of these out next week. Thanks a lot.

Transcription thanks to Peter T. Davis
