Preventing indexing of specific contents within a page
I have always wondered why this feature wasn’t developed before.
Yahoo brings us the robots-nocontent tag, which lets us prevent indexing of specific content within a webpage that, for whatever reason, we don’t want indexed.
The first thing that came to my mind when I heard of this was duplicate content.
This seems to be the perfect solution for avoiding the unnecessary duplication of content on webpages and, subsequently, in the search results.
Quoting should be about adding value to the content being quoted, not the other way around. This new tag supports that: it preserves the intent behind quoting while leaving ownership of the content with the original providers.
Step Forward towards better search results
Most importantly, if used properly, this new feature improves the relevance of the search results.
I neither want nor need to find the same content regurgitated over and over again in search results and on webpages. Using this tag should be established as good practice, just as nofollow was originally intended to be.
How does it work?
Very simple. Just assign the class robots-nocontent (that is, class="robots-nocontent") to a div, span or paragraph tag. It affects the entire contents of that tag, just like any other class attribute would.
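To make that concrete, here is a rough sketch in Python of how a crawler might honor the class when deciding which text to index. This is only my illustration, not Yahoo’s actual implementation: the IndexableTextExtractor class, the indexable_text function and the sample page are all made up for this example, and the only piece taken from Yahoo’s announcement is the robots-nocontent class name.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class IndexableTextExtractor(HTMLParser):
    """Hypothetical indexer: collects text outside robots-nocontent elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside an excluded element
        self.chunks = []      # text fragments that would be indexed

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            # Already inside an excluded element: track nesting only.
            self.skip_depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "robots-nocontent" in classes:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def indexable_text(html):
    """Return the text a crawler honoring robots-nocontent would keep."""
    parser = IndexableTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up page: one quoted block and one inline phrase are excluded.
SAMPLE_PAGE = """
<h1>My post</h1>
<p>My own commentary on the quote.</p>
<div class="robots-nocontent">
  <p>Text quoted from <span>another</span> site.</p>
</div>
<p>Some text with a <span class="robots-nocontent">hidden phrase</span> inside.</p>
"""

if __name__ == "__main__":
    print(indexable_text(SAMPLE_PAGE))
```

Everything nested inside the marked div or span is dropped, including child tags, which matches the idea that the class applies to the entire contents of the tag it is set on.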
On a final note, I heard on Danny Sullivan’s Daily Search Cast that even Matt Cutts praised this feature, which might mean we will see Google using it in the very near future.
Reference: Yahoo Supports New Robots-Nocontent Tag To Block Indexing Within A Page
Tags: robots-nocontent, Yahoo

My personal opinion is that Yahoo did not think this through. Using a class attribute to relay information to a robot is wrong, both semantically and technically.
While the microformat for “nofollow” is viable as a webmaster/blogger tool, there needs to be a standard that does not rely on classes, attributes and other confusing (often misleading) forms of communication.
My proposal is a natural progression into the body of the HTML: a <robots attr="value"> tag. And instead of marking up what is not “content”, simply mark up what is. ‘The’ content of the page normally starts with an <H1> tag and proceeds from there … does it not?
Webstractions, just checked out your blog and read your post, and I must admit, you raise a hell of a good point.
But wouldn’t that be a bit overwhelming since maybe 98% of the page is content that we want indexed?
Don’t confuse indexing of page content with following links. Links will always be followed unless you use the rel="nofollow" extension.
It is easier to identify content than the other way around. As an example, the content on this page starts with the Title of the post and ends right before the section that I am typing this reply in. The rest of your page is non-content, correct?
If you have links in the sidebar that may be related to this post, then you would wrap that with a tag to identify it as related content.
My reasoning is that it is easier to implement (and understand).
I disagree. I just think that the reverse logic seems more practical.
Hi there, I’ve been surfing the net for Search Engine Optimization Conference and found your blog regarding indexing of specific contents within a page at Pedro Sttau. You really know your stuff! I’d like to see more posts here. Will definitely bookmark this one and come back.