Very interesting take on why Yahoo shouldn't be using the robots-nocontent tag

I just read a great post by WebStractions that I would like to share with you guys. He comes up with a very interesting concept that seems to offer a much more logical approach to the new index filtering method that Yahoo is deploying. WebStractions presented several alternatives. One of them is the creation of a new tag that, I presume, would wrap around the content you do not want indexed. However, the most interesting idea was using the old REL attribute (Relation) to declare the purpose and origin of a given piece of content. I couldn't agree more with WebStractions: if it's not original content, it quite simply shouldn't be indexed as if it were. I still think that Yahoo is leading the way, and it's much better to have this tool than nothing to work with, but using a CSS class to filter our content is a rather dubious decision. I would certainly be very interested to know the reasons behind this choice.

Reference: WebStractions, Yahoo's Robots-NoContent Another shade of NoFollow

Root Page dropped from Google Results - How I solved the problem

Two days ago I was doing my regular internal and client SERP check, and my company's page was nowhere to be found. I pretty much had to perform the search 10 times to believe my eyes. Until then, we averaged between #3 and #5 for the term "Web Marketing" on Google.pt; as for "Web Design", we tend to rank #11.

Upon further research, I realized that my internal pages were showing up as usual, maintaining a steady rank for the usual keywords that they were ranking for.

By searching for the company name itself (easylogics), I realized that the root page of my domain had simply been removed from the index, leaving the HTTPS version, https://www.easylogics.com, as the first result.

By now, everything was going through my mind. Could it have been too many inbound links in a short period of time? (This was very unlikely, as I had been careful not to build links too aggressively.) Although I do tend to venture into rather gray waters at times, I do not subscribe to any black hat SEO techniques whatsoever.

1. No hidden text on any of the pages

2. No keyword dumping in the ALT tags or anywhere else in the pages

3. No Duplicate content – Even the print version of the site used nofollow

4. No Reciprocal links

5. No purchased links. 

6. No sneaky JavaScript redirects or anything remotely similar

7. No unrelated outbound links

Basically, no wrongdoing was going on.

So I began looking for answers somewhere else. Why was our first result replaced with the exact same page, but over HTTPS? I believe that Google somewhat favours HTTPS pages over regular pages. I haven't read or seen any proof of this beyond what happened to me, but it does make sense.

An HTTPS page is a secure page that not only provides a safe way for the user to interact with the web platform, but usually also offers a greater degree of assurance about the website itself.

Disclaimer

Now, in all honesty, it is impossible to be sure that any of the actions I took had any influence at all on the resolution of my problem. Nevertheless, it was quite a coincidence that everything went back to normal after the latest crawl, seeing as the site had been crawled 3 times before I made the changes without any improvement in my situation.

1. Prevent Google from indexing your HTTPS pages

One of the other issues I was facing was a potential duplicate content problem caused by a mistake of my own doing. If you have HTTPS enabled, it might be a good idea to edit your robots.txt file and disallow Google from crawling these pages. Additionally, a nofollow should be implemented in the meta tags (http://en.wikipedia.org/wiki/Nofollow).

Example of a robots.txt disallowing HTTPS indexing:

User-agent: *
Disallow: /https:/

Also, if you use Sitemaps, remember to remove the HTTPS links from the XML file. Remember, an HTTPS version is treated as an independent page, even though the content is the same as in the regular HTTP version.

Reference: Google FAQ - HTTP vs. HTTPS
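Robots.txt rules are matched against the path part of a URL, and crawlers fetch the file separately for each protocol and host. One way to apply the idea in step 1 is therefore to serve a different robots.txt on the HTTPS version of the site. Here is a minimal Python sketch of that approach, using the standard library's robots.txt parser to check how a crawler would read each file. The two file contents, the www.example.com URLs and the can_crawl helper are assumptions made up for illustration, not my actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served only on the HTTPS version of the site:
# it blocks every path for every crawler.
HTTPS_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt served on the regular HTTP version:
# an empty Disallow value blocks nothing.
HTTP_ROBOTS = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(can_crawl(HTTPS_ROBOTS, "https://www.example.com/"))
    print(can_crawl(HTTP_ROBOTS, "http://www.example.com/"))
```

How you serve a different file per protocol depends on your web server, so that part is left out of the sketch.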

2. Choose your preferred URL format

Making life easier for the Google Bot may actually help you. Please note that this is highly speculative, and there might not be a direct correlation between this and the problem itself. However, I find it logical that the easier you make your site to crawl, the better it gets crawled! Log in to your Webmaster Central account and, on the diagnosis tab, select “Preferred domain”. Then choose the domain format that you would like Google to display on your results page.

3. Resubmit your Sitemap

Even if no changes were made to the XML file, resubmit it.

Try to be patient and wait

Easier said than done, I know! Just try to wait it out for 24 hours. My guess is that if your site was crawled within those 24 hours and nothing changed, none of the above had any effect.
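Going back to the sitemap clean-up mentioned in step 1, removing the HTTPS links before resubmitting could be sketched roughly like this in Python. It assumes a standard sitemaps.org XML file; the drop_https_urls helper and the sample URLs are hypothetical and only there for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org XML sitemaps.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Made-up sample sitemap listing the same page under both protocols.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>https://www.example.com/</loc></url>
</urlset>"""


def drop_https_urls(sitemap_xml: str) -> str:
    """Return the sitemap with every <url> entry whose <loc> is HTTPS removed."""
    # Keep the default namespace unprefixed when serializing the result.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        text = (loc.text or "").strip().lower() if loc is not None else ""
        if text.startswith("https://"):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(drop_https_urls(SAMPLE_SITEMAP))
```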

About Searchmash

I was very surprised to find out how obscure Search Mash is, even in the inner circles of the Search community. For those who don't know, Search Mash is a Search Engine owned by Google that is used as a platform to test new features and ideas. One might even say that some elements seen on the engine are a preview of what's to come. Search Mash is ahead of the engine seen on Google.com in almost every technical aspect.

The query speed is noticeably faster, and the navigation is much slicker and easier to use. Even the search results seem to be more relevant, which suggests that Search Mash is experimenting with the algorithm itself as well. I especially love the search box's fixed position at the top, which makes it a lot easier to repeat a search.

Besides the right bar gadgets, the most visible and noticeable feature is the absence of pagination within the search results page. By taking advantage of Ajax, the search engine expands further results beneath the existing ones. I would love to see Google test out the new ad system within this platform! Be sure to check it out: visit Search Mash

Preventing indexing of specific contents within a page

I have always wondered why this feature wasn't developed before. Yahoo brings us the robots-nocontent tag, which lets us prevent the indexing of specific content within a web page that, for some reason, we don't want indexed. The first thing that came to my mind when I heard of this was duplicate content.

This seems to be the perfect solution to avoid the unnecessary duplication of content on web pages and, subsequently, in the search results. Quoting should be about bringing value to the content that is being quoted, not the other way around. This new tag does just that: it doesn't defeat the whole intent behind quoting, but at the same time it doesn't take ownership away from the original content providers.

Step Forward towards better search results

Most importantly, if used properly, this new feature improves the relevancy of the search results. I neither want nor need to find the same content regurgitated over and over again in the search results and on web pages. Using this tag should be established as a good practice, just as nofollow was initially intended to be.

How does it work?

Very simple. Just assign the class robots-nocontent to a div, span or paragraph tag. It will affect the entire contents of that tag, just like any other class attribute would.

For example: <div class="robots-nocontent">Yahoo will not index this content</div>

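To picture what this means on the crawler side, here is a small Python sketch that collects only the text a crawler honouring the class would keep. It is purely an illustration of the idea and not Yahoo's actual implementation; the NoContentFilter class, the indexable_text helper and the sample markup are made up, and the sketch assumes reasonably well-formed HTML where every non-void tag is closed.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NoContentFilter(HTMLParser):
    """Collects text that sits outside any element marked robots-nocontent."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a robots-nocontent element
        self.kept = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Start skipping at a marked element, and track nesting while skipping.
        if self.skip_depth or "robots-nocontent" in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.kept.append(data.strip())


def indexable_text(html: str) -> str:
    """Return the text a crawler honouring robots-nocontent would keep."""
    parser = NoContentFilter()
    parser.feed(html)
    parser.close()
    return " ".join(parser.kept)


if __name__ == "__main__":
    page = ('<p>My own commentary.</p>'
            '<div class="robots-nocontent"><p>A long quoted passage.</p></div>'
            '<p>More of my own text.</p>')
    print(indexable_text(page))
```

Because the filter tracks nesting depth, everything inside the marked element is skipped, including nested tags, which matches the "entire contents" behaviour described above.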
On a final note, on Danny Sullivan’s Daily Search Cast I heard that even Matt Cutts praised this feature, which may indicate that we will see Google using it in the very near future.

Reference: Yahoo Supports New Robots-Nocontent Tag To Block Indexing Within A Page