A rather simple return to the headsmacking series this week (as it's late here in London and I've been up my usual 40+ hours traveling). We've been noticing that a number of sites seeking to block bot access to pages on their domain are using robots.txt to do so. While this is a fine practice, the questions we've been receiving show that there's some confusion about what blocking Google/Yahoo!/MSN/other search bots with robots.txt actually does. Here's a quick breakdown:

Block with robots.txt - tells engines not to visit the URL, but they'll feel free to keep it in the index & display it in the SERPs (see below if this confuses you)

Block with meta noindex - tells engines they're free to visit, but not to put the URL in the index or display it in the results

Block by nofollowing links - not a smart move, as other followed links can still put the page in the index (it's fine if you don't want to "waste juice" on the page, but don't think it will keep bots away or prevent the page from appearing in the SERPs)

Here's a quick example of a page that's blocked via robots.txt but still appears in Google's index (note that this robots.txt is the same across about.com's other subdomains, too). You can see that about.com is clearly disallowing the /library/nosearch/ folder.
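If it helps to see the mechanics, here's a minimal sketch using Python's standard urllib.robotparser; the Disallow rule mirrors the /library/nosearch/ example, and the specific URLs are illustrative:

```python
# A minimal sketch (illustrative URLs) using Python's standard
# urllib.robotparser to show what a Disallow rule actually does: it stops
# a compliant bot from *fetching* the URL; it says nothing about keeping
# the URL out of the index.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /library/nosearch/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A compliant crawler may not fetch anything under the disallowed folder...
print(parser.can_fetch("Googlebot", "http://www.about.com/library/nosearch/page.html"))  # False
# ...while everything else remains crawlable.
print(parser.can_fetch("Googlebot", "http://www.about.com/index.html"))  # True
```

Note that "can't fetch" is the whole story as far as robots.txt is concerned - nothing in those rules asks the engine to drop the URL from its index.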
Yet here's what happens when we search Google for URLs in that folder: notice that Google has 2,760 pages from that "disallowed" directory. It hasn't crawled these URLs, so they appear as mere address strings (no title, description, etc., since Google can't see the pages' content). Now think one step further: if you've got any number of pages you're blocking from the search engines' eyes, those URLs can still accumulate links, juice, and other query-independent ranking factors, but they have no way to "pass it along," since their own links out will never be seen. There are two real takeaways here:

Conserve link juice by using nofollow when linking to a URL that is robots.txt-disallowed.

If you know that disallowed pages have acquired link juice (particularly from external links), consider using meta noindex, follow instead, so they can pass their link juice on to places on your site that need it.

Looking forward to seeing folks at SMX London tomorrow (and to Will's and my big showdown on Tuesday, too)!

p.s. Andy Beard covered this topic previously in a solid post - SEO Linking Gotchas Even the Pros Make.
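p.p.s. For reference, the "meta noindex, follow" option from the takeaways above is a single tag in the page's head; this is the standard robots meta tag syntax, not anything site-specific:

```html
<!-- Don't index this page, but do follow its links so juice passes through -->
<meta name="robots" content="noindex, follow">
```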