We’ve said it way back when, but we’ll repeat it: it keeps amazing us that there are still people using just a robots.txt file to prevent indexing of their site in Google or Bing. As a result, their site shows up in the search engines anyway. Do you know why it keeps amazing us? Because robots.txt doesn’t actually keep your site out of the search results, even though it does prevent indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide. Or, find the best practices for handling robots.txt in WordPress.
There’s a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms first:
- Indexed / Indexing
The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index.”
- Ranking / Listing / Showing
Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from indexing to listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain, or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at.
In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from those projects. More sites summarize your website, right?
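To make this concrete, here is a minimal sketch using Python’s standard `urllib.robotparser`. The robots.txt content, the `may_crawl` helper, and the example.com URL are assumptions for illustration only. It shows what a `Disallow: /` rule really controls: whether a well-behaved crawler may fetch a page, not whether the URL may appear in the results.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The crawler is told not to download the page. Nothing here says
# anything about whether the URL may be shown in search results.
print(may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/private/"))
```

Because the page is never downloaded, the search engine also never gets to see any instruction placed on the page itself.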
Now if the explanation above doesn’t make sense, have a look at this video explanation by ex-Googler Matt Cutts from 2009:
If you have reasons to prevent your website’s indexing, adding that request to the specific page you want to block, like Matt is talking about, is still the right way to go.
But you’ll need to inform Google about that meta robots tag. So, if you want to hide pages from the search engines effectively, you need them to index those pages, even though that might seem contradictory. There are two ways of doing that.
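The seemingly contradictory logic can be summarized in a few lines. This is only a sketch of the reasoning in this post, not a description of how any search engine is implemented; the function name and the outcome labels are made up for illustration.

```python
def search_visibility(crawl_allowed: bool, page_has_noindex: bool) -> str:
    """Rough outcome for a URL, following the reasoning in this post."""
    if not crawl_allowed:
        # Blocked by robots.txt: the page is never fetched, so a noindex
        # instruction on it goes unseen and the bare URL can still be listed.
        return "may be listed (URL only)"
    if page_has_noindex:
        # The crawler fetched the page and found the noindex instruction.
        return "not listed"
    return "may be indexed and listed"


print(search_visibility(crawl_allowed=False, page_has_noindex=True))
print(search_visibility(crawl_allowed=True, page_has_noindex=True))
```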
Prevent listing of your page by adding a meta robots tag
The first option to prevent the listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that goes into more depth, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow">
If you use Yoast SEO, this is super easy! No need to add the code yourself. Learn how to add a noindex tag with Yoast SEO here.
The issue with a tag like that, though, is that you have to add it to each and every page.
The second option is the X-Robots-Tag HTTP header. To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with this header. It allows you to specify an HTTP header called X-Robots-Tag and set its value as you would the meta robots tag’s value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you can add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
This would have the effect that the entire site can be indexed, but would never be shown in the search results.
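If you want to check that a page really carries one of these instructions, something like the following sketch could help. It is a simplified assumption of how such a check might look: the `has_noindex` helper is hypothetical, it takes response headers and HTML you have already fetched, and it ignores details such as user-agent-specific header values.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(headers: dict, html: str) -> bool:
    """Return True if the header or the meta robots tag says noindex."""
    header_directives = set()
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-robots-tag":
            for directive in value.split(","):
                header_directives.add(directive.strip().lower())

    meta_parser = RobotsMetaParser()
    meta_parser.feed(html)
    return "noindex" in header_directives or "noindex" in meta_parser.directives


# Example with the header value from the .htaccess line above.
print(has_noindex({"X-Robots-Tag": "noindex, nofollow"}, "<html></html>"))
```

Keep in mind that a search engine can only see either signal if robots.txt allows it to fetch the page in the first place.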
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: The ultimate guide to the meta robots tag »