Gary Illyes from Google said it is a "silly idea" to re-block or re-disallow pages after you allowed them to be crawled so that Google can see the noindex tags on those pages. I apologize for the title of this story, it was a tough one to write.
This came up when Gary posted his PSA saying "your periodic reminder that crawlers that obey robotstxt will not see a noindex directive on a page if said page is disallowed for crawling." Then Patrick Stox followed up with a question he sees a lot, where he said a "common recommendation in SEO communities in this situation is to unblock, let the pages be crawled, and then block them again. To me, that doesn't make sense. You end up where you started."
Gary agreed with Patrick and said "yeah I wouldn't re-disallow the pages once they were crawled." He added "that sounds like a silly idea."
The whole thing confuses me, to be honest. I don't get the logic; I mean, I kind of do, but what's the goal? Is the end result all that different?
SEOs have a page blocked from crawling via robots.txt, so Google won't crawl it, won't see the noindex tag, and won't deindex the page. The workaround is to set a noindex tag on the page, temporarily allow Googlebot to crawl that page so Google can pick up on the noindex tag, and then, after Google picks it up, put the disallow directive back. Why? You end up back in the same state when you do that, and then you have to do it all over again, just repeating the process.
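To illustrate the mechanic Gary is describing, here is a minimal sketch using Python's standard urllib.robotparser (the domain, path, and robots.txt rules are made up for the example): a crawler that obeys robots.txt checks the rules before fetching, so a disallowed page is never downloaded and any noindex meta tag in its HTML is never seen.

```python
import urllib.robotparser

# Hypothetical robots.txt rules: everything under /private/ is disallowed.
robots_txt = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_txt)

page = "https://example.com/private/old-page.html"

if parser.can_fetch("Googlebot", page):
    # Only a fetched page can reveal <meta name="robots" content="noindex">.
    print("Allowed: the crawler fetches the page and can see a noindex tag.")
else:
    # A disallowed page is never fetched, so any noindex tag on it goes unseen.
    print("Disallowed: the crawler never fetches the page, so noindex is invisible.")
```

Re-adding the Disallow line after the noindex has been processed simply puts the page back into the "never fetched" branch, which is the state the whole exercise started from.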
Here are the tweets:
your periodic reminder that crawlers that obey robotstxt will not see a noindex directive on a page if said page is disallowed for crawling.
prompted by https://t.co/i7ouMoqNT6 which was answered by @patrickstox pic.twitter.com/98NLF2twz1
— Gary 鯨理/경리 Illyes (@methode) March 25, 2021
yeah i wouldn't re-disallow the pages once they were crawled. that sounds like a silly idea
— Gary 鯨理/경리 Illyes (@methode) March 25, 2021
Forum discussion at Twitter.
Update: Some have questioned this, and this helps explain it, so here are those questions:
I think the same… if finally they've been de-indexed, why should we let Google continue to spend time crawling them? 🧐
— Gianluca Fiorelli wears a 😷. Be like Gianluca (@gfiorelli1) March 26, 2021
Here is how Patrick responded:
Then just leave them blocked. These are your states/outcomes.
1 = blocked, noindex = pages indexed
2 = unblock, noindex = noindex
3 = blocked (again), noindex = 1. Why switch back? Pages will get indexed again.— Patrick Stox (@patrickstox) March 26, 2021