Order of directives in robots.txt: do they overwrite each other or complement each other?

Submitted by 时光怂恿深爱的人放手 on 2019-11-26 21:59:17

Question


User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /

Now, what is disallowed for Googlebot: only /privatedir/, or the whole website (/)?


Answer 1:


According to the original robots.txt specification:

  1. A bot must follow the first record that matches its user-agent name.

  2. If such a record doesn’t exist, it must follow the record with User-agent: * (this line may not appear in more than one record).

  3. If such a record doesn’t exist, it doesn’t have to follow any record.

So a bot never follows more than one record.
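The three rules above can be sketched as a small selection function. This is only an illustration, not a reference implementation: the `Record` layout and the `select_record` name are made up for this example, and the case-insensitive substring match on the bot name is an assumption based on the matching the original specification recommends.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory form of a record:
# (user-agent names, disallowed path prefixes)
Record = Tuple[List[str], List[str]]

# The two records from the question.
EXAMPLE_RECORDS: List[Record] = [
    (["Googlebot"], ["/privatedir/"]),
    (["*"], ["/"]),
]


def select_record(records: List[Record], bot_name: str) -> Optional[Record]:
    """Return the single record a bot should follow, or None."""
    bot = bot_name.lower()

    # Rule 1: the first record that names this bot wins.
    # Assumption: case-insensitive substring match on the name.
    for record in records:
        agents = record[0]
        if any(agent != "*" and agent.lower() in bot for agent in agents):
            return record

    # Rule 2: otherwise fall back to the "User-agent: *" record.
    for record in records:
        if "*" in record[0]:
            return record

    # Rule 3: no applicable record, so nothing restricts this bot.
    return None
```

Because the function returns as soon as it finds a match, the rules of two records are never merged, which is the point of the answer.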


For your example this means:

  • A bot that matches the name "Googlebot" is not allowed to crawl URLs with a path that starts with /privatedir/.
  • A bot that doesn’t match the name "Googlebot" is not allowed to crawl any URL.
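One way to check this reading is with Python's standard-library `urllib.robotparser`, which should give the same result for the file in the question. The bot name "OtherBot" and the page paths below are made up for the test; only the robots.txt content comes from the question.

```python
from urllib.robotparser import RobotFileParser

# robots.txt content from the question.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot follows only its own record.
print(parser.can_fetch("Googlebot", "/privatedir/page.html"))  # False
print(parser.can_fetch("Googlebot", "/public/page.html"))      # True

# Any other bot falls back to the "User-agent: *" record.
print(parser.can_fetch("OtherBot", "/public/page.html"))       # False
```

Note that Googlebot may fetch /public/page.html even though the `*` record disallows everything: the `*` record is never consulted once a more specific record has matched.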


Source: https://stackoverflow.com/questions/45293419/order-of-directives-in-robots-txt-do-they-overwrite-each-other-or-complement-ea
