robots.txt - is this working?

笑着哭i submitted on 2019-12-02 02:51:19

Question


I just ran into a robots.txt that looks like this:

User-agent: *
Disallow: /foobar

User-agent: badbot
Disallow: *

After the first record disallows only a few folders for all bots, does the specific badbot record even apply?

Note: This question is merely for understanding the above ruleset. I know using robots.txt is not a proper security mechanism and I'm neither using nor advocating it.


Answer 1:


Each bot complies with at most a single record (block).

A block starts with one or more User-agent lines, typically followed by Disallow lines (at least one is required). Blocks are separated by blank lines.

A bot called "badbot" will look for a record with the line User-agent: badbot (or similar, as the bot "should be liberal in interpreting this field"). If no such record is found, it will look for a record with the line User-agent: *. If even this doesn't exist, the bot is allowed to fetch everything (= the default).

So in your example, the bot called "badbot" will follow only the second record (you probably mean Disallow: / instead of Disallow: *), while all other bots only follow the first record.
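This record-matching behavior can be checked with Python's standard-library robots.txt parser. The sketch below uses the ruleset from the question, with the answer's suggested correction (Disallow: / instead of Disallow: *, since * is not a valid path in the original robots.txt syntax):

```python
from urllib import robotparser

# The ruleset from the question, with "Disallow: *" corrected to "Disallow: /"
ROBOTS_TXT = """\
User-agent: *
Disallow: /foobar

User-agent: badbot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# badbot matches its own record and is disallowed everywhere
print(rp.can_fetch("badbot", "/index.html"))   # False

# every other bot falls back to the "*" record: only /foobar is off-limits
print(rp.can_fetch("goodbot", "/foobar/page"))  # False
print(rp.can_fetch("goodbot", "/index.html"))   # True
```

Note that only one record applies per bot: badbot is not additionally bound by the /foobar rule from the * record, since it already matched its own record.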



Source: https://stackoverflow.com/questions/24629061/robots-txt-is-this-working
