Robots.txt to disallow everything and allow only specific parts of the site/pages. Is “allow” supported by crawlers like Ultraseek and FAST?


Question


Just wanted to know if it is possible to disallow the whole site for crawlers and allow only specific web pages or sections. Is "allow" supported by crawlers like FAST and Ultraseek?


Answer 1:


There is an Allow directive; however, there's no guarantee that a particular bot will support it (much like there's no guarantee a bot will even check your robots.txt in the first place). You can probably tell by examining your web logs whether specific bots are indexing only the parts of your website that you allow.
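As a rough illustration, here is a minimal sketch of that kind of log check in Python. It assumes an Apache/Nginx-style combined access log at a hypothetical path (access.log); the bot's user-agent substring and the allowed prefix are placeholders to adapt:

import re

LOG_PATH = "access.log"             # hypothetical log location
BOT_UA = "FAST-WebCrawler"          # substring of the bot's User-Agent header
ALLOWED_PREFIX = "/public/section1/"

# In the combined log format the request line is the first quoted field,
# e.g. "GET /public/section1/page.html HTTP/1.1"
request_re = re.compile(r'"[A-Z]+ (\S+) HTTP/[^"]*"')

with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if BOT_UA not in line:
            continue
        match = request_re.search(line)
        if match and not match.group(1).startswith(ALLOWED_PREFIX):
            # The bot requested something outside the allowed section.
            print(line.rstrip())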

The format for allowing just a particular page or section of your website might look like:

User-agent: *
Allow: /public/section1/
Disallow: /

This should prevent well-behaved bots from crawling or indexing anything except content under /public/section1/. Note that crawlers differ in how they resolve conflicting rules: some apply the first matching rule, while others (such as Googlebot) apply the most specific (longest-path) matching rule, so placing the Allow line first keeps both interpretations consistent.
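As a quick sanity check, these rules can be exercised with Python's standard-library robots.txt parser (urllib.robotparser), which understands Allow lines; the example.com URLs below are placeholders:

import urllib.robotparser

# The same rules as above, fed directly to the parser.
rules = [
    "User-agent: *",
    "Allow: /public/section1/",
    "Disallow: /",
]

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# Inside the allowed section -> True; everything else -> False.
print(rp.can_fetch("*", "https://example.com/public/section1/page.html"))
print(rp.can_fetch("*", "https://example.com/private/page.html"))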



Source: https://stackoverflow.com/questions/393539/robots-txt-to-disallow-everything-and-allow-only-specific-parts-of-the-site-page
