Why does Google index this? [closed]

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-08 10:38:24

Question


In this webpage:

http://www.alvolante.it/news/pompe_benzina_%E2%80%9Ctruccate%E2%80%9D_autostrada-308391044

there is this image:

http://immagini.alvolante.it/sites/default/files/imagecache/anteprima_100/images/rifornimento_benzina.jpg

Why is this image indexed when the robots.txt contains "Disallow: /sites/"?

You can see that it is indexed from this search:

http://www.google.it/images?q=rifornimento+benzina&um=1&ie=UTF-8&source=og&sa=N&hl=it&tab=wi&biw=1280&bih=712


Answer 1:


Because of the different domain names (actually a domain and a subdomain): the page is from http://www.alvolante.it and the image is from http://immagini.alvolante.it.

The robots.txt exists only on the www host, and robots.txt rules apply only to the host that serves them. If the same file were also served from http://immagini.alvolante.it/, Google would not have indexed the image.

Try to access http://immagini.alvolante.it/sites and http://www.alvolante.it/sites and you will see different pages.
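To make the per-host point concrete, here is a minimal sketch in Python that derives which robots.txt a crawler would consult for a given URL. The helper name `robots_url_for` is hypothetical and only for illustration; it assumes the usual convention that robots.txt lives at the root of the same scheme and host as the resource.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(url: str) -> str:
    """Return the robots.txt URL on the same scheme and host as `url`."""
    parts = urlsplit(url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


page = "http://www.alvolante.it/news/pompe_benzina_%E2%80%9Ctruccate%E2%80%9D_autostrada-308391044"
image = "http://immagini.alvolante.it/sites/default/files/imagecache/anteprima_100/images/rifornimento_benzina.jpg"

# The page and the image resolve to two different robots.txt files.
print(robots_url_for(page))
print(robots_url_for(image))
```

Because the two URLs sit on different hosts, a "Disallow: /sites/" rule in the first file says nothing about the image.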




Answer 2:


You can test your robots.txt with Google Webmaster Tools.

http://www.google.com/webmasters/




Answer 3:


Have you disallowed all bots, or is this rule just for Googlebot? If it's the latter, you need to ensure that you also include the rule for the 'Googlebot-Image' user agent.
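As a rough local check of how a rule's scope depends on its User-agent line, the sketch below uses Python's standard `urllib.robotparser`. The two robots.txt bodies and the agent name `ExampleBot` are made up for illustration, not taken from the site, and Python's parser is not Google's: its user-agent matching may differ from what Googlebot actually does.

```python
from urllib.robotparser import RobotFileParser

IMAGE_URL = (
    "http://immagini.alvolante.it/sites/default/files/"
    "imagecache/anteprima_100/images/rifornimento_benzina.jpg"
)


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text in memory and ask whether `user_agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical file whose rule applies to every crawler.
ALL_BOTS = "User-agent: *\nDisallow: /sites/\n"

# Hypothetical file whose rule applies to a single, made-up crawler only.
ONE_BOT = "User-agent: ExampleBot\nDisallow: /sites/\n"

print(is_allowed(ALL_BOTS, "Googlebot-Image", IMAGE_URL))  # wildcard group covers it
print(is_allowed(ONE_BOT, "Googlebot-Image", IMAGE_URL))   # group names another bot
print(is_allowed(ONE_BOT, "ExampleBot", IMAGE_URL))        # the named bot is blocked
```

A rule under a group that names one crawler leaves other crawlers unaffected, which is why it matters whether the Disallow line sits under `User-agent: *` or under a specific agent.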



Source: https://stackoverflow.com/questions/3862702/why-google-index-this
