Facebook and Crawl-delay in robots.txt?

旧时难觅i  2021-01-02 03:39

Do Facebook's web-crawling bots respect the Crawl-delay: directive in robots.txt files?

5 Answers
  •  误落风尘
    2021-01-02 04:40

    Facebook doesn't crawl sites the way a search engine does; it scrapes a URL on demand when someone shares it. You can check exactly what its scraper fetches for any URL yourself here:

    http://developers.facebook.com/tools/debug

    Facebook's cache lifespan for this data varies, but in my experience it is between 24 and 48 hours.

    You *can*, however, invalidate the cache by appending something to your URL (for example a query parameter) so that users share the new one, or you can provide bit.ly (and similar) links, which have the same effect.
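
    To illustrate the cache-busting approach, here is a minimal Python sketch that appends a version query parameter to a URL so Facebook caches it separately. The parameter name "v" is an arbitrary choice for this example, not anything Facebook requires:

        # Minimal sketch: append a cache-busting query parameter so the
        # scraper treats the URL as new shareable content. The parameter
        # name "v" is a hypothetical choice for illustration.
        from urllib.parse import urlencode, urlparse, parse_qsl, urlunparse

        def bust_cache(url: str, version: str) -> str:
            """Return url with an extra query parameter so it is cached separately."""
            parts = urlparse(url)
            query = dict(parse_qsl(parts.query))
            query["v"] = version
            return urlunparse(parts._replace(query=urlencode(query)))

        print(bust_cache("https://example.com/article", "2"))
        # https://example.com/article?v=2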

    Since it's not actually crawling, you can't force it to delay a scrape (and you shouldn't: that would create a bad user experience, because users would have to wait for the scraper to finish before getting a shareable link, and the link they got wouldn't be pretty). You COULD, however, trigger the scraping manually at set intervals, which both improves the user experience (no waiting for data to be cached) and lets you control when the load hits your server.
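
    One way to trigger a re-scrape programmatically is the Graph API's scrape parameter (a POST to graph.facebook.com with id={url} and scrape=true). A hedged Python sketch follows; ACCESS_TOKEN and PAGE_URL are placeholders you must supply, and the endpoint behavior is as documented at the time of writing:

        # Sketch: ask Facebook's scraper to refresh its cached copy of a URL
        # via the Graph API (POST /?id={url}&scrape=true). ACCESS_TOKEN and
        # PAGE_URL are placeholders, not real values.
        import requests

        ACCESS_TOKEN = "YOUR_APP_ACCESS_TOKEN"  # placeholder
        PAGE_URL = "https://example.com/article"  # placeholder

        def rescrape(url: str) -> dict:
            """Request a fresh scrape of `url` and return the updated Open Graph data."""
            resp = requests.post(
                "https://graph.facebook.com/",
                data={"id": url, "scrape": "true", "access_token": ACCESS_TOKEN},
                timeout=30,
            )
            resp.raise_for_status()
            return resp.json()

        if __name__ == "__main__":
            print(rescrape(PAGE_URL))

    Run something like this from a scheduled job (e.g. cron) to keep the cache warm at intervals you choose.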
