Facebook and Crawl-delay in robots.txt?

旧时难觅i  2021-01-02 03:39

Do Facebook's web-crawling bots respect the Crawl-delay: directive in robots.txt files?

5 Answers
  •  误落风尘
    2021-01-02 04:40

    Facebook doesn't crawl sites the way a search engine does; it scrapes a URL on demand when someone shares it. You can check exactly what its scraper fetches for any URL yourself here:

    http://developers.facebook.com/tools/debug

    Facebook's cache lifespan for this data varies, but in my experience it is between 24 and 48 hours.

    You *can*, however, invalidate the cache by appending something to your URL (for example a query parameter) so that users share the new one, or you can provide bit.ly (and similar) links, which have the same effect.
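
    To illustrate the cache-busting approach, here is a minimal Python sketch that appends a version query parameter to a URL so Facebook caches it separately. The parameter name "v" is an arbitrary choice for this example, not anything Facebook requires:

        # Minimal sketch: append a cache-busting query parameter so the
        # scraper treats the URL as new shareable content. The parameter
        # name "v" is a hypothetical choice for illustration.
        from urllib.parse import urlencode, urlparse, parse_qsl, urlunparse

        def bust_cache(url: str, version: str) -> str:
            """Return url with an extra query parameter so it is cached separately."""
            parts = urlparse(url)
            query = dict(parse_qsl(parts.query))
            query["v"] = version
            return urlunparse(parts._replace(query=urlencode(query)))

        print(bust_cache("https://example.com/article", "2"))
        # https://example.com/article?v=2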

    Since it's not actually crawling, you can't force it to delay a scrape (and you shouldn't: that would create a bad user experience, because users would have to wait for the scraper to finish before getting a shareable link, and the link they got wouldn't be pretty). You COULD, however, trigger the scraping manually at set intervals, which both improves the user experience (no waiting for data to be cached) and lets you control when the load hits your server.
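
    One way to trigger a re-scrape programmatically is the Graph API's scrape parameter (a POST to graph.facebook.com with id={url} and scrape=true). A hedged Python sketch follows; ACCESS_TOKEN and PAGE_URL are placeholders you must supply, and the endpoint behavior is as documented at the time of writing:

        # Sketch: ask Facebook's scraper to refresh its cached copy of a URL
        # via the Graph API (POST /?id={url}&scrape=true). ACCESS_TOKEN and
        # PAGE_URL are placeholders, not real values.
        import requests

        ACCESS_TOKEN = "YOUR_APP_ACCESS_TOKEN"  # placeholder
        PAGE_URL = "https://example.com/article"  # placeholder

        def rescrape(url: str) -> dict:
            """Request a fresh scrape of `url` and return the updated Open Graph data."""
            resp = requests.post(
                "https://graph.facebook.com/",
                data={"id": url, "scrape": "true", "access_token": ACCESS_TOKEN},
                timeout=30,
            )
            resp.raise_for_status()
            return resp.json()

        if __name__ == "__main__":
            print(rescrape(PAGE_URL))

    Run something like this from a scheduled job (e.g. cron) to keep the cache warm at intervals you choose.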
