Does anyone know how to tell the 'facebookexternalhit' bot to spread its traffic?
Our website gets hammered every 45 to 60 minutes with spikes of approx. 400 requests.
I know it's an old, but unanswered, question. I hope this answer helps someone.
There's an Open Graph tag named og:ttl that allows you to slow down the requests made by the Facebook crawler: (reference)
"Crawler rate limiting: You can label pages and objects to change how long Facebook's crawler will wait to check them for new content. Use the og:ttl object property to limit crawler access if our crawler is being too aggressive."
The object properties documentation for og:ttl states that the default TTL is 30 days for each canonical URL shared. So setting this meta tag will only slow requests down if you have a very large number of shared objects accumulating over time.
But, if you're being reached by Facebook's crawler because of actual live traffic (users sharing a lot of your stories at the same time), this will of course not work.
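For what it's worth, here is a minimal sketch of generating the og:ttl tag in Python. It assumes the tag's value is given in seconds (that is my reading of the Open Graph documentation, so please verify it against the current docs), and og_ttl_meta is just a hypothetical helper name, not part of any Facebook library.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def og_ttl_meta(days: int) -> str:
    """Render an og:ttl meta tag asking the crawler to wait `days` days.

    Assumption: og:ttl is expressed in seconds.
    """
    if days <= 0:
        raise ValueError("days must be a positive number")
    seconds = days * SECONDS_PER_DAY
    return f'<meta property="og:ttl" content="{seconds}" />'


if __name__ == "__main__":
    # Put the resulting tag in the <head> of each shared page.
    print(og_ttl_meta(30))
```

The output goes in the page head next to your other og: tags. Whether Facebook enforces a minimum or maximum value is something I'd check in the docs rather than assume.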
Another possible cause of too many crawler requests is that your stories are not being shared with a correct canonical URL (og:url) tag.
Say your users can reach a given article on your site from several different sources, so they see and share the same article under different URLs. If you don't set the same og:url tag on all of those pages, Facebook will treat each URL as a different article and, over time, send crawler requests to all of them instead of only to the one canonical URL. More info here.
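To illustrate the canonical URL idea, here is a rough sketch in Python that maps several variants of an article URL to one og:url value. The normalization rules (drop the query string and fragment, lowercase the host, trim the trailing slash) are my own assumptions for the example. If your site identifies articles by a query parameter, you would need to keep that parameter. Both function names are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Collapse URL variants of one article into a single canonical form.

    Assumption: the path alone identifies the article, so tracking
    parameters and fragments can be dropped safely.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def og_url_meta(url: str) -> str:
    """Render the og:url meta tag for whichever variant was requested."""
    return f'<meta property="og:url" content="{escape(canonical_url(url), quote=True)}" />'


if __name__ == "__main__":
    # Example URLs are made up for illustration.
    for variant in (
        "https://example.com/news/story-1",
        "https://example.com/news/story-1/?utm_source=facebook",
        "https://Example.com/news/story-1#comments",
    ):
        print(og_url_meta(variant))
```

All three variants should print the same tag, which is the point: every page that serves the article advertises one og:url.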
Hope it helps.