Ruby on Rails, How to determine if a request was made by a robot or search engine spider?

Submitted by 风流意气都作罢 on 2019-12-17 22:41:34

Question


I have a Rails app that records the IP address of every request to a specific URL, but in my IP database I've found Facebook block IPs like 66.220.15.* and Google IPs (I suspect these come from bots). Is there a way to determine whether a request was made by a robot or search engine spider? Thanks.


Answer 1:


Robots are required (by common sense / courtesy more than any kind of law) to send along a User-Agent with their request. You can check for this using request.env["HTTP_USER_AGENT"] and filter as you please.
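A minimal sketch of such a filter, as plain Ruby rather than a controller method: it checks the User-Agent string against a short list of common crawler keywords. The keyword list here is illustrative only, not exhaustive.

```ruby
# Match common crawler keywords in a User-Agent string.
# This list is a small illustrative sample; real crawlers use many more tokens.
BOT_KEYWORDS = /bot|crawl|spider|slurp|facebookexternalhit/i

def bot_request?(user_agent)
  # to_s guards against a nil User-Agent, which some clients omit entirely
  user_agent.to_s.match?(BOT_KEYWORDS)
end

bot_request?("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
# => true
bot_request?("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36")
# => false
```

In a Rails controller you would pass `request.env["HTTP_USER_AGENT"]` (or the equivalent `request.user_agent`) to such a check.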




Answer 2:


Since well-behaved bots typically include a reference URI in the User-Agent string they send, something like:

request.env["HTTP_USER_AGENT"].match(/\(.*https?:\/\/.*\)/)

is an easy way to see if the request is from a bot vs. a human user's agent. This seems to be more robust than trying to match against a comprehensive list.
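To illustrate the heuristic with sample User-Agent strings (the UA values below are typical examples, not taken from the question):

```ruby
# Well-behaved crawlers usually embed a URL inside the parenthesized
# comment of their User-Agent; ordinary browsers do not.
BOT_UA_PATTERN = /\(.*https?:\/\/.*\)/

googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
chrome    = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96 Safari/537.36"

googlebot.match(BOT_UA_PATTERN)  # matches: the parentheses contain a URL
chrome.match(BOT_UA_PATTERN)     # nil: no URL inside any parentheses
```

Note the trade-off: this catches polite crawlers that identify themselves, but a bot sending a browser-like or empty User-Agent will slip through.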




Answer 3:


You can use the browser gem to check for bots. The gem's `Browser.new` takes the User-Agent string:

browser = Browser.new(request.user_agent)

if browser.bot?
  # code here
end

https://github.com/fnando/browser




Answer 4:


Another way is to use the crawler_detect gem:

CrawlerDetect.is_crawler?("Bot user agent")
=> true

#or after adding Rack::Request extension
request.is_crawler?
=> true

It can be useful if you want to detect a large variety of different bots (more than 1,000).



Source: https://stackoverflow.com/questions/5882264/ruby-on-rails-how-to-determine-if-a-request-was-made-by-a-robot-or-search-engin
