Blocking all bots except a few with Nginx

Asked by 我在风中等你, 2020-12-28 09:42

I would like to block every $http_user_agent that identifies as a bot, but allow Googlebot through. I put the following code:

map $http_user_agent $bad_bot {
    default 0;
    # Regexes are tested in order of appearance, so let Googlebot through first,
    ~*Googlebot 0;
    # then flag anything else that looks like a bot.
    ~*(bot|crawl|spider) 1;
}
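A map on its own only computes a variable; nothing is blocked until $bad_bot is actually checked, as both answers below do. A minimal sketch of the missing half, assuming the map above sits in the http context:

server {
    listen 80;

    location / {
        # $bad_bot is "1" for flagged agents and "0" for everything else;
        # nginx treats "0" and "" as false in an if condition.
        if ($bad_bot) {
            return 403;
        }
    }
}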
2 Answers
  • 2020-12-28 10:25

    Simply check $http_user_agent against your $bad_bot list and return HTTP 403 if it's in your blacklist:

    location / {
        if ($http_user_agent ~ (libwww|Wget|LWP|damnBot|BBBike|java|spider|crawl)) {
            return 403;
        }
    }
    

    Note: ~ in the if block performs a case-sensitive match. If you want your blacklist to be case-insensitive, use ~* instead of ~.
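    For example, a case-insensitive variant of the same block (a sketch; only the match operator changes) would also reject agents such as "WGET" or "SPIDER":

    location / {
        # ~* matches case-insensitively, so "Wget", "wget" and "WGET"
        # are all caught by the same pattern.
        if ($http_user_agent ~* (libwww|Wget|LWP|damnBot|BBBike|java|spider|crawl)) {
            return 403;
        }
    }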

  • 2020-12-28 10:39

    Here is my logic for nginx:

    map $http_user_agent $limit_bots {
        default 0;
        ~*(google|bing|yandex|msnbot) 1;
        ~*(AltaVista|Googlebot|Slurp|BlackWidow|Bot|ChinaClaw|Custo|DISCo|Download|Demon|eCatch|EirGrabber|EmailSiphon|EmailWolf|SuperHTTP|Surfbot|WebWhacker) 1;
        ~*(Express|WebPictures|ExtractorPro|EyeNetIE|FlashGet|GetRight|GetWeb!|Go!Zilla|Go-Ahead-Got-It|GrabNet|Grafula) 1;
        ~*(rafula|HMView|HTTrack|Stripper|Sucker|Indy|InterGET|Ninja|JetCar|Spider|larbin|LeechFTP|Downloader|tool|Navroad|NearSite|NetAnts|tAkeOut|WWWOFFLE) 1;
        ~*(NetSpider|Vampire|NetZIP|Octopus|Offline|PageGrabber|Foto|pavuk|pcBrowser|RealDownload|ReGet|SiteSnagger|SmartDownload|SuperBot|WebSpider) 1;
        ~*(Teleport|VoidEYE|Collector|WebAuto|WebCopier|WebFetch|WebGo|WebLeacher|WebReaper|WebSauger|eXtractor|Quester|WebStripper|WebZIP|Wget|Widow|Zeus) 1;
        ~*(Twengabot|htmlparser|libwww|Python|perl|urllib|scan|Curl|email|PycURL|Pyth|PyQ|WebCollector|WebCopy|webcraw) 1;
    }
    
    location / {
        if ($limit_bots = 1) {
            return 403;
        }
    }
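
    Keep in mind that map is only valid at the http level of nginx.conf, while the location goes inside a server block, so the two snippets live at different nesting levels. A minimal skeleton showing the placement (the pattern list is abbreviated here; use the full map above):

    http {
        map $http_user_agent $limit_bots {
            default 0;
            ~*(google|bing|yandex|msnbot) 1;   # abbreviated for the sketch
        }

        server {
            listen 80;

            location / {
                if ($limit_bots = 1) {
                    return 403;
                }
            }
        }
    }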
    