I have a website I am developing that is also going to be pulled into a web app. I have the following code in my .htaccess
file to prevent access from ANYONE th
Allow from and Rewrite* are directives from two different Apache modules. The first belongs to mod_authz_host and the second to mod_rewrite.
You can use mod_rewrite to do what you want:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !=myuseragent
RewriteRule .* - [F,L]
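The condition above compares the whole User-Agent header for exact equality, which only works if the client sends exactly that string. If the web app appends a token to a longer user-agent string, a substring match may be closer to what is wanted. A minimal sketch, assuming the token is the placeholder myuseragent used above:

```apache
RewriteEngine On
# Return 403 Forbidden unless the User-Agent header contains "myuseragent".
# The pattern is an unanchored regex; [NC] makes it case-insensitive.
RewriteCond %{HTTP_USER_AGENT} !myuseragent [NC]
# [F] already implies [L], so no further flag is needed.
RewriteRule ^ - [F]
```

Keep in mind that the User-Agent header is set by the client and can be spoofed, so this restricts casual access rather than providing real authentication.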
I just want to allow ONE SPECIFIC user agent rather than trying to block all
Hi
What you need to consider here is that some bots (especially the larger, more prominent ones) use several user-agents to access your site. For example, Googlebot (the crawler) can use all of these different user-agents:
Googlebot-Image/1.0
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
GoogleProducer
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
Google-Site-Verification/1.0
Google-Test
Googlebot/2.1 (+http://www.google.com/bot.html)
And I'm not even talking about Google Plus and the many other bots used by Google.
Same goes for Yahoo and others.
Just this week our company (Incapsula) launched Botopedia.org, a community-sourced bot directory. It's 100% free and open to all, and you can use it to find a complete user-agent list for all the bots you'll want to allow.
If needed, it also has reverse IP functionality for bot verification because, as our recent study of fake Googlebot visits has shown, some spammers and even cyber-attackers will use legitimate bot signatures to ease their way into your site.
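A similar reverse-lookup check can be approximated in Apache 2.4 itself. The following is only a sketch, assuming mod_authz_core and mod_authz_host are loaded and that genuine Googlebot addresses resolve to hosts under googlebot.com or google.com (the domain list is an assumption to verify against Google's own crawler documentation). Require host performs a double reverse DNS lookup, so a client that merely copies a Googlebot user-agent string fails the second condition.

```apache
# Sketch only: admit a request only if it both claims to be Googlebot
# and comes from an IP whose reverse DNS name (confirmed by a forward
# lookup) ends in one of the listed domains.
<RequireAll>
Require expr "%{HTTP_USER_AGENT} =~ /Googlebot/"
Require host googlebot.com google.com
</RequireAll>
```

As written this denies every other visitor, and the DNS lookups add latency to each request, so treat it as an illustration of the idea rather than a drop-in rule.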
Hope this helps.
If you don't want to use mod_rewrite, with Apache 2.4 you can use something similar to this:
<Location />
AuthType Basic
AuthName "Enter Login and Password to Enter"
AuthUserFile /home/content/html/.htpasswd
<If "%{HTTP_USER_AGENT} == 'myuseragent'">
Require all granted
</If>
<Else>
Require valid-user
Require ip 12.34.567.89
</Else>
</Location>
SetEnvIfNoCase User-Agent .*google.* search_robot
SetEnvIfNoCase User-Agent .*yahoo.* search_robot
SetEnvIfNoCase User-Agent .*bot.* search_robot
SetEnvIfNoCase User-Agent .*ask.* search_robot
Order Deny,Allow
Deny from All
Allow from env=search_robot
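The four SetEnvIfNoCase lines above can probably be collapsed into one. SetEnvIf patterns are unanchored regular expressions, so the surrounding .* is not required, and alternation covers the separate names. A sketch under the same assumptions as the snippet above (Apache 2.2-style access control, same search_robot variable):

```apache
# Sketch: one case-insensitive pattern instead of four directives.
# The regex is unanchored, so it matches anywhere in the User-Agent header.
SetEnvIfNoCase User-Agent "(google|yahoo|bot|ask)" search_robot
Order Deny,Allow
Deny from all
Allow from env=search_robot
```

Short tokens such as bot and ask match any user-agent that happens to contain those letters, so the list may admit more clients than intended.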
Htaccess SetEnvIf and SetEnvIfNoCase Examples
I just want to allow ONE SPECIFIC user agent rather than trying to block all
Here's my config to allow only wget:
SetEnvIf User-Agent Wget wget
Order deny,allow
Deny from all
Allow from env=wget
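On Apache 2.4 the Order, Deny and Allow directives are only available through the compatibility module mod_access_compat. An equivalent using the newer Require syntax might look like the following sketch, which assumes mod_setenvif and mod_authz_core are loaded and reuses the wget variable name from the config above:

```apache
# Set the "wget" environment variable when the User-Agent contains "Wget".
SetEnvIf User-Agent "Wget" wget
# Grant access only to requests for which that variable is set;
# everything else is denied.
Require env wget
```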