Question
How can I prevent someone from using curl or file_get_contents to fetch my page's HTML?
For example, my domain is www.example.com. If someone has PHP code like this:
<?php
$info = file_get_contents('http://www.example.com/theinfo.php');
?>
how can I block them?
I could try to check the user agent, but that's not a reliable way.
What is the best way to check when someone is trying to get the page content?
The site I built contains information that many people will try to copy to their own websites, and the scraping could overload my server.
Answer 1:
i can try to check it by user agent but its not the right way.
The user agent can indeed be changed freely through curl, yet it is pretty much the only signal you have: nothing else in the request reliably distinguishes curl or file_get_contents from a browser.
That being said, you could try to look for some missing fields, as file_get_contents() by default leaves out a bunch of them:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_ACCEPT} ^$
RewriteRule ^ - [L,F]
though you do run a risk of false negatives (a scraper that sets these headers will get through) and of occasionally blocking a legitimate client that omits them.
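To see why a determined scraper slips past this check, here is a sketch of the client side. By default, file_get_contents() sends no User-Agent or Accept header, which is what the rewrite rules above detect, but a caller can add them back through a stream context. The header values below are illustrative, and the actual request is left commented out:

```php
<?php
// file_get_contents() normally sends a bare request; a stream context
// lets the scraper supply the very headers the .htaccess rules look for.
$context = stream_context_create([
    'http' => [
        'header' => "User-Agent: Mozilla/5.0 (compatible)\r\n" .
                    "Accept: text/html\r\n",
    ],
]);

// With the context attached, the request looks like a browser's:
// $info = file_get_contents('http://www.example.com/theinfo.php', false, $context);

// Show the headers that would be sent with the request above.
$options = stream_context_get_options($context);
echo $options['http']['header'];
```

So the header check only filters out naive, out-of-the-box scrapers; it is a speed bump, not a wall.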
Answer 2:
If you are concerned about anyone at all, rather than a specific IP or domain, taking your content, you should put some kind of registration process in front of your site. Filtering at the Apache level will probably cause more problems than it is worth. Ask yourself whether what you are publishing is really meant for every person and machine on the internet to use as they please; if it is not, it should either be behind a login or not on the internet at all.
Here is a very simple to use PHP library for implementing a login and/or registration system: https://github.com/panique/php-login
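As a minimal illustration of the idea (this is not the API of the linked library; the function name and session key below are hypothetical), a protected page simply refuses to render unless the session is authenticated:

```php
<?php
// Hypothetical helper (not part of the linked library): decide whether a
// request's session may see the page. 'user_id' is an assumed session key
// that your login code would set after a successful login.
function isLoggedIn(array $session): bool
{
    return !empty($session['user_id']);
}

// In theinfo.php the gate would look like this -- anonymous requests,
// including curl/file_get_contents calls carrying no session cookie,
// get a 403 instead of the content:
//
//   session_start();
//   if (!isLoggedIn($_SESSION)) {
//       http_response_code(403);
//       exit('Login required');
//   }

var_dump(isLoggedIn(['user_id' => 42])); // → bool(true), authenticated session
var_dump(isLoggedIn([]));                // → bool(false), anonymous scraper
```

A scraper can still automate the login, but now every copy of your data is tied to an account you can rate-limit or revoke.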
Answer 3:
Use .htaccess to block the IP address of the server that is scraping you. Paste this code into your .htaccess:
order allow,deny
deny from 123.45.67.89
allow from all
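Note that `order`/`allow`/`deny` is Apache 2.2 syntax. If your server runs Apache 2.4 or later, the equivalent (assuming the same single IP to block) uses `Require` directives instead:

```apache
# Apache 2.4+ equivalent of the 2.2 rules above
<RequireAll>
    Require all granted
    Require not ip 123.45.67.89
</RequireAll>
```

This only helps against a known, fixed source; a scraper on a different host or behind a changing IP will not be caught.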
Source: https://stackoverflow.com/questions/19814509/how-to-block-curl-or-file-get-contents