curl: can't fetch RSS from website because of Cloudflare

无人及你 2020-12-17 02:56

I'm not able to connect to this site http://www.youm7.com/newtkarirrss.asp using curl on the server.

But I can access it from localhost without any problems.

3 Answers
  • 2020-12-17 03:30

    You need to tell the site which browser you're using by setting the User-Agent:

    curl_setopt($cu, CURLOPT_USERAGENT, $user_agent);
    

    Here $user_agent is a browser string, e.g. Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)

    Alternatively, pass along the current user's own browser agent from $_SERVER['HTTP_USER_AGENT'].
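
    The same idea as a minimal Node.js sketch, for anyone not fetching from PHP. The helper names buildHeaders and fetchFeed and the DEFAULT_USER_AGENT constant are hypothetical and only for illustration; the sketch assumes the feed is served over plain HTTP and that a browser-like User-Agent is all the site checks, which may not hold.

```javascript
const http = require('http');

// Assumed default: the example browser string from the answer above.
const DEFAULT_USER_AGENT =
  'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)';

// Build the request headers, falling back to the default browser string.
function buildHeaders(userAgent) {
  return { 'User-Agent': userAgent || DEFAULT_USER_AGENT };
}

// Fetch a feed over plain HTTP and hand the body to callback(err, body).
function fetchFeed(feedUrl, userAgent, callback) {
  const req = http.get(feedUrl, { headers: buildHeaders(userAgent) }, (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => callback(null, body));
  });
  req.on('error', callback);
}

// Usage (performs a real network request, so it is left commented out):
// fetchFeed('http://www.youm7.com/newtkarirrss.asp', null, (err, xml) => {
//   if (err) throw err;
//   console.log(xml.slice(0, 200));
// });
```

    If the response is still the Cloudflare challenge page rather than the feed, the User-Agent alone is not enough.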

  • 2020-12-17 03:34

    You can get past the Cloudflare protection with PhantomJS (http://phantomjs.org/), which can execute the Cloudflare JavaScript outside a browser, using the following small script, "delay.js":

    "use strict";
    var page = require('webpage').create(),
        system = require('system'),
        address, delay;

    // Expect exactly two arguments after the script name: URL and delay in ms.
    if (system.args.length !== 3) {
        console.log('Usage: phantomjs delay.js URL delay');
        phantom.exit(1);
    } else {
        address = system.args[1];
        delay = parseInt(system.args[2], 10);
        page.open(address, function (status) {
            if (status !== 'success') {
                console.log('Unable to load the address!');
                phantom.exit(1);
            } else {
                // Give the Cloudflare script time to load the real page, then dump it.
                window.setTimeout(function () {
                    console.log(page.content);
                    phantom.exit();
                }, delay);
            }
        });
    }
    

    Run it as: phantomjs delay.js http://protected.url 5000

    This fetches "protected.url", waits 5000 ms for the Cloudflare code to load the real page, and then dumps the page content to stdout.
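
    To use that output from a server-side script, the command can be wrapped in a child process. A minimal Node.js sketch, assuming phantomjs is on the PATH and delay.js is in the working directory; buildPhantomArgs and fetchRendered are hypothetical names, and the timeout and buffer limits are arbitrary choices.

```javascript
const { execFile } = require('child_process');

// Build the argument list for: phantomjs delay.js URL delay
function buildPhantomArgs(url, delayMs) {
  if (!Number.isInteger(delayMs) || delayMs < 0) {
    throw new Error('delayMs must be a non-negative integer');
  }
  return ['delay.js', url, String(delayMs)];
}

// Run PhantomJS and pass the rendered page to callback(err, html).
function fetchRendered(url, delayMs, callback) {
  const args = buildPhantomArgs(url, delayMs);
  // Assumption: allow 10 s on top of the delay, and up to 10 MB of output.
  const options = { timeout: delayMs + 10000, maxBuffer: 10 * 1024 * 1024 };
  execFile('phantomjs', args, options, (err, stdout) => {
    if (err) return callback(err);
    callback(null, stdout);
  });
}

// Usage (needs phantomjs installed, so it is left commented out):
// fetchRendered('http://protected.url', 5000, (err, html) => {
//   if (err) throw err;
//   console.log(html.length);
// });
```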

  • 2020-12-17 03:39

    You can't easily bypass Cloudflare. However, you can work around the protection page. :)

    First, parse the Cloudflare protection page and evaluate the arithmetic expression (3+13*7 here; it will most likely be different for each request) in this script:

    $(function(){setTimeout(
                function(){
                    $('#jschl_answer').val(3+13*7);
                    $('#ChallengeForm').submit();
                },
                5850
    )});
    

    Then send a POST request to the same page with the "jschl_vc" value taken from #ChallengeForm in the parsed page and "jschl_answer" set to the result of 3+13*7. After that, fetch the page again, sending the cookie that Cloudflare set. Once you have been added to Cloudflare's whitelist, you won't see the protection page anymore.
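
    The parsing step could look roughly like the sketch below. It only covers the challenge format quoted in this answer; the hidden-input markup for "jschl_vc", the sample value abc123, and the function names are assumptions for illustration. The expression is evaluated with a tiny integer parser (+, -, *) instead of eval, so nothing from the remote page is executed.

```javascript
// Evaluate an integer expression made of +, - and *, with * binding tighter.
function evaluateArithmetic(expr) {
  if (!/^\s*\d+(\s*[+\-*]\s*\d+)*\s*$/.test(expr)) {
    throw new Error('Unsupported expression: ' + expr);
  }
  const tokens = expr.match(/\d+|[+\-*]/g);
  // First pass: fold multiplications into additive terms.
  const terms = [Number(tokens[0])];
  const signs = [];
  for (let i = 1; i < tokens.length; i += 2) {
    const value = Number(tokens[i + 1]);
    if (tokens[i] === '*') {
      terms[terms.length - 1] *= value;
    } else {
      signs.push(tokens[i]);
      terms.push(value);
    }
  }
  // Second pass: add or subtract the terms from left to right.
  let total = terms[0];
  for (let i = 0; i < signs.length; i++) {
    total = signs[i] === '+' ? total + terms[i + 1] : total - terms[i + 1];
  }
  return total;
}

// Extract the POST fields from the challenge page, or null if not found.
function parseChallenge(html) {
  const exprMatch = html.match(/#jschl_answer'\)\.val\(([^)]+)\)/);
  // Assumed markup: <input type="hidden" name="jschl_vc" value="...">
  const vcMatch = html.match(/name="jschl_vc"\s+value="([^"]+)"/);
  if (!exprMatch || !vcMatch) return null;
  return {
    jschl_vc: vcMatch[1],
    jschl_answer: evaluateArithmetic(exprMatch[1]),
  };
}

// Hypothetical sample page fragment in the format shown above.
const sample =
  '<form id="ChallengeForm"><input type="hidden" name="jschl_vc" value="abc123"></form>' +
  "<script>$('#jschl_answer').val(3+13*7);</script>";
console.log(parseChallenge(sample)); // { jschl_vc: 'abc123', jschl_answer: 94 }
```

    The returned object holds the two fields to POST back; keeping the cookie from the response for the follow-up request is left to the HTTP client.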
