DOMDocument::loadHTMLFile() modify user agent

夙愿已清 submitted on 2019-12-22 12:09:11

Question


I'm using PHP to load a website into a DOM tree. Is there a way to modify the user agent that is sent by DOMDocument::loadHTMLFile()?

function parseThis($url)
{
  $html = new DOMDocument();
  $html->loadHTMLFile($url);

  return $html;
}

Answer 1:


Set PHP's user_agent configuration directive. Its value is sent by anything that uses the http stream wrapper, such as DOMDocument::loadHTMLFile(), file_get_contents(), etc. You can set it in php.ini, or at runtime with ini_set():

$fake_user_agent = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.47 Safari/536.11";
ini_set('user_agent', $fake_user_agent);

The same can also be accomplished in an Apache .htaccess by setting php_value user_agent if permitted by your server configuration.
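
If changing the setting for the whole script is not desirable, a stream context may be another option, since libxml can be given its own context through libxml_set_streams_context(). The following is only a minimal sketch: loadHtmlWithUserAgent is a hypothetical helper name, not something from the answer above, and the context stays in effect for later libxml loads until it is replaced.

```php
<?php
// Hypothetical helper (not part of the original answer): load $url into a
// DOMDocument while sending $userAgent through an http stream context.
function loadHtmlWithUserAgent(string $url, string $userAgent): DOMDocument
{
    // The 'http' options apply to http:// and https:// URLs.
    $context = stream_context_create([
        'http' => ['user_agent' => $userAgent],
    ]);

    // Tell libxml (used by DOMDocument) to use this context for its next loads.
    libxml_set_streams_context($context);

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadHTMLFile($url);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        throw new RuntimeException("Could not load HTML from $url");
    }

    return $doc;
}
```

It would be called as loadHtmlWithUserAgent($url, $fake_user_agent), leaving the user_agent ini value untouched.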




Answer 2:


I think the best approach is to retrieve the content separately and then load it into the document. You can do that with cURL.

$useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";

$ch = curl_init();

// set user agent
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_URL, $url);
// return the response as a string instead of printing it;
// without this, curl_exec() outputs the page and returns true
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_HEADER, 0);

// grab content from the website (false on failure)
$content = curl_exec($ch);

// load the content in your dom
$html = new DOMDocument();
$html->loadHTML($content);
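
The snippet above does not check whether the request succeeded, and real pages often trigger libxml warnings when parsed. A rough sketch of how both could be handled follows; fetchHtml and parseHtml are hypothetical helper names introduced here for illustration, and the timeouts are simply the ones used above.

```php
<?php
// Hypothetical helper: fetch $url with cURL, sending $userAgent.
// Returns the response body, or throws if the transfer fails.
function fetchHtml(string $url, string $userAgent): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 2,
        CURLOPT_TIMEOUT        => 60,
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    return $body;
}

// Hypothetical helper: parse an HTML string without printing libxml warnings.
function parseHtml(string $content): DOMDocument
{
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($content);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}
```

Usage would then be $html = parseHtml(fetchHtml($url, $useragent)); note that loadHTML() rejects an empty string, so an empty response body needs its own check.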


Source: https://stackoverflow.com/questions/11495961/domdocumentloadhtmlfile-modify-user-agent
