PHP Curl UTF-8 Charset

囚心锁ツ 2020-12-05 06:54

I have a PHP script that calls another web page and writes out all of the page's HTML. Everything works, but there is a charset problem. My PHP file encoding is utf-

6 Answers
  •  粉色の甜心
    2020-12-05 07:45

    // Fetch a page and return its <title> converted to UTF-8, or false on failure.
    function page_title($val){
        // include_once avoids redeclaration errors if the function is called twice
        include_once(dirname(__FILE__).'/simple_html_dom.php');
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $val);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0');
        curl_setopt($ch, CURLOPT_ENCODING, "gzip");
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        $return = curl_exec($ch);
        // Content-Type response header, e.g. "text/html; charset=ISO-8859-1"
        $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
        curl_close($ch);
        if ($return === false) return false;

        $html = str_get_html($return);
        if (!$html) return false;

        $c = '';
        // Prefer the charset declared in the HTTP header
        if (preg_match('/charset=["\']?([\w.:-]+)/i', (string)$contentType, $found)) {
            $c = $found[1];
        }
        else {
            // Otherwise fall back to <meta http-equiv="Content-Type" content="...">
            $lookat = $html->find('meta[http-equiv=Content-Type]', 0);
            if ($lookat && preg_match('/charset=["\']?([\w.:-]+)/i', $lookat->content, $found)) {
                $c = $found[1];
            }
        }
        $titleNode = $html->find('title', 0);
        if (!$titleNode) return false;
        $title = $titleNode->innertext;
        // Convert only when a charset was detected and it is not already UTF-8
        if ($c !== '' && strcasecmp($c, 'utf-8') !== 0) $title = mb_convert_encoding($title, 'UTF-8', $c);

        return $title;
    }
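
The charset handling in the function above can also be tried on its own, without cURL or simple_html_dom. The following is only a minimal sketch: `charset_from_content_type` and `to_utf8` are hypothetical helper names that do not appear in the original answer, and the conversion step assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (not from the original answer): pull the charset
// parameter out of a Content-Type value such as "text/html; charset=ISO-8859-1".
// Returns the charset in upper case, or null when none is declared.
function charset_from_content_type(?string $contentType): ?string
{
    if ($contentType === null) {
        return null;
    }
    // Accept any letter case, optional spaces around "=", and optional quotes.
    if (preg_match('/charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: convert $text to UTF-8 when a non-UTF-8 charset is known.
// Falls back to the unmodified text if the charset name is not recognised.
function to_utf8(string $text, ?string $charset): string
{
    if ($charset === null || strcasecmp($charset, 'UTF-8') === 0) {
        return $text;
    }
    try {
        // Assumes mbstring is available; PHP 7 returns false on unknown names.
        $converted = @mb_convert_encoding($text, 'UTF-8', $charset);
    } catch (\ValueError $e) {
        // PHP 8 throws ValueError for unknown encoding names.
        return $text;
    }
    return is_string($converted) ? $converted : $text;
}
```

With these helpers, the value returned by `curl_getinfo($ch, CURLINFO_CONTENT_TYPE)` could be passed to `charset_from_content_type`, and the result handed to `to_utf8` together with the fetched text. Unlike a fixed `str_replace`, the regular expression tolerates content types other than `text/html` and quoted or differently cased charset values.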
    
