How to get Content-type using html simple dom?

喜夏-厌秋 submitted on 2020-01-06 19:48:10

Question


I tried find('meta[http-equiv="Content-type"]') but it failed to retrieve that information.


Answer 1:


SimpleHTMLDom doesn't use quoted string literals in the selector; the form is simply elem[attr=value]. The comparison of the value also seems to be case-sensitive (there may be a way to make it case-insensitive, but I don't know of one).*

E.g.

require 'simple_html_dom.php';
$html = file_get_html('http://www.google.com/');
// most likely only one element, but foreach doesn't hurt
foreach( $html->find('meta[http-equiv=content-type]') as $ct ) { 
  echo $ct->content, "\n";
}

prints text/html; charset=ISO-8859-1.

*edit: yes, there is a way to perform a case-insensitive match: use *= instead of =

find('meta[http-equiv*=content-type]')

edit 2: By the way, http-equiv*=content-type would also match <meta http-equiv="haha-no-content-types"..., because it only tests whether the string occurs somewhere in the attribute's value. But it's the only case-insensitive operator I could find. I guess you can live with it in this case ;-)

edit 3: It uses preg_match('.../i'), and the selector value is passed directly to that function as the pattern. Therefore you could write something like http-equiv*=^content-type$ to match http-equiv="Content-type" but not http-equiv="xyzContent-typeabc". But I don't know whether this is a guaranteed feature.
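
To make the difference between the unanchored and the anchored pattern concrete, here is a minimal sketch in plain PHP. It does not call SimpleHTMLDom at all. The helper attrContains() is hypothetical, and it only imitates the behaviour described in edit 3 under the assumption that the selector value ends up unescaped inside a case-insensitive preg_match() pattern.

```php
<?php
// Hypothetical helper, NOT part of simple_html_dom. It imitates the
// behaviour described above: the selector value is used as a regex.
function attrContains(string $selectorValue, string $attrValue): bool
{
    // Assumption: the value is inserted unescaped and matched with /i.
    // A value containing "/" would break the delimiter in this sketch.
    return preg_match('/' . $selectorValue . '/i', $attrValue) === 1;
}

// Unanchored: matches anywhere in the attribute value, ignoring case.
var_dump(attrContains('content-type', 'Content-Type'));
var_dump(attrContains('content-type', 'haha-no-content-types'));

// Anchored with ^ and $: the whole attribute value has to match.
var_dump(attrContains('^content-type$', 'Content-type'));
var_dump(attrContains('^content-type$', 'xyzContent-typeabc'));
```

The first three calls return true and the last one returns false, which is the distinction edit 3 relies on.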




Answer 2:


The Content-Type is typically part of the HTTP response headers, not the body. Where did you get the XML document from?
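
If the header is what you are after, one possible approach is PHP's built-in get_headers(). The sketch below assumes you call it in associative mode. The helper contentTypeFromHeaders() is a hypothetical name, and the lower-casing of the keys is there because servers may send the header name in any capitalization.

```php
<?php
// Hypothetical helper: picks the Content-Type value out of the array
// that get_headers($url, true) returns.
function contentTypeFromHeaders(array $headers): ?string
{
    // Header names are case-insensitive, so normalise the keys first.
    $lower = array_change_key_case($headers, CASE_LOWER);
    if (!isset($lower['content-type'])) {
        return null;
    }
    $value = $lower['content-type'];
    // If the header was sent more than once (e.g. across redirects),
    // get_headers() returns an array of values; take the last one.
    if (is_array($value)) {
        $value = $value === [] ? null : end($value);
    }
    return $value;
}

// Usage (needs network access, so it is left commented out here):
// $headers = get_headers('http://www.google.com/', true);
// if ($headers !== false) {
//     echo contentTypeFromHeaders($headers), "\n";
// }
```

Note that the header and a meta tag in the body can disagree, so which one you want depends on what you are going to do with the value.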




Answer 3:


I would run a foreach over $this->find('meta') and check each element myself, in case content-type is written with different capitalization. I think browsers are not case-sensitive here, while PHP might be.
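
A minimal sketch of that loop could look like the following. The function name findMetaContentType() is hypothetical. It only assumes that each element exposes its attributes as properties, as $ct->content does in the first answer, and it uses strcasecmp() for the case-insensitive comparison.

```php
<?php
// Hypothetical helper: walks over meta elements and returns the content
// attribute of the first one whose http-equiv is "content-type",
// compared without regard to case.
function findMetaContentType(iterable $metas): ?string
{
    foreach ($metas as $meta) {
        // ?? avoids a notice when the attribute is missing.
        $equiv = $meta->{'http-equiv'} ?? null;
        if (is_string($equiv) && strcasecmp(trim($equiv), 'content-type') === 0) {
            $content = $meta->content ?? null;
            return is_string($content) ? $content : null;
        }
    }
    return null;
}

// Usage with SimpleHTMLDom (needs the library and network access):
// require 'simple_html_dom.php';
// $html = file_get_html('http://www.google.com/');
// echo findMetaContentType($html->find('meta')), "\n";
```

Compared with the *= selector, this does an exact comparison of the attribute value, so http-equiv="haha-no-content-types" would not match.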



Source: https://stackoverflow.com/questions/2213675/how-to-get-content-type-using-html-simple-dom
