How to extract the content attribute of the meta name=generator tag?

Submitted by 我只是一个虾纸丫 on 2019-12-13 14:26:59

Question


I am using the below code to extract meta 'generator' tag content from a web page using Jsoup:

Elements metalinks = doc.select("meta[name=generator]");
boolean metafound=false;

if(metalinks.isEmpty()==false)
{ 
    metatagcontent = metalinks.first().select("content").toString();
    metarequired=metatagcontent;
    metafound=true;
}
else 
{
    metarequired="NOT_FOUND";
    metafound=false;
}

The problem is that for a page that does contain the meta generator tag, no value is shown when I output the variable 'metarequired'. For a page that does not have the meta generator tag, the value 'NOT_FOUND' is shown correctly. What am I doing wrong here?


Answer 1:


From your code,

metalinks.first().select("content").toString();

This is not correct. select("content") treats "content" as a CSS selector for an element named content, so it is merely selecting

<meta ...>
    <content ... /> <!-- This one, which of course doesn't exist. -->
</meta>

while you actually want to get the attribute

<meta ... content="..." />

You need to use attr("content") instead of select("content").

metatagcontent = metalinks.first().attr("content");

See also:

  • Jsoup cookbook - Selector syntax
  • Jsoup Selector API documentation
  • W3 CSS3 selector specification

Unrelated to the concrete problem, you don't need to compare against a boolean literal in an if condition. isEmpty() already returns a boolean:

if (!metalinks.isEmpty())
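Putting both points together, the lookup could be sketched roughly as below. This is only an illustration, not the asker's actual program: the class name MetaGeneratorDemo, the helper generatorOf, and the sample HTML (including the "WordPress 3.2" value) are made up for the example, and it assumes Jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class MetaGeneratorDemo {
    // Hypothetical helper: returns the content attribute of the first
    // <meta name="generator"> tag, or "NOT_FOUND" if the page has none.
    static String generatorOf(String html) {
        Document doc = Jsoup.parse(html);
        // Elements.first() returns null when nothing matched the selector.
        Element meta = doc.select("meta[name=generator]").first();
        // attr() reads an attribute value; select() would look for child elements.
        return meta != null ? meta.attr("content") : "NOT_FOUND";
    }

    public static void main(String[] args) {
        // Sample pages invented for this sketch.
        String withTag = "<html><head>"
                + "<meta name=\"generator\" content=\"WordPress 3.2\">"
                + "</head><body></body></html>";
        String withoutTag = "<html><head><title>t</title></head><body></body></html>";

        System.out.println(generatorOf(withTag));
        System.out.println(generatorOf(withoutTag));
    }
}
```

Note that attr("content") returns an empty string rather than null when the meta tag exists but has no content attribute, so a caller that needs to tell those cases apart would have to check hasAttr("content") as well.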


Source: https://stackoverflow.com/questions/8296520/how-to-extract-the-content-attribute-of-the-meta-name-generator-tag
