Question
I am using the code below to extract the content of the meta 'generator' tag from a web page using Jsoup:
Elements metalinks = doc.select("meta[name=generator]");
boolean metafound = false;
if (metalinks.isEmpty() == false)
{
    metatagcontent = metalinks.first().select("content").toString();
    metarequired = metatagcontent;
    metafound = true;
}
else
{
    metarequired = "NOT_FOUND";
    metafound = false;
}
The problem is that for a page that does contain the meta generator tag, no value is shown (when I output the value of the variable 'metarequired'). For a page that does not have a meta generator tag, the value 'NOT_FOUND' is shown correctly. What am I doing wrong here?
Answer 1:
In your code, you have:
metalinks.first().select("content").toString();
This is not correct. select("content") looks for a <content> element nested inside the <meta> element:
<meta ...>
<content ... /> <!-- This one, which of course doesn't exist. -->
</meta>
while you actually want the content attribute of the <meta> element itself:
<meta ... content="..." />
You need to use attr("content") instead of select("content").
metatagcontent = metalinks.first().attr("content");
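To put the fix in context, here is a minimal self-contained sketch. The class name MetaGeneratorExample, the helper extractGenerator, and the sample HTML strings are hypothetical and only serve as illustration; the "NOT_FOUND" fallback mirrors the question's code. It assumes Jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MetaGeneratorExample {

    // Hypothetical helper: returns the content attribute of the generator
    // meta tag, or "NOT_FOUND" when the page has no such tag.
    static String extractGenerator(String html) {
        Document doc = Jsoup.parse(html);
        // first() returns null when nothing matched the selector.
        Element meta = doc.select("meta[name=generator]").first();
        if (meta == null) {
            return "NOT_FOUND";
        }
        // attr() reads an attribute of the element itself;
        // it returns an empty string if the attribute is absent.
        return meta.attr("content");
    }

    public static void main(String[] args) {
        // Sample pages made up for this example.
        String withTag = "<html><head>"
                + "<meta name=\"generator\" content=\"ExampleCMS 1.0\">"
                + "</head><body></body></html>";
        String withoutTag = "<html><head><title>No generator</title></head>"
                + "<body></body></html>";

        System.out.println(extractGenerator(withTag));
        System.out.println(extractGenerator(withoutTag));
    }
}
```

Running main should print the content value for the first page and NOT_FOUND for the second. Checking the result of first() for null is one way to handle the missing-tag case; testing isEmpty() on the Elements, as in the question, works just as well.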
See also:
- Jsoup cookbook - Selector syntax
- Jsoup Selector API documentation
- W3 CSS3 selector specification
Unrelated to the concrete problem, you don't need to compare against a boolean inside an if condition. isEmpty() already returns a boolean:
if (!metalinks.isEmpty())
Source: https://stackoverflow.com/questions/8296520/how-to-extract-the-content-attribute-of-the-meta-name-generator-tag