Question
I am trying to parse Google search results. What I need is not the search results themselves, but the information whether a search result exists at all.
My problem is that I want to search for exact phrases, e.g. "Max Testperson". Google helpfully tells me: We could not find search results for "Max Testperson", but instead for Max Testperson. But I do not need Max Testperson, I need "Max Testperson".
So basically I am not interested in the search results themselves, but in the part before the search results (whether the search string can be found or not).
I used the following Java tutorial: http://mph-web.de/web-scraping-with-java-top-10-google-search-results/
With this I can parse the search results, but as I said, I don't need that. I just want to know whether my search string exists or not. Since Google removes the quotes (" "), I get search results anyway.
Can anyone help me out with this?
Answer 1:
Try adding the GET parameter nfpr=1 to your search to disable the auto-correction feature:
final Document doc = Jsoup.connect("https://google.com/search?q=test"+"&nfpr=1").userAgent(USER_AGENT).get();
Update:
You could also check the page for the "no results" message:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class App {
    public static final String USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36";

    public static void main(String[] args) throws Exception {
        // Each quoted part must match exactly; '+' stands for a space in the query string
        String searchTerm = "\"daniel+nasseh\"+\"26.02.1987\"";
        boolean hasExactResults = true;
        // nfpr=1 disables the auto-correction of the query
        final Document doc = Jsoup.connect("https://google.com/search?q=" + searchTerm + "&nfpr=1")
                .userAgent(USER_AGENT).get();
        // Selects the "no results" notice; this selector depends on Google's page markup
        Elements noResultMessage = doc.select("div.e.obp div.med:first-child");
        if (!noResultMessage.isEmpty()) {
            hasExactResults = false;
            for (Element result : noResultMessage) {
                System.out.println(result.text());
            }
        }
        if (hasExactResults) {
            // Traverse the results
            for (Element result : doc.select("h3.r a")) {
                final String title = result.text();
                final String url = result.attr("href");
                System.out.println(title + " -> " + url);
            }
        }
    }
}
Update 2: The best solution, as presented by Donselm himself in the comments, is to add &tbs=li:1 to force the search for the exact search term:
String searchTerm = "\"daniel+nasseh\"+\"26.02.1987\"";
final Document doc = Jsoup.connect("https://google.com/search?q=" + searchTerm + "&tbs=li:1").userAgent(USER_AGENT).get();
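The search terms above are encoded by hand ('+' for spaces, literal quotes). For arbitrary input it may be safer to let the standard library do the encoding. The following is only a minimal sketch, not part of the original answer: the class ExactSearchUrl and its build method are hypothetical names, and it assumes Java 10 or later for URLEncoder.encode(String, Charset). It wraps a phrase in double quotes, URL-encodes it, and appends one of the two parameters discussed in this answer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; the name is not from the original answer.
class ExactSearchUrl {

    // Builds a Google search URL for an exact phrase.
    // 'flag' is expected to be one of the parameters from the answer,
    // e.g. "nfpr=1" or "tbs=li:1"; it is appended unchanged.
    public static String build(String phrase, String flag) {
        // Quote the phrase so it is searched as one unit.
        String quoted = "\"" + phrase + "\"";
        // Form encoding: spaces become '+', double quotes become %22.
        String encoded = URLEncoder.encode(quoted, StandardCharsets.UTF_8);
        return "https://google.com/search?q=" + encoded + "&" + flag;
    }

    public static void main(String[] args) {
        System.out.println(build("Max Testperson", "tbs=li:1"));
        System.out.println(build("Max Testperson", "nfpr=1"));
    }
}
```

The resulting string can be passed to Jsoup.connect(...) in place of the hand-built URL; whether Google still honours these parameters is something to verify against the live site.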
Source: https://stackoverflow.com/questions/37268406/getting-information-whether-a-google-search-results-exists-or-not-java