Use HtmlUnit to search Google

∥☆過路亽.° submitted on 2019-12-11 18:08:57

Question


The following code is an attempt to search Google and return the results as text or HTML. It was copied almost entirely from code snippets online, and I see no reason why it should not return the search results. How can I use HtmlUnit to submit the search query and return the Google results without a browser?

    import java.io.IOException;

    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlInput;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;
    import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

    public class GoogleSearch {

        public static void main(String[] args) throws IOException {
            final WebClient webClient = new WebClient();

            // Load the Google home page and fill in the search box (named "q").
            HtmlPage page1 = webClient.getPage("http://www.google.com");
            HtmlInput input1 = page1.getElementByName("q");
            input1.setValueAttribute("yarn");

            // Look up the submit button by name and click it.
            HtmlSubmitInput submit1 = page1.getElementByName("btnK");
            page1 = submit1.click();

            // Print the page returned by the click as XML.
            System.out.println(page1.asXml());

            webClient.closeAllWindows();
        }
    }

Answer 1:


There must be some browser detection that changes the generated HTML: when I inspect the HTML with page1.getWebResponse().getContentAsString(), the submit button is named btnG rather than btnK (which is not what I see in Firefox). Change btnK to btnG in the code, and the search returns the expected results.
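
The check described above can be sketched without HtmlUnit on the classpath. The helper below is hypothetical (SubmitNameFinder and findSubmitName are made-up names, not part of HtmlUnit), and the markup in main is a stand-in; in the real program the string would come from page1.getWebResponse().getContentAsString(). It is a quick textual check under those assumptions, not a real HTML parse.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HtmlUnit): reports which candidate
// control name occurs in the HTML that the server actually sent.
final class SubmitNameFinder {

    // Returns the first candidate found in a name=... attribute, quoted or not.
    // This is a plain text search, so it can be fooled by unusual markup.
    static Optional<String> findSubmitName(String html, String... candidates) {
        for (String candidate : candidates) {
            Pattern p = Pattern.compile(
                "\\bname\\s*=\\s*[\"']?" + Pattern.quote(candidate) + "(?![\\w-])");
            if (p.matcher(html).find()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in markup, assumed for illustration only.
        String html = "<form><input name=\"q\">"
                    + "<input type=\"submit\" name=\"btnG\"></form>";
        System.out.println(findSubmitName(html, "btnK", "btnG").orElse("none"));
    }
}
```

Whatever name this reports is the one to pass to page1.getElementByName in the question's code.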




Answer 2:


I've just checked this. These are actually two different button names, used on two different Google pages:

  • btnK: on the Google home page (where there is one long text box in the middle of the screen). Here the button's id is 'gbqfa'.
  • btnG: on the Google results page (where the main text box is at the top of the screen). Here the button's id is 'gbqfb'.
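
Since the button name depends on which page was served, one option is to try each name in turn and keep the first one that resolves. The sketch below shows that fallback in plain Java. ButtonLookup and firstFound are hypothetical names, and the Map in main stands in for a page; with HtmlUnit the lookup would wrap page1.getElementByName, on the assumption that it throws an unchecked exception when nothing matches.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper (not part of HtmlUnit): tries several element
// names in order and returns the first one the lookup can resolve.
final class ButtonLookup {

    static <T> Optional<T> firstFound(Function<String, T> lookup, String... names) {
        for (String name : names) {
            try {
                T found = lookup.apply(name);
                if (found != null) {
                    return Optional.of(found);
                }
            } catch (RuntimeException e) {
                // This name is not on the page; fall through to the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for a results page that only has btnG (id 'gbqfb').
        Map<String, String> page = Map.of("btnG", "gbqfb");
        Function<String, String> lookup = name -> {
            String id = page.get(name);
            if (id == null) {
                throw new NoSuchElementException(name);
            }
            return id;
        };
        System.out.println(firstFound(lookup, "btnK", "btnG").orElse("none"));
    }
}
```

Listing btnK first keeps the home page case as the default while still handling the results page layout.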


Source: https://stackoverflow.com/questions/7881899/use-htmlunit-to-search-google
