WebView Crawler navigate to url based on page result

*爱你&永不变心* posted on 2020-07-19 07:17:09

Question


I'm trying to build a web crawler based on the requirements that were described here, and I figured WebView would be the most suitable way to implement this.

The problem arises when the next URL I need to visit depends on the HTML contents of the current page.
I am using view.evaluateJavascript to get the current page's HTML and parse the URL part inside onReceiveValue, but then I see no way to navigate to that URL, because onReceiveValue does not seem to have access to the view.

Calling loadUrl in onPageFinished does not work either: it runs before the HTML content has been retrieved, so the WebView navigates to a URL built from a null value.

WebView myWebView = new WebView(this);
setContentView(myWebView);

myWebView.getSettings().setJavaScriptEnabled(true);
MyJavaScriptInterface jInterface = new MyJavaScriptInterface(this);
myWebView.addJavascriptInterface(jInterface, "HTMLOUT");

myWebView.setWebViewClient(new WebViewClient() {
 @Override
 public void onPageFinished(WebView view, String url) {
  super.onPageFinished(view, url);
  if (url.equals("http://url.com")) {
   final String[] versionString = {
    null
   };
   view.evaluateJavascript("(function(){return window.document.body.outerHTML})();",
    new ValueCallback<String>() {
     @Override
     public void onReceiveValue(String html) {
      String result = removeUTFCharacters(html).toString();
      Matcher m = r.matcher(result);
      versionString[0] = m.group(1);
     }
    });
   String getFullUrl = String.format("https://url.com/getData?v=%s", versionString[0]);
   view.loadUrl(getFullUrl);
  }
 }
});
myWebView.loadUrl("http://url.com");

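Answer 1:


Load the next URL from inside onReceiveValue. evaluateJavascript is asynchronous, so the HTML is only available once the callback runs, and the anonymous ValueCallback can still use the view parameter of onPageFinished.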

 myWebView.setWebViewClient(new WebViewClient() {
        @Override
        public void onPageFinished(final WebView view, String url) {
            super.onPageFinished(view, url);
            if (url.contains("https://www.google.com")) {
                // The HTML is delivered later, in the callback, not when this call returns.
                view.evaluateJavascript("(function(){return window.document.body.outerHTML})();",
                        new ValueCallback<String>() {
                            @Override
                            public void onReceiveValue(String html) {
                                // Parse html here to build the next URL, then navigate.
                                // The view parameter of onPageFinished is captured by this class.
                                String nextUrl = "https://cchat.in";
                                view.loadUrl(nextUrl);
                            }
                        });
            }
        }
    });
    myWebView.loadUrl("https://www.google.com");

I used two websites to demonstrate: once the first page finishes loading, the second URL is loaded from inside onReceiveValue.
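
To see why the placement of loadUrl matters, here is a rough plain-Java simulation of the ordering, with no Android dependency. CallbackOrderDemo, evaluateAsync, and drain are hypothetical stand-ins invented for this sketch (they are not WebView APIs); the only assumption carried over from WebView is that the result callback runs later rather than before the call returns.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

final class CallbackOrderDemo {
    // Hypothetical stand-in for the queue of work that runs after the current method returns.
    private final Queue<Runnable> pending = new ArrayDeque<>();
    // Records every URL passed to loadUrl, in call order.
    final List<String> loaded = new ArrayList<>();

    // Mimics an asynchronous evaluate call: the callback is queued, not run immediately.
    void evaluateAsync(String result, Consumer<String> callback) {
        pending.add(() -> callback.accept(result));
    }

    void loadUrl(String url) {
        loaded.add(url);
    }

    // Runs all queued callbacks, as the event loop would do later.
    void drain() {
        while (!pending.isEmpty()) {
            pending.poll().run();
        }
    }

    static List<String> run() {
        CallbackOrderDemo view = new CallbackOrderDemo();
        String[] version = {null};

        // Ordering from the question: loadUrl is called before the callback has run.
        view.evaluateAsync("42", v -> version[0] = v);
        view.loadUrl("https://url.com/getData?v=" + version[0]); // version[0] is still null here
        view.drain();

        // Ordering from the answer: loadUrl is called inside the callback.
        view.evaluateAsync("42", v -> view.loadUrl("https://url.com/getData?v=" + v));
        view.drain();

        return view.loaded;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the first recorded URL ends in v=null and the second in v=42, which mirrors the symptom described in the question and the effect of moving the navigation into the callback.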

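
The parsing step inside onReceiveValue also deserves a guard: Matcher.group(1) throws IllegalStateException unless find() or matches() has succeeded first. The following is a minimal sketch of a helper that could be called from the callback. NextUrlBuilder and its VERSION pattern are assumptions made for illustration, since the question does not show its regex r, and the input is assumed to be HTML that has already been decoded (for example by the question's removeUTFCharacters helper).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NextUrlBuilder {
    // Hypothetical pattern: the regex `r` used in the question is not shown.
    private static final Pattern VERSION = Pattern.compile("version=(\\d+(?:\\.\\d+)*)");

    private NextUrlBuilder() {
    }

    /** Returns the next URL to load, or null when the HTML contains no version. */
    static String buildNextUrl(String html) {
        if (html == null) {
            return null;
        }
        Matcher m = VERSION.matcher(html);
        // find() must succeed before group(1) is read.
        if (!m.find()) {
            return null;
        }
        return String.format("https://url.com/getData?v=%s", m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(buildNextUrl("<a href=\"/app?version=1.2.3\">app</a>"));
    }
}
```

Inside onReceiveValue, the caller would check the result for null before passing it to view.loadUrl, so that a page without the expected marker does not trigger a navigation to a malformed URL.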


Source: https://stackoverflow.com/questions/62726565/webview-crawler-navigate-to-url-based-on-page-result
