How to extract source from Google search result “20-pack” entry?

谁说胖子不能爱 submitted on 2019-12-06 15:36:52

I checked the DOM structure of the page you provided. IE and Firefox handle this page very differently: in IE, clicking an item on the left-hand side redirects to another page.

Because of limits in my environment, I could only do this in IE. For Firefox, you can try the following code. There may be minor issues (apologies, I am unable to test it).

Note: I wrote the demo in Java (it only looks up the phone number) because I am familiar with Java. I am also not good with CSS selectors, so I used XPath instead. Hope it helps.

        driver.get("https://www.google.com/search?q=chiropractors%2Bnew%20york%2Bny&rflfq=1&tbm=lcl&tbs=lf:1,lf_ui:2&oll=40.754671143320074,-73.97722375000001&ospn=0.017814865199625274,0.040340423583984375&oz=15&fll=40.75807315356519,-73.99290368792725&fspn=0.01641614335274255,0.040340423583984375&fz=15&ved=0CJIBENAnahUKEwj1jtnmtcbHAhVTCo4KHfkkCYM&bav=on.2,or.r_cp.&biw=1360&bih=608&dpr=1&sei=y4LdVYvcFsa7uATo_LngCQ&ei=4YTdVbWaENOUuAT5yaSYCA&emsg=NCSR&noj=1&rlfi=hd:;si:#emsg=NCSR&rlfi=hd:;si:&sei=y4LdVYvcFsa7uATo_LngCQ");

        // 0. Not needed unless your connection to Google is slow.
        Thread.sleep(5000);


        // 1. The XPath on class '_gt' selects the divs of all 20 results on the left-hand side.
        //    This works in both IE and Firefox.
        //    (Needs java.util.List, org.openqa.selenium.By and org.openqa.selenium.WebElement.)
        List<WebElement> elements = driver.findElements(By.xpath("//div[@class='_gt']"));

        // 2. Iterate over the results, using 'data-cid' as the identifier.
        //    Note: this works only in Firefox; in IE the elements have no data-cid attribute.
        for (int i = 0; i < elements.size(); i++) {
            WebElement e = elements.get(i);

            // The first link inside the result carries the data-cid attribute.
            WebElement aTag = e.findElement(By.tagName("a"));
            String dataCid = aTag.getAttribute("data-cid");

            // 3. In Firefox, the div that contains the info we want can be identified by 'data-cid'.
            WebElement parentDivOfTable = driver.findElement(By.xpath("//div[@class='akp_uid_0' and @data-cid='" + dataCid + "']"));

            // 4. Get the information table. The leading '.' keeps the search inside parentDivOfTable;
            //    a bare '//' would search the whole page.
            WebElement table = parentDivOfTable.findElement(By.xpath(".//table[@class='_B5g']"));

            // Get the phone number. Assumption: it is in the element right after the 'Phone:' label.
            String phoneNum = table.findElement(By.xpath(".//span[text()='Phone:']/following-sibling::*[1]")).getText();
        }
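
When findElement is called on an element, as in step 4, an XPath that starts with `//` still searches from the document root, while `.//` restricts the search to that element's descendants. Below is a minimal sketch of the difference. It uses only the JDK's javax.xml.xpath rather than Selenium, and a made-up document with two panels; the markup, the class name XPathScopeDemo and the phone numbers are illustrative assumptions, not Google's actual DOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XPathScopeDemo {
    // Hypothetical markup: two result panels, each with its own phone row.
    static final String XML =
          "<root>"
        + "<div data-cid='1'><table><tr><td>"
        + "<span>Phone:</span><span>111-1111</span>"
        + "</td></tr></table></div>"
        + "<div data-cid='2'><table><tr><td>"
        + "<span>Phone:</span><span>222-2222</span>"
        + "</td></tr></table></div>"
        + "</root>";

    // Finds the panel with the given data-cid, then evaluates phoneXPath
    // with that panel as the context node and returns the matched text.
    public static String phone(String cid, String phoneXPath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xp = XPathFactory.newInstance().newXPath();
        Node panel = (Node) xp.evaluate(
                "//div[@data-cid='" + cid + "']", doc, XPathConstants.NODE);
        return xp.evaluate(phoneXPath, panel);
    }

    public static void main(String[] args) throws Exception {
        // '//' ignores the context node: the first match in the document wins.
        System.out.println(phone("2", "//span[text()='Phone:']/following-sibling::span[1]"));
        // './/' stays inside the panel that was passed as the context node.
        System.out.println(phone("2", ".//span[text()='Phone:']/following-sibling::span[1]"));
    }
}
```

Run against panel 2, the first call prints 111-1111 (the first match in the whole document) and the second prints 222-2222 (the match inside the panel).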