Question
I have been trying to use IE automation to run a Google search for a string of text held in Excel, and I want to return the hyperlink of the first result's website in another cell. Is this possible? I have a list of 60,000 records to search this way. Is there another approach that you would recommend? Thanks in advance for the help.
Answer 1:
Since there are 60,000 records, I recommend using the XMLHTTP object instead of IE.
HTTP requests are easier to work with and a lot faster than automating a browser.
Sub XMLHTTP()
    Dim url As String, lastRow As Long, i As Long
    Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
    Dim start_time As Date
    Dim end_time As Date

    lastRow = Range("A" & Rows.Count).End(xlUp).Row

    ' Now (date and time) keeps the elapsed time correct if the run crosses midnight
    start_time = Now
    Debug.Print "start_time:" & start_time

    For i = 2 To lastRow
        url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
        Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
        XMLHTTP.Open "GET", url, False
        XMLHTTP.setRequestHeader "Content-Type", "text/xml"
        XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
        XMLHTTP.send

        ' Parse the response with a late-bound HTML document
        Set html = CreateObject("htmlfile")
        html.body.innerHTML = XMLHTTP.ResponseText
        Set objResultDiv = html.getelementbyid("rso")

        ' The results container is missing when the response is not a normal results page
        If Not objResultDiv Is Nothing Then
            Set objH3 = objResultDiv.getelementsbytagname("h3")
            For Each link In objH3
                ' The class name depends on Google's markup and may need adjusting
                If link.className = "r" Then
                    Cells(i, 2) = link.innerText
                    Cells(i, 3) = link.getelementsbytagname("a")(0).href
                    Exit For ' keep only the first result
                End If
            Next
        End If
        DoEvents
    Next

    end_time = Now
    Debug.Print "end_time:" & end_time
    Debug.Print "Done. Time taken: " & DateDiff("n", start_time, end_time) & " minutes"
    MsgBox "Done. Time taken: " & DateDiff("n", start_time, end_time) & " minutes"
End Sub
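The macro above concatenates the cell text straight into the URL, so search strings containing spaces, ampersands or other reserved characters may produce a malformed query. A minimal sketch of one way to handle this, assuming Excel 2013 or later on Windows (where WorksheetFunction.EncodeURL is available). The helper name BuildSearchUrl is hypothetical and not part of the original answer.

```vba
' Hypothetical helper: builds the search URL with the query text percent-encoded.
' Assumes Excel 2013 or later, where WorksheetFunction.EncodeURL exists.
Function BuildSearchUrl(ByVal searchText As String) As String
    BuildSearchUrl = "https://www.google.co.in/search?q=" & _
                     WorksheetFunction.EncodeURL(searchText) & _
                     "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
End Function
```

Inside the loop, the URL line would then become url = BuildSearchUrl(Cells(i, 1).Value).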
Using a CSS selector (this version uses early binding, so it needs a reference to the Microsoft HTML Object Library):
Sub XMLHTTP1()
    Dim url As String, i As Long, lastRow As Long
    Dim XMLHTTP As Object, html As HTMLDocument, objResultDiv As HTMLAnchorElement

    lastRow = Range("A" & Rows.Count).End(xlUp).Row

    For i = 2 To lastRow
        url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
        Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
        XMLHTTP.Open "GET", url, False
        XMLHTTP.setRequestHeader "Content-Type", "text/xml"
        XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
        XMLHTTP.send

        ' A fresh document is created for each response
        Set html = New HTMLDocument
        html.body.innerHTML = XMLHTTP.ResponseText

        ' querySelector returns the first matching element, or Nothing when there is no match
        Set objResultDiv = html.querySelector("div#rso h3.r a")
        If Not objResultDiv Is Nothing Then
            Cells(i, 2) = objResultDiv.innerText
            Cells(i, 3) = objResultDiv.href
        End If
        DoEvents
    Next
End Sub
HTH
Santosh
Answer 2:
The links seem to be consistently within H3 tags. Normally you might use something like the following to wait until the page has loaded:
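' 64-bit Office (VBA7) requires PtrSafe on API declarations
#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
#End If

' With late binding the READYSTATE_COMPLETE constant is not defined, so declare it here
Private Const READYSTATE_COMPLETE As Long = 4

Sub UseIE()
    Dim ie As Object
    Dim thePage As Object
    Dim strTextOfPage As String

    Set ie = CreateObject("InternetExplorer.Application")
    'ie.FullScreen = True
    With ie
        '.Visible = True
        .Navigate "http://www.bbc.co.uk"
        While Not .ReadyState = READYSTATE_COMPLETE '4
            Sleep 500 'wait 1/2 sec before trying again
        Wend
    End With
    Set thePage = ie.Document
    'more code here
End Sub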
However, I would instead repeatedly try to reference the A element within the first H3: call getElementsByTagName("H3"), take the first of these elements, then look within it for the A link and its href attribute.
In JavaScript, an attempt to reference a non-existent element would return undefined, but in VBA it will probably need error-handling code.
Once I had obtained the href I would stop the navigation (not sure of the command for this, probably ie.Stop) or navigate to the next page immediately.
The first link(s) will, however, often be sponsored links, and the href returned is a little garbled. The text of these sponsored links appears to include em tags. I might use this information to discard these links and look further down the page.
I don't know if there is a better way to do this.
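The approach described in this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names FirstResultHref and WaitForFirstHref are hypothetical, the rule "a link containing an em tag is sponsored" is taken from the observation above and may not hold for every page, and collection lengths are checked in place of most error handling.

```vba
' Hypothetical helper (not from the original answer): returns the href of the
' first H3 link that contains no EM tag, or "" when no such link exists yet.
Function FirstResultHref(ByVal doc As Object) As String
    Dim headings As Object, anchors As Object
    Dim i As Long

    Set headings = doc.getElementsByTagName("h3")
    For i = 0 To headings.Length - 1
        Set anchors = headings(i).getElementsByTagName("a")
        If anchors.Length > 0 Then
            ' Assumption taken from the answer: sponsored links contain EM tags
            If anchors(0).getElementsByTagName("em").Length = 0 Then
                FirstResultHref = anchors(0).href
                Exit Function
            End If
        End If
    Next i
End Function

' Hypothetical helper: polls the page until a link is found or the timeout passes,
' then stops the navigation.
Function WaitForFirstHref(ByVal ie As Object, ByVal timeoutSeconds As Long) As String
    Dim deadline As Date
    Dim href As String

    deadline = DateAdd("s", timeoutSeconds, Now)
    Do
        ' The document may not be available while the page is still loading,
        ' so errors are ignored here and the attempt is simply repeated
        On Error Resume Next
        href = FirstResultHref(ie.Document)
        On Error GoTo 0
        If Len(href) > 0 Or Now > deadline Then Exit Do
        DoEvents
    Loop

    ie.Stop ' stop loading the rest of the page
    WaitForFirstHref = href
End Function
```

After a call such as ie.Navigate with the search URL, something like Cells(i, 3) = WaitForFirstHref(ie, 10) would write the link, or an empty string if nothing was found within ten seconds.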
Source: https://stackoverflow.com/questions/17495644/using-vba-in-excel-to-google-search-in-ie-and-return-the-hyperlink-of-the-first