Scrape with xmlhttp

╄→尐↘猪︶ㄣ submitted on 2021-02-20 03:50:41

Question


I would like to get data from https://www.goaloong.net/football/6in1. This page contains a table.

I tried this:

Sub REQUESTXML()

Dim XMLHttpRequest As xmlHttp
Dim HTMLDoc As New HTMLDocument
Dim elem As Object
Dim x As Long

Set XMLHttpRequest = New MSXML2.xmlHttp
XMLHttpRequest.Open "GET", "https://www.goaloong.net/football/6in1", False
XMLHttpRequest.send
While XMLHttpRequest.readyState = 200
    DoEvents
Wend

Debug.Print XMLHttpRequest.responseText
HTMLDoc.Body.innerHTML = XMLHttpRequest.responseText

x = 1

For Each elem In HTMLDoc.getElementsByClassName("Leaguestitle")

    Sheets("req").Range("A" & x).Value = HTMLDoc.getElementsByTagName("a")(0).innerText

    x = x + 1

Next elem

End Sub

I get no result.

Could you please help?


Answer 1:


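The page https://www.goaloong.net/football/6in1 is dynamic: the JavaScript files are loaded first, and those scripts then load the content. An XMLHTTP request does not run those scripts, so its response contains only the initial HTML without the table. One approach is to load the full page in Internet Explorer and read the content from the rendered document. Example below (tested):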

Sub REQUESTXML()
    'requires a reference to "Microsoft Internet Controls" (Tools > References)
    Dim IE As New InternetExplorer
    Dim elem As Object
    Dim x As Long
    
    IE.navigate "https://www.goaloong.net/football/6in1"
    
    'wait until navigation has finished and the document is fully loaded
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE: DoEvents: Loop
    
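    'for debugging: dump the rendered HTML next to the workbook
    Dim fileNum As Integer
    fileNum = FreeFile 'next free file number instead of a hard-coded #1
    Open ThisWorkbook.Path & "\TESTFILE.html" For Output As #fileNum
    Print #fileNum, IE.document.body.innerHTML
    Close #fileNum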
    
    x = 1
    For Each elem In IE.document.getElementsByClassName("Leaguestitle")
        Sheets(1).Range("A" & x).Value = elem.innerText
        x = x + 1
    Next elem

    IE.Quit
End Sub
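
Because the rows are filled in by scripts, the document may report that it has finished loading before the elements actually exist. Whether that happens on this particular page is an assumption, not something the answer states. The sketch below polls for the class name with a timeout; WaitForClass and its parameters are hypothetical names introduced here, not part of the original answer.

```vba
'Sketch only: WaitForClass is a hypothetical helper.
'Requires a reference to "Microsoft Internet Controls".
Function WaitForClass(IE As InternetExplorer, className As String, _
                      Optional timeoutSeconds As Double = 15) As Boolean
    Dim startTime As Double
    Dim elapsed As Double

    startTime = Timer
    Do
        DoEvents

        'the document can be unavailable for a moment, so ignore errors here
        WaitForClass = False
        On Error Resume Next
        WaitForClass = (IE.document.getElementsByClassName(className).Length > 0)
        On Error GoTo 0
        If WaitForClass Then Exit Function

        'Timer counts seconds since midnight, so correct for a wrap-around
        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400
    Loop While elapsed < timeoutSeconds
End Function
```

In the Sub above, this could be called right after the readyState wait, for example: If Not WaitForClass(IE, "Leaguestitle") Then MsgBox "No matching elements found". The For Each loop then only runs once the elements are present, or the timeout makes the failure visible instead of leaving the sheet silently empty.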



Answer 2:


If you're OK with using a DLL and rewriting your code, you can drive Microsoft's Edge browser (a Chromium-based browser) from VBA. With that you can do almost anything you want. Note, however, that the DOM is accessed through JavaScript, not through an object such as Dim IE As New InternetExplorer. Look at the VBA sample and you'll get the idea.

https://github.com/peakpeak-github/libEdge

Sidenote: Samples for C# and C++ are also included.



Source: https://stackoverflow.com/questions/66045395/scrape-with-xmlhttp
