VBA-Web Scraping- Can't acces table web page

房东的猫 提交于 2020-12-15 04:57:27

问题


I tried to scrape the data prices table in this web https://www.energylive.cloud/ , like I did in other webs, but I can't (I don't have much experience scraping). Thanks in advance!!!:

Sub ej()

Dim XMLrequest As New MSXML2.XMLHTTP60
Dim HTMLdoc As New MSHTML.HTMLDocument
Dim HTMLtable As MSHTML.IHTMLElement
'Dim HTMLi As MSHTML.IHTMLElementCollection


Dim url As String

url = "https://www.energylive.cloud/"

XMLrequest.Open "GET", url, False   
XMLrequest.send

If XMLrequest.Status <> 200 Then    
    MsgBox XMLrequest.Status & XMLrequest.statusText
End If

HTMLdoc.body.innerHTML = XMLrequest.responseText

'debug.print htmldoc.body.innerText    'I checked here but the table is not here

Set HTMLtable = HTMLdoc.getElementById("price_table")


    'Debug.Print HTMLtable.ID


End Sub

回答1:


The content you look for is not available in that page. It's added dynamically. This is the link where you can find the desired content which are static that you can grab using xhr. To find out that link you need to make use of chrome dev tools or something similar. After opening dev tools, select network tab and then try reloading the page to observe network activity within All or xhr where you should find that link.

It's not that easy to parse required content out of json response especially when you are using vba as there is no such built-in library to help you grab them. The more common approach though is to go for any third party json converter.

However, I've used regex here which seems to have grabbed the data flawlessly. When you run the script, you should get all the tabular content with the blink of an eye.

Sub FetchTabularData()
    Const mainUrl$ = "https://www.energylive.cloud/pwr-hour/get-index-averages?callback=%3F"
    Dim I&, S$, Elem As Object, subElemName As Object
    Dim subElemChange As Object, subElemPrice As Object
    Dim subElemMtd As Object, subElemYtd As Object, R As Long: R = 1
    Dim ws As Worksheet: Set ws = ThisWorkbook.Worksheets("Sheet1")

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", mainUrl, False
        .send
        S = .responseText
    End With
    
    ws.Range("A1:E1") = [{"Index","Value","Changes","Month To Date","Year To Date"}]
    
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True

        .Pattern = "\[?{[\s\S]+?\},?"
        Set Elem = .Execute(S)
        For I = 0 To Elem.count - 1
            .Pattern = "Index""\:""(.*?)"","
            Set subElemName = .Execute(Elem(I))
            .Pattern = "Value""\:""(.*?)\"","
            Set subElemPrice = .Execute(Elem(I))
            .Pattern = "Perc""\:""(.*?)"","
            Set subElemChange = .Execute(Elem(I))
            .Pattern = "Month-to-date""\:""(.*?)"","
            Set subElemMtd = .Execute(Elem(I))
            .Pattern = "Year-to-date""\:""(.*?)"""
            Set subElemYtd = .Execute(Elem(I))

            R = R + 1: ws.Cells(R, 1) = subElemName(0).submatches(0)
            ws.Cells(R, 2) = subElemPrice(0).submatches(0)
            ws.Cells(R, 3) = subElemChange(0).submatches(0) & "%"
            ws.Cells(R, 4) = subElemMtd(0).submatches(0)
            ws.Cells(R, 5) = subElemYtd(0).submatches(0)
        Next I
    End With
End Sub

PS You don't need to add any reference to the library to execute the above script. Just make sure you have a sheet named Sheet1 in your excel workbook.



来源:https://stackoverflow.com/questions/65096455/vba-web-scraping-cant-acces-table-web-page

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!