Getting HTML Source with Excel-VBA

Posted by 六月ゝ 毕业季﹏ on 2020-02-05 15:50:15

Question


I would like to point an Excel VBA form at certain URLs, get the HTML source, and store that source in a string. Is this possible, and if so, how do I do it?


Answer 1:


Just an addition to the oXMLHTTP response shown further down this page. The question was how to get the HTML source, which that answer does not actually provide.

Compare the contents of oXMLHTTP.responseText with the source code in a browser for URL "http://finance.yahoo.com/q/op?s=T+Options". They do not match and even the returned values are different. (This should be executed after hours to avoid changes during the trading day.)

If I find a way to perform this task, I will post the basic code.




Answer 2:


Yes. One way to do it is to use the MSXML library. To do that, add a reference to the Microsoft XML library via Tools->References in the VBA editor.

Here's some code that displays the content of a given URL:

Public Sub ShowHTML(ByVal strURL As String)
    On Error GoTo ErrorHandler
    Dim strError As String
    strError = ""
    Dim oXMLHTTP As MSXML2.XMLHTTP
    Set oXMLHTTP = New MSXML2.XMLHTTP
    Dim strResponse As String
    strResponse = ""

    With oXMLHTTP
        .Open "GET", strURL, False
        .send ""
        If .Status <> 200 Then
            strError = .statusText
            GoTo CleanUpAndExit
        Else
            ' Match the media type only; the header often carries a charset suffix
            If InStr(1, .getResponseHeader("Content-Type"), "text/html", vbTextCompare) = 0 Then
                strError = "Not an HTML file"
                GoTo CleanUpAndExit
            Else
                strResponse = .responseText
            End If
        End If
    End With

CleanUpAndExit:
    On Error Resume Next ' Avoid recursive call to error handler
    ' Clean up code goes here
    Set oXMLHTTP = Nothing
    If Len(strError) > 0 Then ' Report any error
        MsgBox strError
    Else
        MsgBox strResponse ' MsgBox shows only about the first 1024 characters
    End If
    Exit Sub
ErrorHandler:
    strError = Err.Description
    Resume CleanUpAndExit
End Sub



Answer 3:


Compact getHTTP function

Below is a compact, generic function that returns the HTTP response from a specified URL. It can be used, for example, to:

  • return the HTML source of a web page,
  • get the JSON response from an API URL,
  • parse a text file at a URL, etc.

This does not require any VBA References since MSXML2 is used as a late-bound object.

Public Function getHTTP(ByVal url As String) As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False: .Send
        getHTTP = StrConv(.responseBody, vbUnicode)
    End With
End Function

Note that this basic function has no validation or error handling, as those are the parts that can vary considerably depending on which URL you're hitting.
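One caveat with the function above: StrConv(..., vbUnicode) converts bytes using the system's default ANSI code page, so a page served as UTF-8 can come back with garbled non-ASCII characters. Below is a minimal sketch of one workaround. It assumes the response really is UTF-8 and that the ADODB.Stream COM component is available on the machine (Windows only). The names DecodeUtf8 and getHTTPUtf8 are hypothetical and not part of the original answer.

```vba
' Hypothetical helper: decodes a byte array (passed in a Variant) as UTF-8.
' Assumes ADODB.Stream can be created via late binding.
Public Function DecodeUtf8(ByVal body As Variant) As String
    Const adTypeBinary As Long = 1
    Const adTypeText As Long = 2

    With CreateObject("ADODB.Stream")
        .Type = adTypeBinary        ' accept raw bytes first
        .Open
        .Write body
        .Position = 0               ' Type can only be changed at position 0
        .Type = adTypeText
        .Charset = "utf-8"          ' assumption: the page is UTF-8 encoded
        DecodeUtf8 = .ReadText
        .Close
    End With
End Function

' Hypothetical variant of getHTTP that decodes the body as UTF-8.
Public Function getHTTPUtf8(ByVal url As String) As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False: .Send
        getHTTPUtf8 = DecodeUtf8(.responseBody)
    End With
End Function
```

If the server declares a different charset in its Content-Type header, the "utf-8" literal would need to change accordingly.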

If desired, check the value of .Status after the .Send call for success codes like 0 or 200. You can also set up an error trap with On Error GoTo... (never Resume Next!)
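The status check and error trap described above could be sketched roughly as follows. This is only one way to do it: the names IsSuccessStatus and getHTTPChecked, the custom error number, and the choice to re-raise the error to the caller are all assumptions rather than part of the original answer.

```vba
' Hypothetical helper: treats 0 and 200 as success, as suggested in the text above.
Public Function IsSuccessStatus(ByVal statusCode As Long) As Boolean
    IsSuccessStatus = (statusCode = 0) Or (statusCode = 200)
End Function

' Hypothetical variant of getHTTP that raises an error on a non-success status.
Public Function getHTTPChecked(ByVal url As String) As String
    Dim http As Object
    On Error GoTo Fail

    Set http = CreateObject("MSXML2.XMLHTTP")   ' late bound, no reference needed
    http.Open "GET", url, False                 ' False = synchronous request
    http.Send

    If Not IsSuccessStatus(http.Status) Then
        ' Arbitrary custom error number chosen for this sketch
        Err.Raise vbObjectError + 513, "getHTTPChecked", _
                  "HTTP " & http.Status & " " & http.statusText
    End If

    getHTTPChecked = StrConv(http.responseBody, vbUnicode)
    Exit Function

Fail:
    ' Re-raise to the caller with the URL attached to the message
    Err.Raise Err.Number, "getHTTPChecked", Err.Description & " (" & url & ")"
End Function
```

The caller can then wrap getHTTPChecked in its own On Error GoTo handler and decide whether to retry, log, or show a message.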


Example Usage:

This procedure scrapes a Stack Overflow page for the current score of the answer identified by answerID.

Sub demo_getVoteCount()
    Const answerID$ = 2522760
    Const url_SO = "https://stackoverflow.com/a/" & answerID
    Dim html As String, startPos As Long, voteCount As Variant

    html = getHTTP(url_SO)                                  'get html from url

    startPos = InStr(html, "answerid=""" & answerID)        'locate this answer
    startPos = InStr(startPos, html, "vote-count-post")     'locate vote count
    startPos = InStr(startPos, html, ">") + 1               'locate value

    voteCount = Mid(html, startPos, InStr(startPos, html, "<") - startPos) 'extract score
    MsgBox "Answer #" & answerID & " has a score of " & voteCount & "."
End Sub

(Of course, in reality there are far better ways to get the score of an answer than the example above.)
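One fragile point in the demo is that InStr returns 0 when a marker is not found, and passing 0 as the start position to the next InStr call raises run-time error 5. A small helper along these lines can guard against that. ExtractBetween is a hypothetical name introduced here for illustration, not something from the original answer.

```vba
' Hypothetical helper: returns the text between startMarker and endMarker,
' searching from position fromPos. Returns "" if either marker is missing.
Public Function ExtractBetween(ByVal src As String, ByVal startMarker As String, _
                               ByVal endMarker As String, _
                               Optional ByVal fromPos As Long = 1) As String
    Dim startPos As Long, endPos As Long

    If fromPos < 1 Then Exit Function               ' InStr needs a start of 1 or more
    startPos = InStr(fromPos, src, startMarker)
    If startPos = 0 Then Exit Function              ' start marker not found

    startPos = startPos + Len(startMarker)          ' skip past the marker itself
    endPos = InStr(startPos, src, endMarker)
    If endPos = 0 Then Exit Function                ' end marker not found

    ExtractBetween = Mid$(src, startPos, endPos - startPos)
End Function
```

For example, ExtractBetween("<b>42</b>", "<b>", "</b>") returns "42", while a missing marker yields an empty string instead of a run-time error.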



Source: https://stackoverflow.com/questions/2520949/getting-html-source-with-excel-vba
