How to automate and export HTML searchable query data into excel [closed]

浪子不回头ぞ 提交于 2019-12-10 18:33:02

问题


A webpage I am interested in extracting data from has a table with multiple search fields. I can enter data into any of these fields and click the search button at the bottom of the table and see the results based on the information I wanted to search for.

I have multiple numbers i want to search for (around 300), instead of searching each of these individually, is there a way to automate searching the data and import the data into an excel worksheet for each number I want to search?

is it possible using an Excel macro?


回答1:


You can use the MSXML and MSHTML libraries for this. This code should get you started.
Start by running this sub to add both references (you only need to run this once):

Sub addReferences()
    ActiveWorkbook.VBProject.References.AddFromGuid "{3050F1C5-98B5-11CF-BB82-00AA00BDCE0B}", 4, 0
    ActiveWorkbook.VBProject.References.AddFromGuid "{F5078F18-C551-11D3-89B9-0000F81FE221}", 6, 0
End Sub

Then edit the getCAGEValues sub to import your CAGE codes and save the resulting data (and any additional data you want from the page):

Sub getCAGEValues()
    Dim oHTMLDoc As MSHTML.HTMLDocument
    Dim oSpan As MSHTML.HTMLGenericElement
    Dim CAGECodes() As Variant
    CAGECodes = Array("12345", "12346") 'CAGECodes is an array of your codes'
    For Each CAGECode In CAGECodes
        Set oHTMLDoc = getPage(CAGECode)
        Set oSpan = oHTMLDoc.getElementById("ctl00_cphMainPageBody_lblCompNameData") 'The id for the company name'
        MsgBox oSpan.innerText 'Save the value however you want to.'
    Next
End Sub

Function getPage(CAGECode As Variant) As MSHTML.HTMLDocument
    Dim oHttpRequest As MSXML2.XMLHTTP60
    Set oHttpRequest = New MSXML2.XMLHTTP60
    With oHttpRequest
        .Open "GET", "http://www.logisticsinformationservice.dla.mil/BINCS/details.aspx?CAGE=" & CAGECode, False
        .setRequestHeader "Cache-Control", "no-cache"
        .setRequestHeader "Pragma", "no-cache"
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
        .send
    End With
    Dim oHTMLDoc As MSHTML.HTMLDocument
    Set oHTMLDoc = New MSHTML.HTMLDocument
    oHTMLDoc.body.innerHTML = oHttpRequest.responseText
    Set getPage = oHTMLDoc
End Function


来源:https://stackoverflow.com/questions/13364172/how-to-automate-and-export-html-searchable-query-data-into-excel

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!