html node parsing with ASP classic

陌路散爱 提交于 2019-12-11 07:43:29

问题


I stucked a day's trying to find a answer: is there a possibility with classic ASP, using MSXML2.ServerXMLHTTP.6.0 - to parse html code and extract a content of a HTML node by gived ID? For example:

remote html file:

<html>
.....
<div id="description">
some important notes here
</div>
.....
</html>

asp code

<%    
    ...
    Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    objHTTP.Open "GET", url_of_remote_html, False
    objHTTP.Send
    ...
%>

Now - i read a lot of docs, that there is a possibility to access HTML as source (objHTTP.responseText) and as structure (objHTTP.responseXML). But how in a world i can use that XML response to access content of that div? I read and try so many examples, but can not find anything clear that I can solve that.


回答1:


First up, perform the GET request as in your original code snippet:

Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
http.Open "GET", url_of_remote_html, False
http.Send

Next, create a regular expression object and set the pattern to match the inner html of an element with the desired id:

Set regEx = New RegExp
regEx.Pattern = "<div id=""description"">(.*?)</div>"
regEx.Global = True

Lastly, pull out the content from the first submatch within the first match:

On Error Resume Next
contents = regEx.Execute(http.responseText)(0).Submatches(0)
On Error Goto 0

If anything goes wrong and for example the matching element isn't found in the document, contents will be Null. If all went to plan contents should hold the data you're looking for.



来源:https://stackoverflow.com/questions/20311626/html-node-parsing-with-asp-classic

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!