How do I get content from a table using its ID with a regex?

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-08 03:57:10

Question


I need to sift through an HTML string to extract the content I need. Specifically, I need to loop through the rows of a table that has an ID. How do I do this with a regex?


Answer 1:


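Regular expressions are the wrong tool for parsing HTML: HTML is not a regular language, so a regex cannot reliably handle nesting, optional attributes, or varying whitespace. Use a proper HTML parser library instead.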
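
As a rough illustration of the parser approach, here is a minimal Python sketch that uses only the standard library's html.parser module. The class name TableRowCollector, the function rows_for_table, the table id "fruits", and the sample markup are all assumptions made up for this example; none of them come from the question. It is a simplification: text inside nested tables is ignored.

```python
from html.parser import HTMLParser


class TableRowCollector(HTMLParser):
    """Collect the cell text of each row in the table with a given id.

    Hypothetical helper written for this example only.
    """

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0        # 0 = outside the target table, 1 = directly inside it
        self.in_cell = False  # True while inside a <td>/<th> of the target table
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth > 0:
                self.depth += 1          # a nested table: track it, but skip its content
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1           # entered the table we are looking for
        elif self.depth == 1:
            if tag == "tr":
                self.rows.append([])
            elif tag in ("td", "th") and self.rows:
                self.rows[-1].append("")
                self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and self.depth == 1:
            self.rows[-1][-1] += data


def rows_for_table(html, table_id):
    collector = TableRowCollector(table_id)
    collector.feed(html)
    collector.close()
    return collector.rows


sample = """
<table id="other"><tr><td>x</td></tr></table>
<table id="fruits">
  <tr><td>1</td><td>Apple</td></tr>
  <tr><td>2</td><td>Ball</td></tr>
</table>
"""

for row in rows_for_table(sample, "fruits"):
    print(row)
```

Because the parser tracks tags rather than raw text, extra attributes, line breaks between tags, or other tables on the page do not affect the result.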




Answer 2:


It depends on how regular the HTML text is. For example, given this table:

<table>
  <tr><td>1</td><td>Apple</td></tr>
  <tr><td>2</td><td>Ball</td></tr>
  <tr><td>3</td><td>Cookie</td></tr>
</table>

The following regular expression matches the contents of the first cell in each row, which here are the IDs:

(?<=<tr><td>).*?(?=</td>)
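
To show how this pattern behaves, and how easily it breaks, here is a small Python sketch using the standard re module. The variable names and the reformatted variant of the markup are assumptions added for illustration.

```python
import re

html = (
    "<table>\n"
    "  <tr><td>1</td><td>Apple</td></tr>\n"
    "  <tr><td>2</td><td>Ball</td></tr>\n"
    "  <tr><td>3</td><td>Cookie</td></tr>\n"
    "</table>\n"
)

# Lookbehind: the match must be preceded by "<tr><td>".
# Lazy ".*?" plus lookahead: stop at the first "</td>" that follows.
first_cell = re.compile(r"(?<=<tr><td>).*?(?=</td>)")

ids = first_cell.findall(html)
print(ids)  # ['1', '2', '3']

# The same table with a line break between <tr> and <td> no longer matches,
# because the lookbehind demands the two tags be directly adjacent.
reformatted = html.replace("<tr><td>", "<tr>\n    <td>")
print(first_cell.findall(reformatted))  # []
```

The second result is why this only works when the HTML is known to be laid out in exactly one way.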



Answer 3:


If you run the page through an HTML parser such as BeautifulSoup, you can prettify the markup so that this kind of regex has a chance. But if you are parsing the HTML anyway...




Answer 4:


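Try this:

' These two Imports lines belong at the top of the source file.
Imports System.Text
Imports System.Text.RegularExpressions

' contentText is assumed to hold the HTML source as a String.
Dim html As String = contentText
' Singleline lets "." match newlines, so a table may span several lines.
Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
' Lazy (.*?) stops at the first </table>; a greedy (.*) would run on to the last one.
Dim tableRegex As New Regex("<table[^>]*>(.*?)</table>", options)
Dim matches As MatchCollection = tableRegex.Matches(html)
Dim sb As New StringBuilder()
For Each item As Match In matches
    ' Append each matched table, including its tags, followed by a line feed.
    sb.Append(item.Value & vbLf)
Next
TextBox.Text = sb.ToString()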
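
The snippet above matches every table on the page, while the question asks for the rows of one table identified by its ID. One possible way to narrow it down, sketched here in Python rather than VB.NET, is to first isolate the table whose id attribute matches and then pull the rows and cells out of that fragment. The function name rows_of_table, the id "fruits", and the sample markup are hypothetical. The sketch assumes a quoted id attribute and no nested tables, and it would also match an attribute such as data-id.

```python
import re


def rows_of_table(html, table_id):
    """Return the cell text of each row in the table with the given id.

    Regex-based sketch: assumes a quoted id attribute and no nested tables.
    """
    table_re = re.compile(
        r"<table\b[^>]*\bid\s*=\s*[\"']" + re.escape(table_id) + r"[\"'][^>]*>(.*?)</table>",
        re.IGNORECASE | re.DOTALL,
    )
    table = table_re.search(html)
    if table is None:
        return []

    # Work only on the inner markup of the matched table.
    row_re = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
    cell_re = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]>", re.IGNORECASE | re.DOTALL)
    return [
        [cell.strip() for cell in cell_re.findall(row)]
        for row in row_re.findall(table.group(1))
    ]


sample = """
<table id="other"><tr><td>x</td></tr></table>
<table id="fruits" class="t">
  <tr><td>1</td><td>Apple</td></tr>
  <tr><td>2</td><td> Ball </td></tr>
</table>
"""

for row in rows_of_table(sample, "fruits"):
    print(row)
```

This remains subject to the caveats raised in the other answers: it is only as reliable as the markup is predictable.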


Source: https://stackoverflow.com/questions/2085110/how-do-i-get-content-from-a-table-using-its-id-with-a-regex
