Could somebody please provide an example of parsing HTML into a list of elements using XMLWorkerHelper in iTextSharp (C#)?
The JAVA version as given in the documenta
You need to implement the IElementHandler interface in a class of your own:
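using System.Collections.Generic;
using iTextSharp.text;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.pipeline;

public class SampleHandler : IElementHandler {
    //Generic list of elements
    public List<IElement> elements = new List<IElement>();
    //Add the supplied item to the list
    public void Add(IWritable w) {
        if (w is WritableElement) {
            elements.AddRange(((WritableElement)w).Elements());
        }
    }
}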
Instead of using a file stream, here's an example that parses a string. To use a file, replace the StringReader with a StreamReader.
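//Sample XHTML input
string html = "<html><head><title>Test Document</title></head><body><p>This is a test. <b>Bold</b> and <i>italic</i></p><ul><li>Dog</li><li>Cat</li></ul></body></html>";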
//Instantiate our handler
var mh = new SampleHandler();
//Bind a reader to our text
using (TextReader sr = new StringReader(html)) {
    //Parse
    XMLWorkerHelper.GetInstance().ParseXHtml(mh, sr);
}
//Loop through each element
foreach (var element in mh.elements) {
    //Loop through each chunk in each element
    foreach (var chunk in element.Chunks) {
        //Do something
    }
}
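
For the file-based variant mentioned above, a minimal sketch might look like the following. It assumes the SampleHandler class from this answer is compiled into the same project with its list typed as List<IElement>. The FileParseExample class, the ParseFile helper, and the "input.html" path are hypothetical names used only for illustration. Printing Chunk.Content is just one possible stand-in for "do something".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.tool.xml;

public static class FileParseExample {
    //Hypothetical helper: parse an XHTML file and return the collected elements
    public static List<IElement> ParseFile(string path) {
        //SampleHandler is the IElementHandler implementation shown earlier
        var handler = new SampleHandler();
        //StreamReader takes the place of StringReader when reading from disk
        using (TextReader reader = new StreamReader(path)) {
            XMLWorkerHelper.GetInstance().ParseXHtml(handler, reader);
        }
        return handler.elements;
    }

    public static void Main() {
        //"input.html" is a placeholder path
        foreach (IElement element in ParseFile("input.html")) {
            //Write out the text of every chunk in the element
            foreach (Chunk chunk in element.Chunks) {
                Console.WriteLine(chunk.Content);
            }
        }
    }
}
```

StreamReader decodes the file as UTF-8 unless told otherwise, so a file saved in another encoding would need the StreamReader(path, encoding) overload.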