Question
I have to fix problems in densely packed HTML that is effectively unreadable, so I'm looking for a library that can pretty-print it (format, beautify, or whatever you want to call it) from within the .NET application that manages this HTML.
At the moment I copy and paste it into Visual Studio 2012, format it there, and then paste it back into the application, but that's becoming tedious.
It would also be handy if the library could reverse the process and strip out all the whitespace once I've fixed the problems.
Incidentally, I'm aware that changing the formatting of HTML can sometimes lead to unexpected results (I'm looking at you, IE), but I can live with that.
Answer 1:
Check out Html Tidy for .NET/Mono
From the project page:
TidyManaged
This is a managed .NET/Mono wrapper for the open source, cross-platform Tidy library, a HTML/XHTML/XML markup parser & cleaner originally created by Dave Raggett.
And sample usage:
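    using System;
    using TidyManaged;

    public class Test
    {
        public static void Main(string[] args)
        {
            // The input is deliberately malformed (mismatched </tootle>, no closing </html>).
            using (Document doc = Document.FromString("<hTml><title>test</tootle><body>asd</body>"))
            {
                doc.ShowWarnings = false;   // suppress Tidy's warning output
                doc.Quiet = true;           // suppress Tidy's summary messages
                doc.OutputXhtml = true;     // emit well-formed XHTML
                doc.CleanAndRepair();       // fix up the markup
                string parsed = doc.Save(); // serialize the repaired document to a string
                Console.WriteLine(parsed);
            }
        }
    }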
Looks like it should meet your needs perfectly.
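For the reverse direction mentioned in the question (stripping the whitespace back out), a rough sketch using only the .NET base class library might look like the following. The class name `HtmlWhitespace` and the method `StripBetweenTags` are made up for illustration and are not part of TidyManaged. It only removes whitespace that sits entirely between two tags, so it can still change rendering between inline elements and inside `<pre>` blocks that contain markup.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of TidyManaged or any other library.
public static class HtmlWhitespace
{
    // Matches a run of whitespace that lies entirely between '>' and '<'.
    private static readonly Regex BetweenTags = new Regex(@">\s+<", RegexOptions.Compiled);

    // Trims the document and removes whitespace-only gaps between adjacent tags.
    // Text content inside elements is left untouched.
    public static string StripBetweenTags(string html)
    {
        if (html == null) throw new ArgumentNullException("html");
        return BetweenTags.Replace(html.Trim(), "><");
    }
}
```

Calling `HtmlWhitespace.StripBetweenTags("<ul>\n  <li>a</li>\n  <li>b</li>\n</ul>\n")` yields `<ul><li>a</li><li>b</li></ul>`. A regex is a blunt tool here; if exact rendering matters, a parser-based approach would be safer.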
Source: https://stackoverflow.com/questions/15120887/looking-for-an-offline-library-to-format-html-that-i-can-use-with-net-code