Project Description
Normalizes HTML into XML that can be parsed and manipulated.


HTML can be really ugly. Even valid HTML can be (and most often is) invalid XML. This small, fast library parses the HTML tree and produces XML that can be read into System.Xml.XmlDocument or System.Xml.Linq.XDocument.

Consider the following ugly HTML:
     <p>First paragraph
     <p style=color:red>Second paragraph


Running this command:
XHTML.ToXml(html, XHTML.Options.Default | XHTML.Options.Pretty);

Will produce the following XML:
<html>
  <body>
      <p>First paragraph
                    </p>
      <p style="color:red">Second paragraph</p>
  </body>
</html>
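
Once the markup is well-formed XML, the standard .NET XML APIs can work with it. The following is a minimal sketch of reading the output above into System.Xml.Linq.XDocument and listing the paragraphs. To keep it runnable without the library, the XML is pasted in as a string literal (with whitespace trimmed); in real use it would presumably come from XHTML.ToXml, assuming that call returns a string, which the description above implies but does not state.

```csharp
using System;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Stand-in for the library's output shown above. In real use this
        // would presumably be the result of XHTML.ToXml(html, ...), assuming
        // that call returns the XML as a string.
        string xml =
            "<html><body>" +
            "<p>First paragraph</p>" +
            "<p style=\"color:red\">Second paragraph</p>" +
            "</body></html>";

        // XDocument.Parse throws XmlException if the input is not
        // well-formed, which is why raw HTML usually cannot be loaded directly.
        XDocument doc = XDocument.Parse(xml);

        foreach (XElement p in doc.Descendants("p"))
        {
            // Casting a missing attribute to string yields null.
            string style = (string)p.Attribute("style") ?? "(none)";
            Console.WriteLine("{0} [style: {1}]", p.Value.Trim(), style);
        }
    }
}
```

The same string could instead be loaded with XmlDocument.LoadXml if the older DOM-style API is preferred.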

Last edited Jan 14, 2011 at 4:32 PM by andyedinborough, version 2