Question
I'm using PHPExcel to transfer data between a MySQL DB and Excel 2007 worksheets. It works well in most situations, but I've run into one problem.
Some of the fields in the DB contain HTML data. I need to preserve the formatting in Excel cells as much as possible. As far as I could figure out, Excel allows the following formatting inside cells (the PHPExcel_RichText class supports all of these): new lines (these can be used to track <p></p> blocks), font name, size, color, bold, italic, underline, strikethrough, subscript, superscript. Suppose these are enough, so we can ignore other HTML formatting.
What is the best (easiest, fastest) way to convert HTML data to Excel Rich Text and vice versa?
One solution I have in mind is to create a function that will traverse the HTML (using DOMDocument or similar), place \n after block elements, create PHPExcel_RichText_Run objects for <b>, <i>, etc., and ignore all other elements. I feel this will be quite "expensive", especially when dealing with nested structures like <b>some <i>formatted</i> text</b>.
Is there any better way to do this, with or without PHPExcel?
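A minimal sketch of the traversal idea above, assuming only PHP's bundled DOM extension. The function names htmlToRuns and collectRuns, the tag-to-style map, and the array shape of a "run" are hypothetical choices for illustration and not part of PHPExcel. Nested tags are handled by passing the inherited style down the recursion, so each text node becomes exactly one flat run:

```php
<?php
// Hypothetical helper: flatten an HTML fragment into a flat list of styled runs.
function htmlToRuns(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy markup without warnings
    // The XML declaration is a common trick to make loadHTML() read UTF-8;
    // the wrapper <div> gives the fragment a single known root.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();

    $root = $doc->getElementsByTagName('div')->item(0);
    $runs = [];
    collectRuns($root, [], $runs);
    return $runs;
}

// Depth-first walk: $style holds the formatting inherited from ancestor tags.
function collectRuns(DOMNode $node, array $style, array &$runs): void
{
    // Assumed mapping of inline tags to style flags; extend as needed.
    static $inline = [
        'b' => 'bold', 'strong' => 'bold',
        'i' => 'italic', 'em' => 'italic',
        'u' => 'underline', 's' => 'strike',
        'sub' => 'subscript', 'sup' => 'superscript',
    ];
    // Assumed set of elements that end with a line break.
    static $block = ['p', 'div', 'li', 'br'];

    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            if ($child->nodeValue !== '') {
                $runs[] = ['text' => $child->nodeValue, 'style' => $style];
            }
            continue;
        }
        if (!($child instanceof DOMElement)) {
            continue;                    // skip comments, processing instructions
        }
        $tag = strtolower($child->nodeName);
        $childStyle = $style;
        if (isset($inline[$tag])) {
            $childStyle[$inline[$tag]] = true;
        }
        collectRuns($child, $childStyle, $runs);   // unknown tags keep their text only
        if (in_array($tag, $block, true)) {
            $runs[] = ['text' => "\n", 'style' => []];
        }
    }
}

// With PHPExcel loaded, each run could then be mapped roughly like this:
//   $richText = new PHPExcel_RichText();
//   $run = $richText->createTextRun($r['text']);
//   $run->getFont()->setBold(!empty($r['style']['bold']));
//   $run->getFont()->setItalic(!empty($r['style']['italic']));
$runs = htmlToRuns('this is <b>some <i>formatted</i> text</b>');
```

For the nested example this should give four runs, with only "formatted" carrying both the bold and italic flags. The walk visits each node once, so the cost grows linearly with the size of the markup rather than with nesting depth.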
One more idea: I noticed that when exporting in XML Spreadsheet 2003 format the following appears inside XML:
<ss:Data ss:Type="String"
xmlns="http://www.w3.org/TR/REC-html40"><Font html:Color="#000000">this is </Font><B><Font
html:Color="#000000">some </Font><I><Font html:Color="#000000">formatted</Font></I><Font
html:Color="#000000"> text</Font></B></ss:Data>
which is normal HTML4. I mean, it seems that Excel can understand plain HTML. So maybe there is some way to pass HTML directly to Excel without converting it to PHPExcel_RichText objects... (although note that it would be best if I could export to .xlsx format)
Answer 1:
HTML to Rich-Text Runs is on the PHPExcel development roadmap for the coming year: however, the planned method was to use DOMDocument to parse the markup.
Any solution that we adopt for PHPExcel itself will have to use RichText Runs to provide consistency. While MS Excel itself can handle direct imports of HTML (as you've noted in the SpreadsheetML XML format offered by Excel 2003), this isn't consistent across the other Excel formats (BIFF and OfficeOpenXML).
Source: https://stackoverflow.com/questions/9044708/convert-html-to-excel-rich-text-and-vice-versa