HXT: Surprising behavior when reading and writing HTML to String in pure code

Posted by 纵饮孤独 on 2019-12-19 07:55:51

Question


I want to read HTML from a String, process it and return the changed document as a String using HXT. As this operation does not require IO, I would rather execute the Arrow with runLA than with runX.

The code looks like this (omitting the processing for simplicity):

runLA (hread >>> writeDocumentToString [withOutputHTML, withIndent yes]) html

However, the surrounding html tag is missing in the result:

["\n  <head>\n    <title>Bogus</title>\n  </head>\n  <body>\n        Some trivial bogus text.\n    </body>\n",""]

When I use runX instead like this:

runX (readString [] html >>> writeDocumentToString [withOutputHTML, withIndent yes])

I get the expected result:

["<html>\n  <head>\n    <title>Bogus</title>\n  </head>\n  <body>\n        Some trivial bogus text.\n    </body>\n</html>\n"]

Why is that, and how can I fix it?


Answer 1:


If you look at the XmlTrees for both, you'll see that readString adds a top-level "/" element. For the non-IO runLA version:

> putStr . formatTree show . head $ runLA xread html
---XTag "html" []
   |
   +---XText "\n  "
   |
   +---XTag "head" []
   ...

And with runX:

> putStr . formatTree show . head =<< runX (readString [] html)
---XTag "/" [NTree (XAttr "transfer-Status") [NTree (XText "200")...
   |
   +---XTag "html" []
       |
       +---XText "\n  "
       |
       +---XTag "head" []
       ...

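writeDocumentToString uses getChildren to strip off this root element before serializing. The tree produced by xread has no such root, so in the runLA version it is the html element itself that gets stripped, and only its children are printed.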
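The effect of that getChildren step can be seen in isolation with a small sketch. The sample string and the names topLevelNames and namesAfterGetChildren are made up for illustration and are not part of the original question; the sketch only assumes the hxt package with Text.XML.HXT.Core in scope.

```haskell
import Text.XML.HXT.Core

-- Hypothetical sample input; the document used in the question is not shown.
sample :: String
sample = "<html><head><title>Bogus</title></head><body>Some trivial bogus text.</body></html>"

-- Names of the elements at the top level of what xread produces.
topLevelNames :: [String]
topLevelNames = runLA (xread >>> isElem >>> getName) sample

-- Names of the elements left once getChildren has dropped the top node,
-- which is roughly what the serializer sees when no "/" root is present.
namesAfterGetChildren :: [String]
namesAfterGetChildren = runLA (xread >>> getChildren >>> isElem >>> getName) sample
```

If this behaves as described, the first list should contain only the html element, while the second should contain its children, which matches the missing html tag in the runLA output.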

One easy way around this is to use something like selem to wrap the output of xread in a similar root element, in order to make it look like the kind of input writeDocumentToString expects:

> runLA (selem "/" [xread] >>> writeDocumentToString [withOutputHTML, withIndent yes]) html
["<html>\n  <head>\n    <title>Bogus</title>\n  </head>\n  <body>\n        Some trivial bogus text.\n    </body>\n</html>\n"]

This produces the desired output.
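Since the question left out the processing step, here is one possible way to package the workaround as a pure function with a slot for that step. The names transformHtml and unchanged are hypothetical, and the sketch assumes the input is well-formed enough for xread, which parses XML content rather than lenient HTML.

```haskell
import Text.XML.HXT.Core

-- Hypothetical helper: parse a String, apply a pure transformation to each
-- parsed tree, wrap the results in a "/" root, and serialize back to a String.
transformHtml :: LA XmlTree XmlTree -> String -> String
transformHtml process =
    concat . runLA
        ( selem "/" [xread >>> process]
          >>> writeDocumentToString [withOutputHTML, withIndent yes]
        )

-- Identity processing: `this` passes every tree through unchanged.
unchanged :: String -> String
unchanged = transformHtml this
```

Placing the processing arrow inside the selem wrapper keeps it working on the same trees that xread returns, so the artificial root only exists for the benefit of writeDocumentToString.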



Source: https://stackoverflow.com/questions/7208559/hxt-surprising-behavior-when-reading-and-writing-html-to-string-in-pure-code
