hxt

HXT getting first element: refactor weird arrow

我是研究僧i posted on 2020-01-15 06:43:23
Question: I need to get the text contents of the first <p> that is a child of <div class="about">, and wrote the following code:

    tagTextS :: IOSArrow XmlTree String
    tagTextS = getChildren >>> getText >>> arr stripString

    parseDescription :: IOSArrow XmlTree String
    parseDescription =
      ( deep (isElem >>> hasName "div" >>> hasAttrValue "id" (== "company_about_full_description"))
        >>> (arr (\x -> x) /> isElem >>> hasName "p") >. (!! 0)
        >>> tagTextS
      ) `orElse` (constA "")

Look at this arr (\x -> x) – without it I wasn
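One possible way to express "first <p> child" without the identity arrow is sketched below. This is not the original answer, only an illustration: it uses `>>. take 1`, which keeps at most one result, in place of `>. (!! 0)`. The name `firstParagraphText` and the sample input are hypothetical, the `stripString` step from the question is left out, and the arrow is run purely with `runLA` and `xread` so it can be tried without IO.

```haskell
import Text.XML.HXT.Core

-- Sketch only: text of the first <p> child of the target <div>.
-- `>>. take 1` truncates the result list, so an empty list stays empty
-- and the `orElse` fallback can take over.
firstParagraphText :: LA XmlTree String
firstParagraphText =
  ( deep (isElem >>> hasName "div" >>> hasAttrValue "id" (== "company_about_full_description"))
      >>> ((getChildren >>> isElem >>> hasName "p") >>. take 1)
      >>> getChildren
      >>> getText
  ) `orElse` constA ""

main :: IO ()
main = print (runLA (xread >>> firstParagraphText) sample)
  where
    -- Hypothetical sample input, not taken from the question.
    sample = "<div id=\"company_about_full_description\"><p>first</p><p>second</p></div>"
```

Note that `take 1` is applied per matching <div>, so a document with several matching divs would yield one string for each of them.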

Transform nodes with HXT using the number of <section> ancestor nodes

拟墨画扇 posted on 2020-01-04 13:28:14
Question: I'm looking to replace all title elements with h1, h2, ..., h6 elements depending on how many ancestors are section elements. Example input/output:

Input.xml

    <document>
      <section>
        <title>Title A</title>
        <section>
          <title>Title B</title>
        </section>
        <section>
          <title>Title C</title>
          <section>
            <title>Title D</title>
          </section>
        </section>
      </section>
    </document>

Output.xml

    <document>
      <section>
        <h1>Title A</h1>
        <section>
          <h2>Title B</h2>
        </section>
        <section>
          <h2>Title C</h2>
          <section>
            <h3>Title D<
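A minimal sketch of one way to approach this, assuming the nesting depth can simply be threaded through a recursive arrow as an ordinary Haskell argument. The function name `renameTitles` and the clamping of the level to the range 1 to 6 are assumptions made for illustration; they do not come from the question.

```haskell
import Text.XML.HXT.Core

-- Sketch only: `depth` counts the enclosing <section> elements.
-- Each child is handled by `step`:
--   * a <section> is processed with depth + 1,
--   * a <title> is renamed to h<level>,
--   * anything else is descended into with the depth unchanged.
renameTitles :: Int -> LA XmlTree XmlTree
renameTitles depth = processChildren step
  where
    -- Assumption: clamp to h1..h6 so very deep or top-level titles stay valid.
    level = max 1 (min 6 depth)
    step =
      ifA (hasName "section") (renameTitles (depth + 1)) $
        ifA (hasName "title") (setElemName (mkName ("h" ++ show level))) $
          renameTitles depth

main :: IO ()
main = mapM_ putStrLn (runLA (xread >>> renameTitles 0 >>> xshow this) input)
  where
    -- Shortened, hypothetical input in the shape of Input.xml.
    input = "<document><section><title>A</title><section><title>B</title></section></section></document>"
```

Passing the depth as a plain argument avoids having to count ancestors from inside the tree, which HXT's `XmlTree` does not expose directly because nodes carry no parent links.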

Transform nodes with HXT using the number of <section> ancestor nodes

久未见 posted on 2020-01-04 13:27:58
Question: I'm looking to replace all title elements with h1, h2, ..., h6 elements depending on how many ancestors are section elements. Example input/output:

Input.xml

    <document>
      <section>
        <title>Title A</title>
        <section>
          <title>Title B</title>
        </section>
        <section>
          <title>Title C</title>
          <section>
            <title>Title D</title>
          </section>
        </section>
      </section>
    </document>

Output.xml

    <document>
      <section>
        <h1>Title A</h1>
        <section>
          <h2>Title B</h2>
        </section>
        <section>
          <h2>Title C</h2>
          <section>
            <h3>Title D<

HXT: Select a node by position with HXT in Haskell?

喜夏-厌秋 posted on 2019-12-23 13:32:30
Question: I’m trying to parse some XML files with Haskell. For this job I’m using HXT, to get some experience with arrows in real-world applications, so I’m quite new to the arrow topic. In XPath (and HaXml) it’s possible to select a node by position, for example:

    /root/a[2]/b

I can’t figure out how to do something like that with HXT, even after reading the documentation again and again. Here is some sample code I’m working with:

    module Main where

    import Text.XML.HXT.Core

    testXml :: String
    testXml =
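Positional selection can be sketched with the list-level operator `>>.`, which applies an ordinary list function to all results of an arrow. The helper `nth`, the arrow `secondAsBs`, and the sample document below are hypothetical, since the question's own `testXml` is cut off; treat this as an illustration rather than the accepted answer.

```haskell
import Text.XML.HXT.Core

-- Hypothetical helper: keep only the n-th (1-based) result of an arrow,
-- or nothing at all when there are fewer than n results.
nth :: Int -> LA b c -> LA b c
nth n f = f >>. (take 1 . drop (n - 1))

-- Rough counterpart of the XPath /root/a[2]/b, applied to the <root> element:
-- pick the second <a> child, then the text of all its <b> children.
secondAsBs :: LA XmlTree String
secondAsBs =
  nth 2 (getChildren >>> hasName "a")
    >>> getChildren >>> hasName "b"
    >>> getChildren >>> getText

main :: IO ()
main = mapM_ putStrLn (runLA (xread >>> secondAsBs) sample)
  where
    -- Hypothetical sample input.
    sample = "<root><a><b>one</b></a><a><b>two</b><b>three</b></a></root>"
```

The position is counted among the results of the wrapped arrow only, so `nth 2` has to enclose exactly the step whose results are being indexed.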

Extracting Values from a Subtree

╄→尐↘猪︶ㄣ posted on 2019-12-21 17:31:15
Question: I am parsing an XML file with HXT and I am trying to break up some of the node extraction into modular pieces (I have been using this as my guide). Unfortunately, I cannot figure out how to apply some of the selectors once I have done the first-level parsing.

    import Text.XML.HXT.Core

    let node tag = multi (hasName tag)
    xml <- readFile "test.xml"
    let doc = readString [withValidate yes, withParseHTML no, withWarnings no] xml
    books <- runX $ doc >>> node "book"

I see that books has a type [XmlTree] :t

HXT: Surprising behavior when reading and writing HTML to String in pure code

纵饮孤独 posted on 2019-12-19 07:55:51
Question: I want to read HTML from a String, process it, and return the changed document as a String using HXT. As this operation does not require IO, I would rather execute the arrow with runLA than with runX. The code looks like this (omitting the processing for simplicity):

    runLA (hread >>> writeDocumentToString [withOutputHTML, withIndent yes]) html

However, the surrounding html tag is missing in the result:

    ["\n <head>\n <title>Bogus</title>\n </head>\n <body>\n Some trivial bogus text.\n </body>\n

Haskell HXT for extracting a list of values

烂漫一生 posted on 2019-12-19 03:24:05
Question: I'm trying to figure my way through HXT with XPath and arrows at the same time, and I'm completely stuck on how to think through this problem. I've got the following HTML:

    <div>
      <div class="c1">a</div>
      <div class="c2">b</div>
      <div class="c3">123</div>
      <div class="c4">234</div>
    </div>

which I've extracted into an HXT XmlTree. What I'd like to do is define a function (I think?):

    getValues :: [String] -> IOSArrow XmlTree [(String, String)]

which, if used as getValues ["c1", "c2", "c3", "c4"],
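The question is cut off before it states the desired result, so the sketch below rests on an assumption: that each requested class name should be paired with the text of the child <div> carrying that class. It uses the pure `LA` arrow instead of `IOSArrow` so that it can be run with `runLA`; `catA` combines one selector per class and `listA` gathers all results into a single list.

```haskell
import Text.XML.HXT.Core

-- Sketch under the assumption stated above: the input is the outer <div>,
-- and the result is one list of (class, text) pairs.
getValues :: [String] -> LA XmlTree [(String, String)]
getValues classes = listA (catA (map valueFor classes))
  where
    -- Selector for a single class: find the matching child <div>
    -- and pair the class name with its text content.
    valueFor cls =
      getChildren
        >>> hasName "div"
        >>> hasAttrValue "class" (== cls)
        >>> (constA cls &&& (getChildren >>> getText))

main :: IO ()
main = print (runLA (xread >>> getValues ["c1", "c2", "c3", "c4"]) sample)
  where
    sample =
      "<div><div class=\"c1\">a</div><div class=\"c2\">b</div>"
        ++ "<div class=\"c3\">123</div><div class=\"c4\">234</div></div>"
```

A class with no matching <div> simply contributes no pair, so the result list can be shorter than the list of requested classes.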

Is factoring an arrow out of arrow do notation a valid transformation?

巧了我就是萌 posted on 2019-12-14 03:57:56
Question: I'm trying to get my head around HXT, a Haskell library for parsing XML that uses arrows. For my specific use case I'd rather not use deep, as there are cases where

    <outer_tag><payload_tag>value</payload_tag></outer_tag>

is distinct from

    <outer_tag><inner_tag><payload_tag>value</payload_tag></inner_tag></outer_tag>

but I ran into some weirdness: something that felt like it should work doesn't. I've managed to come up with a test case based on this example from the docs:

    {-# LANGUAGE Arrows,

How do i output XMLTrees in HXT?

元气小坏坏 posted on 2019-12-12 10:02:23
Question: I am trying to extract tags from an XML file and write each one to a separate file based on an attribute. The extraction part isn't that hard:

    *Main> ifs <- runX ( readDocument [withCurl [],withExpat yes] "file.xml" >>> getElement "TagName" >>> getAttrValue "Name" &&& this)
    *Main> :t ifs
    ifs :: [(String, XmlTree)]

I tried to map writeDocument over the second entries but had no success. I understand that I have to get it back into the IO monad somehow, but I have no idea how to achieve this.

HXT: Can an input change with the arrow syntax?

醉酒当歌 posted on 2019-12-11 09:24:50
Question: With the following code

    {-# LANGUAGE Arrows #-}
    {-# LANGUAGE NoMonomorphismRestriction #-}

    import Text.XML.HXT.Core

    parseXml :: IOSArrow XmlTree XmlTree
    parseXml = getChildren >>> getChildren >>> proc x -> do
      y <- x >- hasName "item"
      returnA -< x

    main :: IO ()
    main = do
      person <- runX (readString [withValidate no] "<xml><item>John</item><item2>Smith</item2></xml>" >>> parseXml)
      putStrLn $ show person
      return ()

I get the output

    [NTree (XTag "item" []) [NTree (XText "John") []]]

So it seems