XSLT: CSV (or Flat File, or Plain Text) to XML

Posted by 痴心易碎 on 2020-01-13 19:00:48

Question


I am trying to convert plain text files to XML files using XSLT. I started with CSV files, because that is a well-known file format that I could start Googling examples on.

I stumbled onto this: http://ajwelch.blogspot.com/2007/02/csv-to-xml-converter-in-xslt-20.html, which also points at http://andrewjwelch.com/code/xslt/csv/csv-to-xml_v2.html.

Those links contain what is, supposedly, an XSLT (2.0) that can take a CSV file and convert it to an XML file.

...Except it doesn't actually work.

I set it up in my Maven Eclipse project, downloaded the latest Saxon dependency (9.4 HE) and tried to use it. I was met with this error:

Error on line 1 column 1 of csv.csv:

SXXP0003: Error reported by XML parser: Content is not allowed in prolog.

That seems to indicate to me that when it began parsing the file, it hit the first character, found it wasn't a < character, exclaimed to itself "This isn't an XML file! Double-yew tee eff, mate!" and blew up. Which kind of runs contrary to the idea that this XSLT is supposed to work on files that are not XML (namely, CSV files). Forcing you to wrap your non-XML in an XML tag to make it work completely defeats the purpose.

At first I thought maybe the problem was that I wasn't using the Saxon jar directly on the command line like the example. So I did just that. The result was something quite familiar:

Error on line 1 column 1 of csv.csv:

SXXP0003: Error reported by XML parser: Content is not allowed in prolog.

I thought that perhaps since I was using a newer version, I needed to go back and use the version that the example was originally written under. So I went back to SaxonB 9.1.0.8 and tried it both in Eclipse and on the command line. Care to guess what happened?

Error on line 1 column 1 of csv.csv:

SXXP0003: Error reported by XML parser: Content is not allowed in prolog.

I discovered that if I wrap the entire contents of the CSV file in a dummy xml tag (e.g. <whatever>item1,item2,item3</whatever>) it starts to almost work (it at least makes it past the first character and I start to get a different error farther along in the process).

So why the hell doesn't this XSLT work? Why does the blog it's posted on (and all of the attendant comments in the attached comment section) seem to indicate that it does? I also found it referenced here in the Ubuntu help documentation, and as the accepted answer on this StackOverflow question. How is that possible? It doesn't work!

So either everyone on the entire Internet is lying to each other and/or themselves in a giant conspiracy designed to enrage me, or there is some very simple, integral step I am just missing that is required to make Saxon use that XSLT to convert a CSV file to an XML file.

So, anybody know which it is?

Edit: pgfearo's answer accepted. The original contents of this "Edit" section are now a separate question here: Saxon in Java: XSLT for CSV to XML

Edit 2: If anyone is curious as to what my XSLT ended up looking like, that ended up in a different question here: XSLT remove() function


Answer 1:


I don't think it's a conspiracy. You haven't included the Saxon command line you used, but I suspect you're passing csv.csv as the source document of the transform. Saxon hands the source document to an XML parser, and because csv.csv isn't an XML file, you get an XML parser error such as the one you've shown.
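To see the same failure outside Saxon, here is a minimal sketch in Python (an analogy only, not what Saxon does internally). It feeds the raw CSV line to an XML parser, then feeds it again wrapped in the dummy <whatever> element from the question. The helper name parses_as_xml is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# One CSV line, as in the question's example.
csv_text = "item1,item2,item3"


def parses_as_xml(text: str) -> bool:
    """Return True if an XML parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser rejects anything that does not start like an XML document.
        return False


# Raw CSV is rejected at the very first character.
print(parses_as_xml(csv_text))
# Wrapped in a dummy element, the same text is well-formed XML.
print(parses_as_xml(f"<whatever>{csv_text}</whatever>"))
```

Any XML parser should behave this way, which is why switching Saxon versions made no difference.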

The XSLT stylesheet you reference has an entry template called 'main'. Use the -it option on the command line to set 'main' as the initial template. With this set, you no longer need to supply a source document for the transform. The Saxon command line options are documented here.
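The idea behind starting from a named template is that the CSV is read as plain text and the XML is built from it, so no XML parser ever touches the input (in XSLT 2.0 the standard way to read a text file is the unparsed-text() function, which is presumably what the stylesheet relies on). A rough Python sketch of that flow follows. It is not the stylesheet's logic, and the function name csv_to_xml and the element names root, row and elem are assumptions chosen for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text: str) -> str:
    """Build an XML string from CSV text that was read as plain text.

    Element names (root, row, elem) are hypothetical, chosen for this sketch.
    """
    root = ET.Element("root")
    # csv.reader splits each line into fields and handles quoted values.
    for fields in csv.reader(io.StringIO(csv_text)):
        row = ET.SubElement(root, "row")
        for value in fields:
            ET.SubElement(row, "elem").text = value
    return ET.tostring(root, encoding="unicode")


print(csv_to_xml("item1,item2,item3\n"))
# prints <root><row><elem>item1</elem><elem>item2</elem><elem>item3</elem></row></root>
```

The point is only that the input is treated as text from the start, which is what makes the dummy wrapper element unnecessary.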



Source: https://stackoverflow.com/questions/10655762/xslt-csv-or-flat-file-or-plain-text-to-xml
