sax

Using SAX with JAXBContext

泄露秘密 submitted on 2019-12-06 12:45:56
Question: I am trying to deserialize XML data into newly created Java content trees. I am using SAX, have the Java class under src\main\java\Summaries.java, and am trying to simply print the extracted document: String xmlPath = "C:\\workspace-sts-2.8.0.RELEASE\\RESTClient\\src\\main\\resources\\dailySummary.xml"; String xmlPath2 = "/dailySummary.xml"; String xmlPath3 = "/src/main/resources/dailySummary.xml"; InputStream inputStream = null; InputSource inputSource = null; Unmarshaller unmarshaller = null;

Obtain InputStream from XML element content

♀尐吖头ヾ submitted on 2019-12-06 11:58:10
Question: My servlet's doPost() receives an HttpServletRequest whose ServletInputStream sends me a large chunk of uuencoded data wrapped in XML. E.g., there is an element: <filedata encoding="base64">largeChunkEncodedHere</filedata> I need to decode the chunk and write it to a file. I would like to get an InputStream from the chunk, decode it as a stream using MimeUtility, and use that stream to write the file---I would prefer not to read this large chunk into memory. The XML is flat; i.e., there is

JAVA SAX DefaultHandler startCDATA() not firing

一个人想着一个人 submitted on 2019-12-06 11:07:25
I am trying to parse and detect the start of the CDATA within a tag like: <child><![CDATA[data goes here]]></child> I have a class that extends DefaultHandler: class MyXmlDoc extends DefaultHandler{ with methods for startElement() and endElement() that fire correctly, but startCDATA() never fires. My characters() method picks up the 'data goes here', so it appears that the CDATA 'wrapper' is detected but ??? Thanks for any insight! CDATA is a lexical event. Regular handlers (content handler, error handler) do not process these events. You need to set a lexical handler for your reader, if
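The advice about setting a lexical handler can be sketched as follows. This is a minimal illustration, not the original poster's code: the class names CdataDemo and RecordingHandler and the events list are hypothetical, and it assumes the JDK's built-in SAX parser, which accepts the standard http://xml.org/sax/properties/lexical-handler property. DefaultHandler2 is used because it implements both ContentHandler and LexicalHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class CdataDemo {
    // Hypothetical handler: records the callbacks it receives, in order.
    static class RecordingHandler extends DefaultHandler2 {
        final List<String> events = new ArrayList<>();

        @Override
        public void startCDATA() {
            events.add("startCDATA");
        }

        @Override
        public void endCDATA() {
            events.add("endCDATA");
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.add("characters:" + new String(ch, start, length));
        }
    }

    // Parses the given XML string and returns the recorded events.
    static List<String> parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            RecordingHandler handler = new RecordingHandler();
            reader.setContentHandler(handler);
            // Without this property the reader never calls startCDATA()/endCDATA().
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.events;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("<child><![CDATA[data goes here]]></child>"));
    }
}
```

The same handler object is registered twice, once as the content handler and once as the lexical handler, so element, text, and CDATA boundary events all arrive in one place.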

Parsing XML with Python xml.sax: How does one “keep track” of where in the tree you are?

随声附和 submitted on 2019-12-06 09:20:12
问题 I need to regularly export XML files from our Administration software. This is the first time I'm using XML Parsing in Python. The XML with xml.sax isn't terribly difficult, but what is the best way to "keep track" of where in the XML tree you are? For example, I have a list of our customers. I want to extract the Telephone by , but there are multiple places where occurs: eExact -> Accounts -> Account -> Contacts -> Contact -> Addresses -> Address -> Phone eExact -> Accounts -> Account ->

iOS: Combining SAX and DOM parsing

僤鯓⒐⒋嵵緔 submitted on 2019-12-06 08:12:43
I am currently working on an iPad project for which I need to process large XML files into an SQLite backend. I currently have this working using the TBXML parser. So all the logic is in place, and in general the TBXML parser does the job it needs to do. The only problem I'm now encountering is that the XML files are getting too big and I am running out of memory. Because of this I am thinking of switching to a SAX parser like the core NSXMLParser or something like Alan Quatermain's AQXMLParser. However, this will require me to redo all of my current logic that to some extent relies on functions

StAX - reading base64 string from xml into db

二次信任 submitted on 2019-12-06 07:42:04
I'm using StAX to read my file, which has some Base64 data in it, and saving it into the db using Hibernate. XML: <root> <base64>lololencoded12</base64> <base64>encodedlolos32</base64> ............................... </root> Code to read and save: xmlif = (XMLInputFactory2) XMLInputFactory2.newInstance(); xmlif.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE); xmlif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE); xmlif.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE); xmlif.configureForLowMemUsage(); List<Entity> entities = new

How do I call DaisyDiff to compare two HTML files?

青春壹個敷衍的年華 submitted on 2019-12-06 06:55:44
问题 I need to create a diff between two HTML documents in my app. I found a library called DaisyDiff that can do it. It has an API that looks like this: /** * Diffs two html files, outputting the result to the specified consumer. */ public static void diffHTML(InputSource oldSource, InputSource newSource, ContentHandler consumer, String prefix, Locale locale) throws SAXException, IOException I know absolutely nothing about SAX and I can't figure out what to pass as the third argument. After

How to return data from a Python SAX parser?

强颜欢笑 submitted on 2019-12-06 03:31:20
Question: I've been trying to parse some huge XML files that LXML won't grok, so I'm forced to parse them with xml.sax. class SpamExtractor(sax.ContentHandler): def startElement(self, name, attrs): if name == "spam": print("We found a spam!") # now what? The problem is that I don't understand how to actually return, or better, yield, the things that this handler finds to the caller, without waiting for the entire file to be parsed. So far, I've been messing around with threading.Thread and Queue

SAX XML Java Entities problem

旧城冷巷雨未停 submitted on 2019-12-06 01:36:37
I have a problem with SAX and Java. I'm parsing the dblp digital library database XML file (which enumerates journals, conferences, and papers). The XML file is very large (> 700MB). However, my problem is that when the callback characters() returns, if the retrieved string contains several entities, the method only returns the string starting from the last entity character found. For example: "Rüdiger Mecke" is the original author name held between <author> tags; "üdiger Mecke" is the result (the String returned from the characters(ch[], start, length) method). I would like to know: how to prevent the parser to
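A common cause of this symptom is that a SAX parser may deliver one run of text through several characters() calls, often splitting at entity or character references, so keeping only the last chunk loses the beginning. A minimal sketch of the usual workaround, buffering every chunk until endElement(), is shown below. It makes some assumptions: the class name AuthorCollector and the <dblp> sample are hypothetical, and a numeric character reference (&#252;) stands in for the named entities that the real dblp file declares in its DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AuthorCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    final List<String> authors = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Start a fresh buffer for each <author> element.
        if ("author".equals(qName)) {
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element: append, never overwrite.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the element's text known to be complete.
        if ("author".equals(qName)) {
            authors.add(text.toString());
        }
    }

    // Parses the given XML string and returns the collected author names.
    static List<String> parse(String xml) {
        try {
            AuthorCollector handler = new AuthorCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.authors;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("<dblp><author>R&#252;diger Mecke</author></dblp>"));
    }
}
```

Because the buffer is only read in endElement(), the result does not depend on how the parser chooses to chunk the text.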

Parsing html with SAX parser

Deadly submitted on 2019-12-05 23:16:34
Question: I am trying to parse a normal HTML file using a SAX parser. SAXBuilder builder2 = new SAXBuilder(); try { Document sdoc = (Document)builder2.build(readFile); NodeList nl=sdoc.getElementsByTagName("body"); System.out.println("nodelist>>>>>>>>>>>"+nl.getLength()); } catch (JDOMException e1) { e1.printStackTrace(); } But I am getting the exception: Open quote is expected for attribute "{1}" associated with an element type "class". Can anyone please tell me why I am getting this exception, the