sax | 易学教程

Obtaining DOCTYPE details using SAX (JDK 7)

Question: I'm using the SAX parser that comes with JDK 7. I'm trying to get hold of the DOCTYPE declaration, but none of the methods in DefaultHandler seem to be fired for it. What am I missing? import java.io.StringReader; import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import org.xml.sax.Attributes; import org.xml.sax.InputSource; import org.xml.sax.SAXException; import org.xml.sax.helpers.DefaultHandler; public class Problem { public static void main(String[] args)

Lazy SAX XML parser with stop/resume

Question: I am pretty sure the answer is no, but of course there are cleverer guys than me! Is there a way to construct a lazy SAX-based XML parser that can be stopped (e.g. raising an exception is one possible way of doing this) but is also resumable? I am looking for a possible solution for Python >= 2.6 with the standard XML libraries. The "lazy" part is trivial: I am really after the "resumable" property here. Answer 1: Expat can be stopped and is resumable. AFAIK the Python SAX parser uses Expat. Does the API
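One way to approximate the "resumable" property with only the standard library is to drive the incremental interface of the SAX reader yourself: the parser only advances when another chunk is fed to it, so the caller can pause between chunks for as long as it likes. The sketch below is written for Python 3 and sidesteps Expat's own stop/resume calls; the names `QueueingHandler` and `lazy_element_names` are hypothetical, and the pause granularity is one chunk, not one event.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class QueueingHandler(ContentHandler):
    """Buffers element names so the caller can drain them between chunks."""

    def __init__(self):
        super().__init__()
        self.pending = []

    def startElement(self, name, attrs):
        self.pending.append(name)


def lazy_element_names(chunks):
    """Yield element names; parsing advances only when the caller asks for more."""
    parser = xml.sax.make_parser()  # the default reader is Expat-based and incremental
    handler = QueueingHandler()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)  # parse just this chunk, then hand control back
        while handler.pending:
            yield handler.pending.pop(0)
    parser.close()  # signal end of input and flush anything still buffered
    while handler.pending:
        yield handler.pending.pop(0)


# Usage: the generator can be left idle and resumed later with next().
names = list(lazy_element_names(["<root><a/>", "<b/></root>"]))
```

Because the function is a generator, abandoning it stops the parse and calling `next()` again resumes it, provided the chunk source (a file, a socket) is itself still available.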

Parsing XML with SAX/Python + no validation

Question: I am new to Python and I'm trying to parse an XML file with SAX without validating it. The head of my XML file is: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE n:document SYSTEM "schema.dtd"> <n:document.... and I've tried to parse it with Python 2.5.2: from xml.sax import make_parser, handler import sys parser = make_parser() parser.setFeature(handler.feature_namespaces,True) parser.setFeature(handler.feature_validation,False) parser.setContentHandler(handler.ContentHandler()) parser
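Turning validation off does not by itself tell the reader to leave the external DTD alone. A commonly suggested setting for this situation is `feature_external_ges`; the sketch below assumes the default Expat-based reader, where (as I understand it) disabling that feature also keeps the parser from trying to open `schema.dtd`. The sample document, the `urn:example` namespace, and the names `NameCollector` and `parse_without_external_dtd` are made up for illustration, and the code targets Python 3 instead of 2.5.

```python
import io
from xml.sax import make_parser, handler


class NameCollector(handler.ContentHandler):
    """Records the (namespace URI, local name) pair of every element."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With feature_namespaces on, name is a (uri, localname) tuple.
        self.names.append(name)


def parse_without_external_dtd(xml_bytes):
    parser = make_parser()
    parser.setFeature(handler.feature_namespaces, True)
    parser.setFeature(handler.feature_validation, False)
    # Assumption: with the Expat reader this stops external entities,
    # including the external DTD subset, from being fetched.
    parser.setFeature(handler.feature_external_ges, False)
    collector = NameCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return collector.names


# Hypothetical document: schema.dtd does not need to exist on disk.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE n:document SYSTEM "schema.dtd">\n'
    b'<n:document xmlns:n="urn:example"><n:item/></n:document>'
)
element_names = parse_without_external_dtd(SAMPLE)
```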

Android : Cdata in xml not parsing correctly(sax)

Question: I'm new to Android development, and I'm trying to create an app that pulls XML from a web service. The rest of the XML comes through fine, apart from the parts inside the CDATA section: instead of a large amount of HTML text, I only get " <p> ". Could someone point me in the right direction as to why it's not being pulled in properly? Here is my XML: <articles> <article> <name>AndroidPeople</name> <headline>notandroidpeople</headline> <website category="android">www.androidpeople.com</website>

Batik with grails giving sax clash

Question: I'm trying to use Batik with Grails to render some SVG to PNG on the server. I get the following error in IntelliJ when I add the dependencies to BuildConfig and then tell IntelliJ to load the changes: /Library/Java/JavaVirtualMachines/1.6.0_33-b03-424.jdk/Contents/Home/bin/java -Dgrails.home=/Applications/Dev/grails-2.1.0 -Dbase.dir=/Users/greg/Documents/development/git/liftyourgame-grails/webapp -Dtools.jar=/Library/Java/JavaVirtualMachines/1.6.0_33-b03-424.jdk/Contents/Home/lib

python reporting line/column of origin of XML node

Question: I'm currently using xml.dom.minidom to parse some XML in Python. After parsing, I'm doing some reporting on the content, and would like to report the line (and column) where the tag started in the source XML document, but I don't see how that's possible. I'd like to stick with xml.dom / xml.dom.minidom if possible, but if I need to use a SAX parser to get the origin info, I can do that -- ideal in that case would be using SAX to track node location, but still end up with a DOM for my post
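For the SAX half of this idea, the reader passes a locator to the content handler before parsing starts, and the handler can query it inside each callback. The following is a minimal sketch of that mechanism only; it records positions in a list and does not attach them to minidom nodes. The names `PositionRecorder` and `element_positions` are hypothetical. With the Expat-based reader, lines are counted from 1 and columns from 0.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PositionRecorder(ContentHandler):
    """Records (tag name, line, column) for every start tag."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.positions = []

    def setDocumentLocator(self, locator):
        # Called once by the reader before any parsing events are delivered.
        self.locator = locator

    def startElement(self, name, attrs):
        self.positions.append(
            (name, self.locator.getLineNumber(), self.locator.getColumnNumber())
        )


def element_positions(xml_bytes):
    recorder = PositionRecorder()
    xml.sax.parseString(xml_bytes, recorder)
    return recorder.positions


positions = element_positions(b"<root>\n  <child/>\n</root>")
```

Getting from here to a DOM that carries the same information would mean stamping the position onto each node as it is built, which this sketch leaves out.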

Android REST XML result to Listview

Question: I have a REST web service that returns an XML result like this: - <MyCategories xmlns="http://schemas.datacontract.org/2004/07/ceva" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"> - <Category> <CategoryName>First category</CategoryName> <Id>1</Id> </Category> - <Category> <CategoryName>Second category</CategoryName> <Id>2</Id> </Category> - <Category> <CategoryName>Third category</CategoryName> <Id>3</Id> </Category> </MyCategories> I access the web service like this: HttpClient

Java: splitting up a large XML file with SAXParser

Question: I am trying to split a large XML file into smaller files using Java's SAXParser (specifically the Wikipedia dump, which is about 28 GB uncompressed). I have a PageHandler class which extends DefaultHandler: private class PageHandler extends DefaultHandler { private StringBuffer text; ... @Override public void startElement(String uri, String localName, String qName, Attributes attributes) { text.append("<" + qName + ">"); } @Override public void endElement(String uri, String localName, String

Using a template with OpenXML and SAX

Question: I'm creating a large XLSX file from a datatable, using the SAX method proposed in Parsing and Reading Large Excel Files with the Open XML SDK. I'm using an XLSX file as a template. The method described in that post works fine for substituting a new sheet for an existing one, but I want to copy the header row from the sheet in the template (string values, formatting, etc.), instead of just using the header row from the datatable as the original code does. I've tried the code below, but the XLSX

Is there a fast XML parser in Python that allows me to get start of tag as byte offset in stream?

Question: I am working with potentially huge XML files containing complex trace information from one of my projects. I would like to build indexes for those XML files so that one can quickly find subsections of the XML document without having to load it all into memory. If I have created a "shelve" index that contains information like "books for author Joe" are at offsets [22322, 35446, 54545], then I can just open the XML file like a regular text file, seek to those offsets, and then hand that
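One standard-library option that may fit this kind of index is to use the low-level `xml.parsers.expat` module directly, since its parser objects expose a `CurrentByteIndex` attribute that can be read inside a handler. The sketch below shows only the offset collection; the function name `element_offsets` and the sample data are hypothetical, and building or persisting the "shelve" index is left out.

```python
import xml.parsers.expat


def element_offsets(xml_bytes, tag):
    """Return the byte offset of each start tag named `tag`."""
    offsets = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        if name == tag:
            # Byte index of the event currently being reported.
            offsets.append(parser.CurrentByteIndex)

    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return offsets


# Hypothetical data: the recorded offsets can be used to slice or seek.
DATA = b"<lib><book/><book/></lib>"
book_offsets = element_offsets(DATA, "book")
```

For a file too large to hold in memory, the same handler should work when the data is fed in pieces through repeated `Parse(chunk, False)` calls, with the offsets counted from the start of the whole input.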