sax

How to ignore inline DTD when parsing XML file in Java

Submitted by 。_饼干妹妹 on 2019-12-05 13:29:33
I have a problem reading an XML file that contains an inline DTD declaration (the external-declaration case is already solved). I'm using the SAX API (javax.xml.parsers.SAXParser). When there is no DTD, the event sequence looks like, for example, StartElement-Characters-StartElement-Characters-EndElement-Characters... so the characters method is called immediately after every start or end element, which is how I need it to be. When the file contains a DTD, the sequence changes to, for example, StartElement-StartElement-StartElement-Characters-EndElement-EndElement-EndElement. I need a characters call after every element. So I'm asking is
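One possible explanation, offered as an assumption because the excerpt is cut off: once a DTD declares element-only content, a SAX parser may report the whitespace between child elements through ignorableWhitespace() instead of characters(). The sketch below forwards one callback to the other so the event sequence stays the same with or without a DTD. The class name, the trace letters and the sample document are illustrative only.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class WhitespaceForwardingDemo {

    /** Records S (start), C (characters), E (end); adjacent C events are merged. */
    static class TraceHandler extends DefaultHandler {
        final StringBuilder trace = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            trace.append('S');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            trace.append('E');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may split one text run into several calls, so merge them.
            int n = trace.length();
            if (n == 0 || trace.charAt(n - 1) != 'C') {
                trace.append('C');
            }
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            // Treat whitespace that the DTD classifies as ignorable like ordinary text.
            characters(ch, start, length);
        }
    }

    /** Parses the given XML text and returns the merged event trace. */
    static String trace(String xml) throws Exception {
        TraceHandler handler = new TraceHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.trace.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document with an inline DTD.
        String xml = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b (#PCDATA)>]>"
                + "<a> <b>x</b> </a>";
        System.out.println(trace(xml));
    }
}
```

Forwarding keeps the handler logic in one place. If the whitespace should be dropped instead, leaving ignorableWhitespace() empty is the other obvious choice.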

Can't read some attributes with SAX

Submitted by 半城伤御伤魂 on 2019-12-05 13:19:34
I'm trying to parse this document with SAX: <scxml version="1.0" initialstate="start" name="calc"> <datamodel> <data id="expr" expr="0" /> <data id="res" expr="0" /> </datamodel> <state id="start"> <transition event="OPER" target="opEntered" /> <transition event="DIGIT" target="operand" /> </state> <state id="operand"> <transition event="OPER" target="opEntered" /> <transition event="DIGIT" /> </state> </scxml> I can read all the attributes except "initialstate" and "name". I get the attributes in the startElement handler, but the attribute list for scxml has size zero. Why? How I
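The excerpt stops before the asker's handler code, so the cause cannot be determined from it. As a baseline for comparison, here is a minimal sketch of reading the attributes of a named element inside startElement with the JDK's default SAX parser. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ElementAttributesDemo {

    /** Returns the attributes (qualified name to value) of elements named elementName. */
    static Map<String, String> attributesOf(String xml, String elementName) throws Exception {
        Map<String, String> found = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (!qName.equals(elementName)) {
                    return;
                }
                // The Attributes object is only valid during this callback, so copy it.
                for (int i = 0; i < atts.getLength(); i++) {
                    found.put(atts.getQName(i), atts.getValue(i));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<scxml version=\"1.0\" initialstate=\"start\" name=\"calc\">"
                + "<state id=\"start\"/></scxml>";
        System.out.println(attributesOf(xml, "scxml"));
    }
}
```

If a handler like this sees the root attributes while the original code does not, one thing worth checking is whether the original keeps a reference to the Attributes object past the callback, because parsers are allowed to reuse it.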

Using SAX (Java) to parse multiple XML messages from a single TCP-stream

Submitted by 五迷三道 on 2019-12-05 07:49:07
I use Java to connect to a TCP port and am streamed XML documents one after another, each beginning with the <?xml declaration. An example which demonstrates the format: <?xml version="1.0"?> <person> <name>Fred Bloggs</name> </person> <?xml version="1.0"?> <person> <name>Peter Jones</name> </person> I'm using the org.xml.sax.* API. The SAX parsing works perfectly for the first document but throws an exception when it comes across the start of the second document: Exception in thread "main" org.xml.sax.SAXParseException: The processing instruction target
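An XML parser expects exactly one document per input, so a second <?xml declaration in the same stream is an error. One workaround, sketched here under the assumption that the text "<?xml" never appears inside document content (for example in a CDATA section), is to cut the stream at each declaration and parse the pieces separately. The class name and the choice to collect <name> text are illustrative.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ConcatenatedXmlDemo {
    private static final String MARKER = "<?xml";

    /** Splits the stream at each XML declaration and collects the text of every name element. */
    static List<String> parseAll(Reader in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("name")) {
                    names.add(text.toString());
                }
            }
        };

        StringBuilder doc = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            doc.append((char) c);
            int cut = doc.length() - MARKER.length();
            // A marker at the very end of the buffer starts the next document.
            if (cut > 0 && doc.indexOf(MARKER, cut) == cut) {
                parseOne(factory, handler, doc.substring(0, cut));
                doc.delete(0, cut);
            }
        }
        parseOne(factory, handler, doc.toString()); // last document, at end of stream
        return names;
    }

    private static void parseOne(SAXParserFactory factory, DefaultHandler handler, String xml)
            throws Exception {
        String trimmed = xml.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        factory.newSAXParser().parse(new InputSource(new StringReader(trimmed)), handler);
    }

    public static void main(String[] args) throws Exception {
        String stream = "<?xml version=\"1.0\"?> <person> <name>Fred Bloggs</name> </person> "
                + "<?xml version=\"1.0\"?> <person> <name>Peter Jones</name> </person>";
        System.out.println(parseAll(new StringReader(stream)));
    }
}
```

A limitation of this sketch: a document is only parsed once the next declaration arrives or the stream closes, so on a live socket the latest message is delayed. Detecting the closing root tag instead would avoid that, at the cost of more bookkeeping.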

How can I process xml asynchronously in python?

Submitted by 余生长醉 on 2019-12-05 06:04:35
I have a large XML data file (>160M) to process, and it seems like SAX/expat/pulldom parsing is the way to go. I'd like to have a thread that sifts through the nodes and pushes nodes to be processed onto a queue, and then other worker threads pull the next available node off the queue and process it. I have the following (it should have locks, I know; it will, later):

    import sys, time
    import xml.parsers.expat
    import threading

    q = []

    def start_handler(name, attrs):
        q.append(name)

    def do_expat():
        p = xml.parsers.expat.ParserCreate()
        p.StartElementHandler = start_handler
        p.buffer_text = True

Java SAX Parser raises UnknownHostException

Submitted by 独自空忆成欢 on 2019-12-05 05:36:04
The XML file I want to parse starts with: <!DOCTYPE plist PUBLIC "-//...//DTD PLIST 1.0//EN" "http://www.....dtd"> So when I start the SAX parser, it tries to access this DTD online, and I get a java.net.UnknownHostException. Two constraints: I cannot modify the XML file before feeding it to the SAX parser, and I have to be able to run with no internet connection. How can I change the SAX parser's behaviour so that it does not try to load the DTD? Thanks. javax.xml.parsers.SAXParserFactory factory = javax.xml.parsers.SAXParserFactory.newInstance(); factory.setValidating(false); javax.xml.parsers.SAXParser parser = factory
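Turning validation off does not stop a parser from fetching the external DTD, since the DTD may still define entities and defaults. A common, portable workaround is to install an EntityResolver that hands the parser an empty document in place of the remote DTD. The sketch below assumes the document does not rely on entities defined in that DTD. The public ID and URL in the sample are placeholders, not the ones from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class OfflineDtdDemo {

    /** Parses xml without fetching any external entity and returns the root element name. */
    static String rootName(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Answer every external entity request (including the DTD) with empty content,
        // so the parser never opens a network connection.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));

        StringBuilder root = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (root.length() == 0) {
                    root.append(qName);
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return root.toString();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder identifiers; the host name is deliberately unresolvable.
        String xml = "<!DOCTYPE plist PUBLIC \"-//Example//DTD PLIST 1.0//EN\" "
                + "\"http://dtd.example.invalid/plist.dtd\">"
                + "<plist><dict/></plist>";
        System.out.println(rootName(xml));
    }
}
```

With the Xerces-based parser bundled in the JDK, setting the feature http://apache.org/xml/features/nonvalidating/load-external-dtd to false on the factory is an alternative, but that feature name is specific to Xerces.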

XML Validation in Java: processContents=“lax” seems not to work correctly

Submitted by 大憨熊 on 2019-12-05 00:35:29
Question: I have an XML Schema which contains a number of <any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded" /> definitions, i.e., it allows arbitrary tags from other namespaces to be inserted. processContents="lax" indicates that the parser should try to validate these tags if it has the corresponding schema (1) (2). To me this means that if I give the parser all the schema documents and there is an invalid XML tag in one of the secondary namespaces, it needs to report an error.

Android REST XML result to Listview

Submitted by 不羁岁月 on 2019-12-04 22:17:39
I have a REST web service that returns an XML result like this: <MyCategories xmlns="http://schemas.datacontract.org/2004/07/ceva" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"> <Category> <CategoryName>First category</CategoryName> <Id>1</Id> </Category> <Category> <CategoryName>Second category</CategoryName> <Id>2</Id> </Category> <Category> <CategoryName>Third category</CategoryName> <Id>3</Id> </Category> </MyCategories> I access the web service like this: HttpClient httpclient = new DefaultHttpClient(); HttpGet request = new HttpGet(WebServiceURL); request.addHeader("deviceId

Android: SaxParser problems using ISO-8859-1 encoding

Submitted by 不想你离开。 on 2019-12-04 19:46:57
I'm facing some problems parsing XML on Android. The XML from the server is encoded as "ISO-8859-1", set with setEncoding (I get <?xml version="1.0" encoding="ISO-8859-1"?>), but the Android device seems to ignore that encoding. For example, this is part of the original XML that comes from the server: <Result Filename="Pautas para la Presentación RUP Iteraciones de Construcción.ppt"> <Path>C:\Documents and Settings\zashael\My Documents\PFC\RUP\Pautas para la Presentación RUP Iteraciones de Construcción.ppt</Path> <Hostname>computer_1</Hostname> <IP>192.168.0.5
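The excerpt does not show how the response is handed to the parser, so the following is only a sketch of one thing to check: the parser should receive the raw bytes, not a String or Reader that was already decoded with a different charset. It uses the JDK's SAX parser, not Android's, and the class name and sample attribute value are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class Latin1ParsingDemo {

    /** Parses ISO-8859-1 encoded bytes and returns the Filename attribute of Result. */
    static String filename(byte[] rawBytes) throws Exception {
        // Hand the parser the undecoded bytes so it can apply the encoding itself.
        InputSource source = new InputSource(new ByteArrayInputStream(rawBytes));
        // Stating the encoding explicitly is optional when the XML declaration carries it.
        source.setEncoding("ISO-8859-1");

        StringBuilder result = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals("Result")) {
                    result.append(atts.getValue("Filename"));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<Result Filename=\"Presentaci\u00f3n.ppt\"/>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(filename(bytes));
    }
}
```

If the bytes are first converted to a String with the platform default charset (often UTF-8), the accented characters are already damaged before the parser sees them, and no parser setting can repair that.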

SAX Parser and XML Schema (XSD) validation

Submitted by 旧街凉风 on 2019-12-04 19:27:30
Which Java XML libraries can do SAX-based parsing and validation against an XML Schema (XSD) at the same time? I'm really looking for the most efficient solution for reading and validating large XML files. Bonus points if you can provide sample code. Xerces, which is part of the Sun JDK, can do validation. You can see this in the docs. As suggested by Robert, Xerces provides SAX-based parsing and XSD validation. You can check out an example at: http://www.herongyang.com/XML-Schema/Xerces2-XSD-Validation-with-SAXParser.html Source: https://stackoverflow.com/questions/2417512/sax-parser-and-xml-schema-xsd
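As a sketch of what the answers above describe, the standard JAXP API lets a compiled Schema be attached to the SAXParserFactory, so the document is validated while the SAX events stream through. The class name, the inline schema and the choice to collect errors in a list are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SaxXsdValidationDemo {

    /** Parses xml with SAX while validating against xsd; returns the validation messages. */
    static List<String> validate(String xsd, String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // XSD validation needs namespace processing
        factory.setSchema(schema);       // setValidating(true) would mean DTD validation instead

        List<String> errors = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validation errors are recoverable; record them and keep parsing.
                errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:element name=\"person\"><xs:complexType><xs:sequence>"
                + "<xs:element name=\"name\" type=\"xs:string\"/>"
                + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
        System.out.println(validate(xsd, "<person><name>Fred</name></person>"));
        System.out.println(validate(xsd, "<person><age>3</age></person>"));
    }
}
```

The same handler can also override startElement and the other content callbacks, so business logic and validation share a single pass over a large file.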

Use CSS selectors to collect HTML elements from a streaming parser (e.g. SAX stream)

Submitted by 折月煮酒 on 2019-12-04 19:05:51
Question: How can I parse a CSS (CSS3) selector and use it (in a jQuery-like way) to collect HTML elements not from a DOM (a tree structure) but from a stream (e.g. SAX), i.e. using a sequential-access, event-based parser? By the way, are there any CSS selectors (or combinations of them) that need access to the DOM (the Wikipedia SAX page says that XPath selectors "need to be able to access any node at any time in the parsed XML tree")? I am most interested in implementing selector combinators, e.g. 'A B' descendant
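For the descendant combinator 'A B', a streaming matcher only needs to know which elements are currently open, and a SAX handler can track that with a stack. The sketch below handles only plain type selectors on well-formed XML (for example XHTML), since the JDK's SAX parser is not an HTML parser. The class name and the path-string output are illustrative choices, not part of any selector library.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DescendantSelectorDemo {

    /** Returns the paths of all elements matching the selector "ancestor descendant". */
    static List<String> select(String xml, String ancestor, String descendant) throws Exception {
        List<String> matches = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final Deque<String> open = new ArrayDeque<>(); // currently open elements
            private int ancestorsOpen = 0;                         // how many of them match A

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Match before counting this element, so an element is never its own ancestor.
                if (ancestorsOpen > 0 && qName.equals(descendant)) {
                    String prefix = String.join("/", open);
                    matches.add(prefix + "/" + qName);
                }
                if (qName.equals(ancestor)) {
                    ancestorsOpen++;
                }
                open.addLast(qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast();
                if (qName.equals(ancestor)) {
                    ancestorsOpen--;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return matches;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><div><p/><span><p/></span></div><p/></root>";
        System.out.println(select(xml, "div", "p"));
    }
}
```

The child combinator 'A > B' can use the same stack by checking only its top entry. Selectors that depend on what follows an element, such as :last-child, cannot be decided at startElement time and would need buffering.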