Question
I have a log file (log4j XML-ish format) that I am trying to pull information out of for use in my Python module. Could I treat this file as if it were XML? My gut is telling me no... If not, what is the best way to parse the data? Below is a section of the log file. The file does not include the standard doctype or version headers, which is why I said "XML-ish."
<log4j:event
logger="com.hp.cp.elk.impl.subscriptions.AsyncSimpleSubscriptionManager"
timestamp="1352320517430" level="DEBUG" thread="Thread-77">
<log4j:message><![CDATA[Broadcasting signals to subscribers...]]></log4j:message>
</log4j:event>
<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor"
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[[JDFQueueEntry[ --> JDFAutoQueueEntry[ --> JDFElement[
--> <?xml version="1.0" encoding="UTF-8"?><QueueEntry
xmlns="http://www.CIP4.org/JDFSchema_1_1"
DescriptiveName="H44E61-6.pdf" DeviceID="HPPRO1-SM1"
EndTime="2012-11-07T10:58:18-08:00" JobID="Default" Priority="50"
QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb" Status="Completed"
SubmissionTime="2012-11-06T16:35:06-08:00"> <Comment AgentName="CIP4 JDF Writer
Java" AgentVersion="1.4a BLD 63" ID="c_121106_163506894_000804"
Name="JobSpec">WBG_4C_Flat_21up_BusCards_Duplex</Comment>
</QueueEntry>
] ] ]] queue entries changed.]]></log4j:message>
</log4j:event>
<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor"
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[no active queue entries changed.]]></log4j:message>
</log4j:event>
Sorry for the messy code, I just wanted to make sure you can all get an idea of the formatting. Anyway, I'm currently just trying to pull the value from QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb". Any suggestions? Thank you!
Answer 1:
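I would imagine that you could use standard XML tools like DOM or SAX to parse this. Judging by the excerpt, the file has no single root element and never declares the log4j prefix, so you would first need to wrap the contents in a root element that declares that namespace. Otherwise, have fun with re or htmllib (htmllib is Python 2 only; html.parser is its Python 3 replacement).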
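As a rough illustration of the XML route, here is a minimal sketch that wraps the log text in a synthetic root element, parses it with xml.etree.ElementTree, and then runs a regular expression over each message body, since the embedded QueueEntry markup sits inside CDATA and reaches the parser as plain text. The function name queue_entry_ids, the wrapper element name root, and the namespace URI are assumptions made for this example; any URI works as long as the wrapper and the lookup agree. It also assumes the whole file fits in memory.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "log4j" prefix. The log file itself never
# declares one, so this only has to match between the wrapper and the lookup.
LOG4J_NS = "http://jakarta.apache.org/log4j/"

# The QueueEntry markup lives inside CDATA, so it is plain text to the parser.
QUEUE_ENTRY_ID = re.compile(r'QueueEntryID="([^"]+)"')


def queue_entry_ids(log_text):
    """Return every QueueEntryID value found in the log4j message bodies."""
    # Supply the missing root element and namespace declaration.
    wrapped = '<root xmlns:log4j="%s">%s</root>' % (LOG4J_NS, log_text)
    root = ET.fromstring(wrapped)

    ids = []
    for message in root.iter("{%s}message" % LOG4J_NS):
        ids.extend(QUEUE_ENTRY_ID.findall(message.text or ""))
    return ids
```

Usage would look something like `queue_entry_ids(open("app.log").read())`, where app.log is a hypothetical file name. For very large logs, a streaming approach (for example, reading one log4j:event at a time) would likely be a better fit than loading everything at once.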
Source: https://stackoverflow.com/questions/13278405/parsing-log4j-in-python