I am completely new to Python and I have a serious problem which I cannot solve.
I have a few log files with identical structure:
[timestamp] [level]
First off, you will want to use the fileinput module for getting data from multiple files, like:
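import fileinput
# with no arguments, FileInput reads the files named in sys.argv[1:]
data = fileinput.FileInput()
for line in data:
    # each line keeps its trailing newline, so don't add another
    print(line, end="")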
This prints all of the lines from all of the files together. You also want to sort, which you can do with the built-in sorted function.
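If you would rather pass the paths in explicitly than rely on the command line, here is a minimal sketch of the same reading step. The helper name read_all_lines is made up for illustration; fileinput.input and its files argument are part of the standard module.

```python
import fileinput

def read_all_lines(paths):
    """Return every line from the given files, in the order the files are listed."""
    # fileinput.input() chains the files into a single iterable of lines.
    # Note: an empty list of paths makes fileinput fall back to reading stdin.
    with fileinput.input(files=paths) as stream:
        return list(stream)
```

Each returned line still ends with its newline, so print it with end="" or strip it first.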
If your lines start with a timestamp like [2011-07-20 19:20:12], you're golden: that format is zero-padded and ordered from year down to second, so plain string sorting already puts it in chronological order. Just do:
data = fileinput.FileInput()
for line in sorted(data):
    print(line, end="")
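To convince yourself that plain string order matches chronological order for that format, here is a small check. It assumes every line starts with exactly "[YYYY-MM-DD HH:MM:SS]"; the helper names and the sample lines are invented for the example.

```python
from datetime import datetime

def leading_timestamp(line):
    """Parse the assumed "[YYYY-MM-DD HH:MM:SS]" prefix into a datetime."""
    # Characters 1..19 are the 19-character timestamp between the brackets.
    return datetime.strptime(line[1:20], "%Y-%m-%d %H:%M:%S")

def string_order_is_chronological(lines):
    """True if sorting the raw strings gives the same order as sorting by date."""
    return sorted(lines) == sorted(lines, key=leading_timestamp)

sample = [
    "[2011-07-21 08:00:00] [INFO] third\n",
    "[2011-07-20 19:20:12] [WARN] first\n",
    "[2011-07-20 23:59:59] [INFO] second\n",
]
print(string_order_is_chronological(sample))
```

The sample timestamps are all distinct; lines sharing a timestamp may come out in a different relative order between the two sorts.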
However, as your timestamps are in a more complex format, you need to parse them before sorting:
def parse_date(line):
    # parse the timestamp at the start of the line into a datetime object here
    raise NotImplementedError

data = fileinput.FileInput()
# sort on the parsed dates; sorted() takes key= (cmp= was removed in Python 3)
for line in sorted(data, key=parse_date):
    print(line, end="")
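As a sketch of what the parsing step might look like, suppose the timestamps were day-first, e.g. "[20/07/2011 19:20:12]". That layout is purely an assumption for illustration (substitute the strptime pattern that matches your real logs), and the names below are made up. It uses a key function rather than a comparison function, since that is what sorted() accepts in Python 3.

```python
from datetime import datetime

# Hypothetical layout, for illustration only: "[20/07/2011 19:20:12] [INFO] message"
TIMESTAMP_FORMAT = "%d/%m/%Y %H:%M:%S"

def timestamp_key(line):
    """Parse the first bracketed field of a line into a datetime for sorting."""
    stamp = line[1:line.index("]")]  # text between the first "[" and "]"
    return datetime.strptime(stamp, TIMESTAMP_FORMAT)

def sort_by_timestamp(lines):
    """Return the lines ordered by their parsed leading timestamp."""
    return sorted(lines, key=timestamp_key)
```

With a day-first format like this, plain string sorting would put 01/08/2011 before 20/07/2011, which is why the parse is needed. A line without a leading bracketed timestamp raises ValueError here, so multi-line entries such as tracebacks would need extra handling.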
For bonus points, you can even do
data = fileinput.FileInput(openhook=fileinput.hook_compressed)
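which will enable you to read in gzipped (or bz2-compressed) log files alongside plain ones; the hook picks the decompressor from the .gz or .bz2 file extension.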
The usage would then be:
$ python yourscript.py access.log.1 access.log.*.gz
or similar.
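One thing to watch with the compressed hook: depending on your Python 3 version, lines from .gz files may come back as bytes while lines from plain files are str (newer releases return text for both). A defensive sketch, assuming the logs are UTF-8 and using a made-up helper name:

```python
import fileinput

def read_logs(paths):
    """Return all lines from plain, .gz and .bz2 files as text."""
    lines = []
    with fileinput.input(files=paths, openhook=fileinput.hook_compressed) as stream:
        for line in stream:
            # Older Python 3 releases yield bytes for compressed files;
            # decoding as UTF-8 is an assumption about the log encoding.
            if isinstance(line, bytes):
                line = line.decode("utf-8")
            lines.append(line)
    return lines
```

The result is a list of str lines either way, which you can then hand to sorted() as before.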