Parsing a lisp file with Python

你的背包 2021-01-02 15:02

I have the following Lisp file, which is from the UCI machine learning database. I would like to convert it into a flat text file using Python. A typical line looks like th

4 answers
  •  难免孤独
    2021-01-02 15:51

    Separate it into pairs with a regular expression:

    In [1]: import re
    
    In [2]: txt = '(((st 8) (pitch 67) (dur 4) (keysig 1) (timesig 12) (fermata 0))((st 12) (pitch 67) (dur 8) (keysig 1) (timesig 12) (fermata 0)))'
    
    In [3]: data = [p.split() for p in re.findall(r'\w+\s+\d+', txt)]

    In [4]: data
    Out[4]: [['st', '8'], ['pitch', '67'], ['dur', '4'], ['keysig', '1'], ['timesig', '12'], ['fermata', '0'], ['st', '12'], ['pitch', '67'], ['dur', '8'], ['keysig', '1'], ['timesig', '12'], ['fermata', '0']]
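
    The flat list of pairs loses the boundary between notes. If one record per note is more convenient, a minimal sketch along the following lines may help. The names `NOTE_RE`, `PAIR_RE` and `parse_notes` are made up for illustration, and the optional minus sign in the pattern is an assumption in case some values in the file are negative.

```python
import re

# One note: an outer pair of parentheses wrapping one or more "(key number)" items.
NOTE_RE = re.compile(r'\((?:\s*\(\w+\s+-?\d+\))+\s*\)')
# One "(key number)" item; the minus sign is optional (assumption).
PAIR_RE = re.compile(r'\((\w+)\s+(-?\d+)\)')


def parse_notes(text):
    """Return one dict per note, with the values converted to int."""
    notes = []
    for match in NOTE_RE.finditer(text):
        pairs = PAIR_RE.findall(match.group(0))
        notes.append({key: int(value) for key, value in pairs})
    return notes


txt = ('(((st 8) (pitch 67) (dur 4) (keysig 1) (timesig 12) (fermata 0))'
       '((st 12) (pitch 67) (dur 8) (keysig 1) (timesig 12) (fermata 0)))')
notes = parse_notes(txt)
print(notes[0])
```

    Each element of `notes` then corresponds to one row of the flat file.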
    

    Then make it into a dictionary:

    dct = {}
    for key, value in data:  # data: the [key, value] pairs from the regex step
        dct.setdefault(key, []).append(value)
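
    The same grouping can also be written with `collections.defaultdict`, converting the values to integers along the way. This is only a sketch: the sample string is shortened, and `columns` is a name chosen here for illustration.

```python
import re
from collections import defaultdict

# Shortened sample with two notes and two fields each (illustrative only).
txt = '(((st 8) (pitch 67)) ((st 12) (pitch 67)))'

# Missing keys start out as an empty list, so no membership test is needed.
columns = defaultdict(list)
for key, value in re.findall(r'(\w+)\s+(\d+)', txt):
    columns[key].append(int(value))

print(dict(columns))
```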
    

    The result:

    In [10]: dct
    Out[10]: {'timesig': ['12', '12'], 'keysig': ['1', '1'], 'st': ['8', '12'], 'pitch': ['67', '67'], 'dur': ['4', '8'], 'fermata': ['0', '0']}
    

    Printing:

    print('time pitch duration keysig timesig fermata')
    for t in range(len(dct['st'])):
        print(dct['st'][t], dct['pitch'][t], dct['dur'][t],
              dct['keysig'][t], dct['timesig'][t], dct['fermata'][t])
    

    Proper formatting is left as an exercise for the reader...
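
    One possible take on that exercise is to right-align every column to a fixed width. The sketch below assumes a dictionary shaped like `dct` above; `COLUMNS`, `format_table` and the width of 9 are choices made here for illustration, not part of the original answer.

```python
# (key in the dictionary, label in the header), in output order.
COLUMNS = [('st', 'time'), ('pitch', 'pitch'), ('dur', 'duration'),
           ('keysig', 'keysig'), ('timesig', 'timesig'), ('fermata', 'fermata')]


def format_table(dct, width=9):
    """Return a header line plus one right-aligned line per note."""
    lines = [''.join(label.rjust(width) for _, label in COLUMNS)]
    # zip(*) turns the per-key lists into one tuple per note.
    for row in zip(*(dct[key] for key, _ in COLUMNS)):
        lines.append(''.join(str(value).rjust(width) for value in row))
    return '\n'.join(lines)


dct = {'st': ['8', '12'], 'pitch': ['67', '67'], 'dur': ['4', '8'],
       'keysig': ['1', '1'], 'timesig': ['12', '12'], 'fermata': ['0', '0']}
table = format_table(dct)
print(table)
```

    Writing `table` to a file opened with `open(..., 'w')` then gives the flat text file.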
