Create Pandas DataFrame from txt file with specific pattern

北海茫月 2020-11-22 09:04

I need to create a Pandas DataFrame from a text file with the following structure:

Alabama[edit]
Auburn (Aubu         


        
6 answers
  •  轮回少年
    2020-11-22 09:27

    You could parse the file into tuples first:

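    import pandas as pd
    from collections import namedtuple
    
    Item = namedtuple('Item', 'state area')
    items = []
    suffix = '[edit]'
    
    with open('unis.txt') as f:
        for line in f:
            l = line.strip()
            if not l:
                continue  # skip blank lines
            if l.endswith(suffix):
                # Slice the marker off. rstrip('[edit]') strips a set of
                # characters and would also eat trailing letters of some names.
                state = l[:-len(suffix)]
            else:
                # Keep the text before the first ' (' (whole line if absent).
                area = l.partition(' (')[0]
                items.append(Item(state, area))
    
    df = pd.DataFrame.from_records(items, columns=['State', 'Area'])
    
    print(df)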
    

    Output:

          State          Area
    0   Alabama        Auburn
    1   Alabama      Florence
    2   Alabama  Jacksonville
    3   Alabama    Livingston
    4   Alabama    Montevallo
    5   Alabama          Troy
    6   Alabama    Tuscaloosa
    7   Alabama      Tuskegee
    8    Alaska     Fairbanks
    9   Arizona     Flagstaff
    10  Arizona         Tempe
    11  Arizona        Tucson
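
If you would rather stay inside pandas, the same idea can be sketched with string methods and a forward fill. This is only an illustration, not part of the answer above: the `parse_states` helper and the in-memory sample lines are hypothetical stand-ins for the real file, and it assumes every state line ends with `[edit]` and every other non-blank line is an area.

```python
import io
import pandas as pd

SUFFIX = '[edit]'

# Hypothetical sample that mimics the layout described in the question.
sample = io.StringIO(
    "Alabama[edit]\n"
    "Auburn (example university)[1]\n"
    "Florence (example university)\n"
    "Alaska[edit]\n"
    "Fairbanks (example university)[2]\n"
)

def parse_states(handle):
    """Build a State/Area frame from lines in the '[edit]' layout (a sketch)."""
    # Drop blank lines and surrounding whitespace.
    lines = pd.Series([line.strip() for line in handle if line.strip()])
    is_state = lines.str.endswith(SUFFIX)
    # State names on header lines, NaN elsewhere, then carried forward.
    state = lines.where(is_state).str[:-len(SUFFIX)].ffill()
    # Text before the first ' (' (the whole line if the separator is missing).
    area = lines.str.partition(' (')[0]
    df = pd.DataFrame({'State': state, 'Area': area})[~is_state]
    return df.reset_index(drop=True)

df = parse_states(sample)
print(df)
```

For the real file, `parse_states(open('unis.txt'))` should work the same way, since the function only iterates over lines.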
    
