Extract columns from Excel using Python

猫巷女王i 2020-12-18 13:05

I have an Excel file with the following row/column structure:

ID   English   Spanish   French
 1   Hello     Hilo      Halu
 2   Hi        Hye       Ghi
 3   Bus               


        
3 Answers
  • 2020-12-18 13:17
    import xlrd
    
    # Open the first worksheet of the workbook
    sh = xlrd.open_workbook('input.xls').sheet_by_index(0)
    english = open("english.txt", 'w')
    spanish = open("spanish.txt", 'w')
    french = open("french.txt", 'w')
    try:
        # Row 0 is the header, so start at row 1
        for rownum in range(1, sh.nrows):
            # Column 0 is the ID; xlrd returns numeric cells as floats
            row_id = str(int(sh.cell_value(rownum, 0)))
            # Columns 1-3 are English, Spanish and French
            english.write(row_id + " = " + str(sh.cell_value(rownum, 1)) + "\n")
            spanish.write(row_id + " = " + str(sh.cell_value(rownum, 2)) + "\n")
            french.write(row_id + " = " + str(sh.cell_value(rownum, 3)) + "\n")
    finally:
        english.close()
        spanish.close()
        french.close()
    
  • 2020-12-18 13:19

    The u prefix means the value is a unicode string. Under Python 2 it shows up when you print the list (or call str() on it), because that displays the repr of each element. If you write the string itself out to a file, the prefix won't be there. You are getting only one row from the column because you pass end_rowx=1, so col_values returns a list with a single element.
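
To see why end_rowx=1 yields a single element, the row selection can be mimicked with an ordinary list slice. This is only a sketch: `col_slice` and the sample column are hypothetical stand-ins, not part of xlrd, and they assume the first column contains the header followed by three numeric IDs.

```python
# Hypothetical stand-in for the first column of the sheet, header included.
column = ["ID", 1.0, 2.0, 3.0]


def col_slice(values, start_rowx=0, end_rowx=None):
    """Select rows the way a start/end row range does: start inclusive, end exclusive."""
    return values[start_rowx:end_rowx]


# end_rowx=1 stops before row 1, so only the header row is returned.
only_header = col_slice(column, end_rowx=1)

# start_rowx=1 skips the header and keeps every data row.
data_rows = col_slice(column, start_rowx=1)

print(only_header)
print(data_rows)
```

Dropping end_rowx and setting start_rowx=1 is what turns the one-element result into the full column without its header.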

    Try getting the column value lists:

    ids = sh.col_values(0, start_rowx=1)
    english = sh.col_values(1, start_rowx=1)
    spanish = sh.col_values(2, start_rowx=1)
    french = sh.col_values(3, start_rowx=1)
    

    and then you can zip them into lists of tuples:

    # list() is needed on Python 3, where zip returns a one-shot iterator
    english_with_IDS = list(zip(ids, english))
    spanish_with_IDS = list(zip(ids, spanish))
    french_with_IDS = list(zip(ids, french))
    

    These are in the following form (if the ID cells are numeric, xlrd returns them as floats such as 1.0 rather than "1"):

    ("1", "Hello"), ("2", "Hi"), ("3", "Bus")
    

    If you want to print the pairs:

    for row_id, word in english_with_IDS:
        # str() also covers IDs that come back as numbers
        print(str(row_id) + "=" + word)
    

    col_values returns a list of the values in a column. If you want a single value, call sh.cell_value(rowx, colx) instead.
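
Putting the pieces of this answer together, the pairing step might look like the sketch below. The lists are hypothetical stand-ins for what `sh.col_values(..., start_rowx=1)` would return for the sample sheet, and `format_pairs` is an illustrative helper, not an xlrd function. Skipping empty cells is an assumption about the desired output; the question does not say how blanks should be handled.

```python
def format_pairs(ids, words):
    """Return 'id=word' lines, skipping rows whose word cell is empty."""
    lines = []
    for row_id, word in zip(ids, words):
        if word == "":
            # xlrd reports an empty cell as an empty string; skip it (assumption).
            continue
        if isinstance(row_id, float) and row_id.is_integer():
            # Numeric cells arrive as floats, so turn 1.0 into 1.
            row_id = int(row_id)
        lines.append(f"{row_id}={word}")
    return lines


# Hypothetical stand-ins for the col_values results of the sample sheet.
ids = [1.0, 2.0, 3.0]
english = ["Hello", "Hi", "Bus"]
spanish = ["Hilo", "Hye", ""]

english_lines = format_pairs(ids, english)
spanish_lines = format_pairs(ids, spanish)

print("\n".join(english_lines))
print("\n".join(spanish_lines))
```

Each list of lines can then be written to its own file with `"\n".join(...)`.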

  • 2020-12-18 13:19

    Use pandas:

    In [1]: import pandas as pd
    
    In [2]: df = pd.ExcelFile('test.xls').parse('Sheet1', index_col=0) # reads file
    
    In [3]: df.index = df.index.map(int)
    
    In [4]: for col in df.columns:
       ...:     column = df[col]
       ...:     # writes each column to a file named after the column;
       ...:     # header=False keeps the column name out of the output
       ...:     column.to_csv(column.name, sep='=', header=False)
    
    In [5]: !cat English  # English file content
    1=Hello
    2=Hi
    3=Bus
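
For readers who want to check the `id=value` layout without pandas, the effect of sep='=' can be approximated with the standard csv module. This is a sketch with hypothetical rows standing in for the ID and English columns; it writes to an in-memory buffer rather than to a file named English.

```python
import csv
import io

# Hypothetical rows standing in for the ID and English columns.
rows = [(1, "Hello"), (2, "Hi"), (3, "Bus")]

buffer = io.StringIO()
# delimiter="=" plays the role of sep='='; "\n" replaces csv's default "\r\n".
writer = csv.writer(buffer, delimiter="=", lineterminator="\n")
writer.writerows(rows)

english_text = buffer.getvalue()
print(english_text, end="")
```

Swapping the buffer for `open("English", "w", newline="")` would write the same lines to disk.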
    