Get column data by Column name and sheet name

Submitted by 限于喜欢 on 2019-12-07 00:57:25

Question


Is there a way to access all rows of a column in a specific sheet using Python's xlrd?

e.g:

workbook = xlrd.open_workbook('ESC data.xlsx', on_demand=True)
sheet = workbook.sheet['sheetname']
arrayofvalues = sheet['columnname']

Or do I have to create a dictionary myself?

The Excel file is pretty big, so I would love to avoid iterating over all the column names and sheets.


Answer 1:


Yes, you are looking for the col_values() worksheet method. Instead of

arrayofvalues = sheet['columnname']

you need to do

arrayofvalues = sheet.col_values(columnindex)

where columnindex is the number of the column (counting from zero, so column A is index 0, column B is index 1, etc.). If you have a descriptive heading in the first row (or first few rows) you can give a second parameter that tells which row to start from (again, counting from zero). For example, if you have one header row, and thus want values starting in the second row, you could do

arrayofvalues = sheet.col_values(columnindex, 1)

Please check out the tutorial for a reasonably readable discussion of the xlrd package. (The official xlrd documentation is harder to read.)

Also note that (1) while you are free to use the name arrayofvalues, what you are really getting is a Python list, which technically isn't an array, and (2) the on_demand workbook parameter has no effect when working with .xlsx files, which means xlrd will attempt to load the entire workbook into memory regardless. (The on_demand feature works for .xls files.)
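To get from a column name to the index that col_values() expects, one option is to search the header row first. The following is a minimal sketch, not part of xlrd itself: the helper names column_by_header and read_column are hypothetical, and it assumes the headings sit in a single row (row 0 by default) and that header names are unique.

```python
def column_by_header(sheet, header, header_row=0):
    """Return the values below `header` in an xlrd-style sheet.

    `sheet` only needs row_values() and col_values(), both of which
    xlrd's Sheet objects provide.
    """
    headers = sheet.row_values(header_row)
    try:
        col_index = headers.index(header)
    except ValueError:
        raise KeyError("no column named %r in row %d" % (header, header_row))
    # Start one row below the header so the heading itself is excluded.
    return sheet.col_values(col_index, header_row + 1)


def read_column(path, sheet_name, header):
    """Hypothetical wrapper: open a workbook and fetch one named column."""
    import xlrd  # imported lazily so column_by_header works without xlrd

    workbook = xlrd.open_workbook(path)
    sheet = workbook.sheet_by_name(sheet_name)
    return column_by_header(sheet, header)
```

Usage would look like read_column('ESC data.xls', 'sheetname', 'columnname'). Only one row is scanned to find the index, so the lookup itself stays cheap even on a large sheet. Note that xlrd 2.0 and later read only .xls files, so the path in this call is assumed to point at one.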




Answer 2:


This script transforms an Excel file into a list of dictionaries, where each dictionary in the list represents one row:

import xlrd

workbook = xlrd.open_workbook('esc_data.xlsx', on_demand=True)
worksheet = workbook.sheet_by_index(0)

# The first row holds the column headers.
first_row = []
for col in range(worksheet.ncols):
    first_row.append(worksheet.cell_value(0, col))

# Transform the worksheet into a list of dictionaries, one per data row.
data = []
for row in range(1, worksheet.nrows):
    elm = {}
    for col in range(worksheet.ncols):
        elm[first_row[col]] = worksheet.cell_value(row, col)
    data.append(elm)
print(data)
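If the goal is lookup by column name rather than by row, the same header row can key a dictionary of whole columns instead. This is a rough sketch under the same assumptions as the script above (headers in the first row); the function name columns_as_dict is made up for illustration, and duplicate header names would silently overwrite each other.

```python
def columns_as_dict(sheet, header_row=0):
    """Map each header to the list of values in its column.

    Works with any object offering xlrd-style row_values() and
    col_values(), such as an xlrd Sheet.
    """
    headers = sheet.row_values(header_row)
    return {
        # One col_values() call per column, skipping the header row itself.
        header: sheet.col_values(col, header_row + 1)
        for col, header in enumerate(headers)
    }
```

With a worksheet obtained as in the script above, columns_as_dict(worksheet)['columnname'] would then return that column's values as a list.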


Source: https://stackoverflow.com/questions/38309256/get-column-data-by-column-name-and-sheet-name
