xlrd

How to obtain sheet names from XLS files without loading the whole file?

情到浓时终转凉″ submitted on 2019-11-27 00:21:53
Question: I'm currently using pandas to read an Excel file and present its sheet names to the user, so he can select which sheet he would like to use. The problem is that the files are really big (70 columns x 65k rows), taking up to 14s to load on a notebook (the same data in a CSV file takes 3s). My code in pandas goes like this: ``xls = pandas.ExcelFile(path)`` followed by ``sheets = xls.sheet_names``. I tried xlrd before, but obtained similar results. This was my code with xlrd: ``xls = xlrd.open_workbook(path)`` sheets
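
A minimal sketch of one way to list sheet names cheaply, assuming the workbook is an .xlsx file (a zip container) rather than a legacy .xls. Only the small xl/workbook.xml part is parsed, so no sheet data is read. The function name xlsx_sheet_names is hypothetical, and the fixed part path xl/workbook.xml is an assumption that holds for typical files but is formally determined by the package relationships. For legacy .xls files, xlrd's open_workbook accepts on_demand=True, which defers loading each sheet until it is requested.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main SpreadsheetML workbook part.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def xlsx_sheet_names(path_or_file):
    """Return sheet names from an .xlsx file without parsing sheet data.

    Hypothetical helper: assumes the workbook part lives at
    xl/workbook.xml, which is the usual location.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        with archive.open("xl/workbook.xml") as workbook_xml:
            root = ET.parse(workbook_xml).getroot()
    # Each <sheet> element carries the user-visible name as an attribute.
    return [sheet.get("name") for sheet in root.iter("{%s}sheet" % MAIN_NS)]
```

Calling xlsx_sheet_names(path) returns a plain list of strings that can be shown to the user before any sheet is loaded.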

Get formula from Excel cell with python xlrd

一曲冷凌霜 submitted on 2019-11-26 22:32:29
I have to port an algorithm from an Excel sheet to Python code, but I have to reverse engineer the algorithm from the Excel file. The Excel sheet is quite complicated: it contains many cells with formulas that refer to other cells (which can in turn contain a formula or a constant). My idea is to analyze the sheet with a Python script, building a sort of table of dependencies between cells, that is:

    A1 depends on B4,C5,E7 formula: "=sqrt(B4)+C5*E7"
    A2 depends on B5,C6 formula: "=sin(B5)*C6"
    ...

The xlrd Python module allows reading an XLS workbook, but at the moment I can access to
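
Once the formula text is available as a string, the dependency table described above could be built roughly as follows. This is a sketch under assumptions: it does not read formulas from a workbook, it only recognises A1-style references (optionally with $ anchors), it ignores sheet qualifiers, and a range such as A1:B2 contributes only its two endpoints. The names CELL_REF and formula_dependencies are hypothetical.

```python
import re

# A1-style reference: up to three column letters, then a row number.
# The lookarounds reject identifiers and function names such as LOG10(.
CELL_REF = re.compile(
    r"(?<![A-Za-z0-9_])\$?([A-Z]{1,3})\$?([0-9]+)(?![A-Za-z0-9_(])"
)


def formula_dependencies(formula):
    """Return the distinct cell references in a formula, in first-seen order."""
    seen = []
    for match in CELL_REF.finditer(formula):
        ref = match.group(1) + match.group(2)  # drop any $ anchors
        if ref not in seen:
            seen.append(ref)
    return seen


print(formula_dependencies("=sqrt(B4)+C5*E7"))  # ['B4', 'C5', 'E7']
print(formula_dependencies("=sin(B5)*C6"))      # ['B5', 'C6']
```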

How to use ``xlrd.xldate_as_tuple()``

爷,独闯天下 submitted on 2019-11-26 22:23:30
Question: I am not quite sure how to use the function ``xlrd.xldate_as_tuple`` for the following data: xldate:39274.0 xldate:39839.0 Could someone please give me an example of how to use the function on this data? Answer 1: Quoth the documentation: Dates in Excel spreadsheets: In reality, there are no such things. What you have are floating point numbers and pious hope. There are several problems with Excel dates: (1) Dates are not stored as a separate data type; they are stored as floating point numbers
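
For reference, the function takes the float and the workbook's date mode, as in ``xlrd.xldate_as_tuple(39274.0, book.datemode)``, where 0 means the 1900-based system and 1 the 1904-based one. The sketch below reproduces the same arithmetic with the standard library only, so it runs without a workbook. It is an approximation under assumptions: serial_to_tuple is a hypothetical name, the 1900 branch is only valid for serials above 60 (after Excel's fictitious 29 February 1900), and fractional days are not rounded to the nearest second.

```python
from datetime import datetime, timedelta

# Day zero for each Excel date system. The 1900 epoch is shifted to
# 1899-12-30 to absorb Excel's fictitious leap day in 1900.
EPOCH_1900 = datetime(1899, 12, 30)
EPOCH_1904 = datetime(1904, 1, 1)


def serial_to_tuple(xldate, datemode=0):
    """Convert an Excel serial date to (year, month, day, hour, minute, second)."""
    epoch = EPOCH_1904 if datemode else EPOCH_1900
    moment = epoch + timedelta(days=xldate)
    return (moment.year, moment.month, moment.day,
            moment.hour, moment.minute, moment.second)


print(serial_to_tuple(39274.0))  # (2007, 7, 11, 0, 0, 0)
print(serial_to_tuple(39839.0))  # (2009, 1, 26, 0, 0, 0)
```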

How to get Excel cell properties in Python

最后都变了- submitted on 2019-11-26 21:22:25
Question: I am using version 0.8 of the xlrd module, but I don't know how to read cell properties like background color, font, and whether a cell is locked. I tried: ``import xlrd``, ``wb = xlrd.open_workbook(...)``, ``sh = wb.sheet_by_index(...)``, ``sh.sh._cell_xf_indexes(2, 2)``. It raises an error saying formatting information needs to be set while reading ``wb``, but when I pass that parameter it reports that it is still not implemented. Is there another module or how can this module itself be made to read cell

Compare 2 excel files using Python

社会主义新天地 submitted on 2019-11-26 20:29:12
Question: I have two xlsx files as follows:

    value1   value2    value3
    0.456    3.456     0.4325436
    6.24654  0.235435  6.376546
    4.26545  4.264543  7.2564523

and

    value1   value2    value3
    0.456    3.456     0.4325436
    6.24654  0.23546   6.376546
    4.26545  4.264543  7.2564523

I need to compare all cells, and if a cell from file1 != a cell from file2, print that.

    import xlrd
    rb = xlrd.open_workbook('file1.xlsx')
    rb1 = xlrd.open_workbook('file2.xlsx')
    sheet = rb.sheet_by_index(0)
    for rownum in range(sheet.nrows):
        row = sheet.row_values(rownum
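
A minimal sketch of the cell-by-cell comparison, assuming each sheet has already been read into a list of rows (for example by collecting ``sheet.row_values(rownum)`` for every row). The helper name diff_cells is hypothetical, and the sample rows are the two tables from the question. Note that zip stops at the shorter input, so extra rows or columns in one file are not reported by this version.

```python
def diff_cells(rows_a, rows_b):
    """Return (row, col, value_a, value_b) for every cell that differs."""
    differences = []
    for r, (row_a, row_b) in enumerate(zip(rows_a, rows_b)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:
                differences.append((r, c, a, b))
    return differences


file1 = [
    ["value1", "value2", "value3"],
    [0.456, 3.456, 0.4325436],
    [6.24654, 0.235435, 6.376546],
    [4.26545, 4.264543, 7.2564523],
]
file2 = [
    ["value1", "value2", "value3"],
    [0.456, 3.456, 0.4325436],
    [6.24654, 0.23546, 6.376546],
    [4.26545, 4.264543, 7.2564523],
]

for row, col, a, b in diff_cells(file1, file2):
    print("row %d, col %d: %r != %r" % (row, col, a, b))
# row 2, col 1: 0.235435 != 0.23546
```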

Working with Excel in Python 3

亡梦爱人 submitted on 2019-11-26 17:38:07
I came across a good article comparing ways to read and write Excel files in Python: 用Python读写Excel文件 ("Reading and Writing Excel Files with Python"). For other versions of Excel, you can follow the tutorials linked from it. XlsxWriter: https://github.com/jmcnamara/XlsxWriter http://xlsxwriter.readthedocs.org openpyxl: http://openpyxl.readthedocs.io/en/default/ Microsoft Excel API: https://msdn.microsoft.com/en-us/library/fp179694.aspx Introduction: xlrd reads Excel files and xlwt writes them; the two work together to operate on Excel files. Official documentation: http://www.python-excel.org/ xlrd official page: https://pypi.python.org/pypi/xlrd/1.0.0 xlwt official page: https://pypi.python.org/pypi/xlwt/1.1.2 xlutils official page: https://pypi.python.org/pypi/xlutils http://xlutils.readthedocs.io/en/latest/ 1. About xlrd: Library for

Error: Unsupported format, or corrupt file: Expected BOF record

浪尽此生 submitted on 2019-11-26 17:15:07
Question: I am trying to open an xlsx file and just print its contents. I keep running into this error:

    import xlrd
    book = xlrd.open_workbook("file.xlsx")
    print "The number of worksheets is", book.nsheets
    print "Worksheet name(s):", book.sheet_names()
    print
    sh = book.sheet_by_index(0)
    print sh.name, sh.nrows, sh.ncols
    print
    print "Cell D30 is", sh.cell_value(rowx=29, colx=3)
    print
    for rx in range(5):
        print sh.row(rx)
    print

It prints out this error: raise XLRDError('Unsupported format, or corrupt
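
The excerpt stops before any answer, so the following is only a diagnostic sketch, not a fix. One thing worth checking with this kind of error is whether the file's contents match its extension: legacy .xls files start with the OLE2 compound-document signature, while .xlsx files are zip archives and start with "PK". The helper name sniff_spreadsheet and its return labels are hypothetical.

```python
# Leading bytes of the two container formats.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls (compound file)
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx (zip archive)


def sniff_spreadsheet(header):
    """Classify a file by its first bytes instead of trusting its extension."""
    if header.startswith(OLE2_MAGIC):
        return "xls"
    if header.startswith(ZIP_MAGIC):
        return "zip (possibly xlsx)"
    return "unknown"


def sniff_path(path):
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as handle:
        return sniff_spreadsheet(handle.read(8))
```

A result of "unknown" for a file named .xls or .xlsx suggests it is really something else, such as HTML or CSV saved under a spreadsheet extension.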

Reading date as a string not float from excel using python xlrd [duplicate]

旧巷老猫 submitted on 2019-11-26 16:35:52
Question: Possible duplicate: How do I read a date in Excel format in Python? My date can be in any field of an Excel file, but when I read it using Python xlrd it's being read as a float. Is there a way to read all the Excel cells as strings? I want to prepare a script that generates a file with all the values in the Excel file separated by a pipe, but this date thing is creating a problem. Answer 1: Excel stores dates as floats. If you want to convert
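
A sketch of the pipe-separated export described in the question, assuming each row is available as (cell type, value) pairs, which is what xlrd exposes through a cell's ctype and value attributes. XL_CELL_DATE is hard-coded to 3 here to keep the example self-contained; it matches xlrd.XL_CELL_DATE. The helper names, the 1900-based date system, and the output date format are all assumptions.

```python
from datetime import datetime, timedelta

XL_CELL_DATE = 3  # same numeric value as xlrd.XL_CELL_DATE


def cell_to_text(ctype, value):
    """Render one cell as text, turning date serials into readable dates."""
    if ctype == XL_CELL_DATE:
        # Assumes the 1900 date system and a serial above 60.
        moment = datetime(1899, 12, 30) + timedelta(days=value)
        return moment.strftime("%Y-%m-%d %H:%M:%S")
    return str(value)


def row_to_pipe_line(cells):
    """Join a row of (ctype, value) pairs with a pipe separator."""
    return "|".join(cell_to_text(ctype, value) for ctype, value in cells)


print(row_to_pipe_line([(1, "abc"), (3, 39274.0), (2, 63.0)]))
# abc|2007-07-11 00:00:00|63.0
```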

Integers from excel files become floats?

早过忘川 submitted on 2019-11-26 16:26:54
Question: I use xlrd to read data from Excel files. For integers stored in the files, let's say 63, xlrd interprets it as 63.0 of type number. Why can't xlrd recognize 63 as an integer? Assume ``sheet.row(1)[0].value`` gives us 63.0. How can I convert it back to 63? Answer 1: Looking at the different cell types in the documentation, it seems that there isn't any integer type for cells; they're just floats. That's why you're getting floats back even when you wrote an integer. To convert a
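
One possible conversion, sketched under the assumption that whole-number floats should become ints and everything else should pass through untouched. The helper name normalize_number is hypothetical.

```python
def normalize_number(value):
    """Turn whole-number floats such as 63.0 into ints; leave other values alone."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


print(normalize_number(63.0))   # 63
print(normalize_number(63.5))   # 63.5
print(normalize_number("63"))   # 63 (still a string, unchanged)
```

Checking is_integer() first avoids silently truncating values like 63.5, which a bare int() call would do.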

GroupBy results to dictionary of lists

岁酱吖の submitted on 2019-11-26 13:58:14
Question: I have an Excel sheet that looks like so:

    Column1  Column2  Column3
    0        23       1
    1        5        2
    1        2        3
    1        19       5
    2        56       1
    2        22       2
    3        2        4
    3        14       5
    4        59       1
    5        44       1
    5        1        2
    5        87       3

And I'm looking to extract that data, group it by Column1, and add it to a dictionary so it appears like this: {0: [1], 1: [2,3,5], 2: [1,2], 3: [4,5], 4: [1], 5: [1,2,3]} This is my code so far:

    excel = pandas.read_excel(r"e:\test_data.xlsx", sheetname='mySheet', parse_cols='A,C')
    myTable = excel.groupby("Column1").groups
    print myTable

However, my
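
One way to get the dictionary shape asked for is to collect each group's Column3 values into a list and then convert the result to a dict. This is a sketch, not necessarily the answer given in the original thread: the DataFrame is built in memory from the table in the question instead of being read from the Excel file, and the helper name group_to_lists is hypothetical.

```python
import pandas as pd

# The Column1 and Column3 values from the question's table.
frame = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})


def group_to_lists(df, key, value):
    """Map each distinct `key` to the list of `value` entries in that group."""
    return df.groupby(key)[value].apply(list).to_dict()


print(group_to_lists(frame, "Column1", "Column3"))
# {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
```

By contrast, ``groupby(...).groups`` maps each key to row index labels, not to the values of another column.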