Sort data in a list to columns for xlsx files

匿名 (未验证) 提交于 2019-12-03 01:25:01

问题:

I have a list that looks something like this. (The data is from several xlsx files):

[['A B 10', 2, 'A B 10', 3, 1, AC], ['A B 104', 3, 'A B 104', 2, -1, 'AC']] [['D B 126', 3, 'D B 126', 2, -1, 'EFG 1'], ['D B 15', 3, 'D B 15', 2, -1, 'EFG 1'] [] [] [['D B 544', 2, 'D B 544', 1, -1, 'EFG 11'], ['D B 152', 3, 'D B 152', 2, -1, 'EFG 11'], ['D B 682', 3, 'D B 682', 2, -1, 'EFG 11'] 

I want to put this information into a new xlsx file, but first I need to sort the data into rows and columns. I want all of the first strings in each sub-lists to be added to the first column, the numbers in the second column etc. So the columns is to be where the comma signs are. I also do not want the list to have sub lists, so everything should be in the same list. Something like this:

['A B 10', 2, 'A B 10', 3, 1, AC,   'A B 104', 3, 'A B 104', 2, -1, 'AC'  'D B 126', 3, 'D B 126', 2, -1, 'EFG 1'  'D B 15', 3, 'D B 15', 2, -1, 'EFG 1'  'D B 544', 2, 'D B 544', 1, -1, 'EFG 11'  'D B 152', 3, 'D B 152', 2, -1, 'EFG 11'   'D B 682', 3, 'D B 682', 2, -1, 'EFG 11'] 

This is how far I have come on my code:

import pandas as pd from openpyxl import Workbook, load_workbook import glob from openpyxl.utils.dataframe import dataframe_to_rows  numbers = [] os.chdir(r'C:Myfolder') files = glob.glob('*.xlsx') print(files) for file in files: #Getting the data from xlsx files and to the numbers-list     df = pd.read_excel(file)     m = (df.iloc[:,4] - df.iloc[:,1]) != 0     pos = [0,1,3,4,6,7]     numbers = (df.loc[m, df.columns[pos]].values.tolist())     print(numbers) excel_input = load_workbook(rapp) #Going from using pandas and dataframes to working with openpyxl (Just because I am not that familiar with pandas). ws = excel_input.active for r in dataframe_to_rows(df, index=True, header=True):     ws.append(r) else:     pass  col1 = [] #Creating open lists to put the data into columns col2 = [] col4 = [] col5 = [] col7 = [] col8 = []  mainlist = [] try:     for row in numbers: #Putting the data into the columns lists         col1.append(ws.cell(row=row, column=1).value) #This is wrong, and is throwing the error         col2.append(ws.cell(row=row, column=2).value) #Wrong         col4.append(ws.cell(row=row, column=4).value) #Wrong         col5.append(ws.cell(row=row, column=5).value) #Wrong         col7.append(ws.cell(row=row, column=7).value) #Wrong         col8.append(ws.cell(row=row, column=8).value) #Wrong except AttributeError:     logging.error('Something is wrong') finally:     columns = zip(col1, col2, col4, col5, col7, col8) #Zipping the lists     for column in columns:         mainlist.append(column) 

Traceback: col1.append(ws.cell(row=row, column=1).value) File "C:\Python3\lib\site-packages\openpyxl\worksheet\worksheet.py", line 306, in cell if row < 1 or column < 1: TypeError: unorderable types: list() < int()

Does anyone know how to do this without getting the error? I have commented in the code where the mistake is.

EDIT:

By using @stovfl's method I was able to put some of the data into the xlsx file, but as my list is a list of list, only the last list was added to my xlsx file. I used this code:

from openpyxl import load_workbook  report = load_workbook(r"C:\Myworkbook.xlsx") ws = report.create_sheet('My sheet') for _list in numbers:     for row_data in _list:         ws.append([row_data])         print(row_data) report.save(r"C\Myworkbook.xlsx") 

The print:

A B 10 2 A B 10 3 1 AC 

The output in the xlsx file:

The problem is only the last list is added, not the entire list of lists and I wanted the output to look like this:

回答1:

Question: ... only the last list was added to my xlsx file

The First for loop is your for file in files:.
I updated my Code below to this.

Question: how to do this without getting the error?

  1. Solution writing your List of Lists using openpyxl, for instance

    newWorkbook = False newWorksheet = False  if newWorkbook:     from openpyxl import Workbook     wb = Workbook()     # Select First Worksheet     ws = wb.worksheets[0] else:     from openpyxl import load_workbook     wb = load_workbook("mySortData.xlsx")     if newWorksheet:         # Create a New Worksheet in this Workbook         ws = wb.create_chartsheet('Sheet 2')     else:         # Select a Worksheet by Name in this Workbook         ws = wb['Sheet']  for file in files:  # Getting the data from xlsx files and to the numbers-list     df = pd.read_excel(file)     m = (df.iloc[:, 4] - df.iloc[:, 1]) != 0     pos = [0, 1, 3, 4, 6, 7]     numbers = (df.loc[m, df.columns[pos]].values.tolist())      # numbers is a List[Row Data] of List[Columns]     # Iterate List of Row Data     for row_data in numbers:         ws.append(row_data)  wb.save("mySortData.xlsx") 

    Output:

  2. Solution writing your List of Lists direct to CSV, for instance:

    # Data == List of List data = [[['A B 10', 2, 'A B 10', 3, 1, 'AC'], ['A B 104', 3, 'A B 104', 2, -1, 'AC']], [['D B 126', 3, 'D B 126', 2, -1, 'EFG 1'], ['D B 15', 3, 'D B 15', 2, -1, 'EFG 1']], [], [], [['D B 544', 2, 'D B 544', 1, -1, 'EFG 11'], ['D B 152', 3, 'D B 152', 2, -1, 'EFG 11'], ['D B 682', 3, 'D B 682', 2, -1, 'EFG 11']], ]  import csv # Write to File with open('Output.csv', 'w') as csv_file:     writer = csv.writer(csv_file)     for _list in data:         for row_data in _list:             writer.writerow(row_data) 

    Qutput:

    A B 10,2,A B 10,3,1,AC A B 104,3,A B 104,2,-1,AC D B 126,3,D B 126,2,-1,EFG 1 D B 15,3,D B 15,2,-1,EFG 1 D B 544,2,D B 544,1,-1,EFG 11 D B 152,3,D B 152,2,-1,EFG 11 D B 682,3,D B 682,2,-1,EFG 11   

Tested with Python: 3.4.2 - openpyxl: 2.4.1



回答2:

I don't know if I understood well but if you have a list such as :

l = [rows, rows, rows, ...] 

In order to create your dataframe, you could just iterate over each element of your list l such as :

df = pd.DataFrame() for rows in l:     for row in rows:         df = pd.concat([df, row]) 

In your case, it gives me the following output :

    0       1   2       3   4   5 0   A B 10  2   A B 10  3   1   AC 1   A B 104 3   A B 104 2   -1  AC 0   D B 126 3   D B 126 2   -1  EFG 1 1   D B 15  3   D B 15  2   -1  EFG 1 0   D B 544 2   D B 544 1   -1  EFG 11 1   D B 152 3   D B 152 2   -1  EFG 11 2   D B 682 3   D B 682 2   -1  EFG 11 


回答3:

Looks like you can get the data into your Excel file fine. So why don't use you use something like:

col1 = [cell.value for cell in ws['A']]

or

ws.iter_cols(min_col=x, max_col=y)

If this isn't what you want then please rephrase the question to point out where the problem is.



标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!