I'm trying to concatenate all Excel files, and the worksheets in them, into one using the script below. It kinda works, but the Excel file c.xlsx is overwritten for each file, so only
I got it working with the script below, which builds on @ryguy72's answer but works on all worksheets as well as the header row.
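import glob
import pandas as pd

frames = []
for f in glob.glob("my_path/*.xlsx"):
    # sheet_name=None returns a dict of {sheet name: DataFrame};
    # ignore_index is not a read_excel argument, so it is passed to concat instead
    sheets = pd.read_excel(f, sheet_name=None)
    frames.append(pd.concat(list(sheets.values()), ignore_index=True))
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
all_data = pd.concat(frames, ignore_index=True)
print(all_data)
print(all_data.shape)
all_data.to_excel("my_path/final.xlsx", sheet_name='Sheet1')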
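To see why two rounds of concat are needed, here is a minimal in-memory sketch. The two dicts are hypothetical stand-ins for what `pd.read_excel(f, sheet_name=None)` returns (sheet name mapped to DataFrame); the sheet names, column names, and values are made up for illustration, and no Excel files are read.

```python
import pandas as pd

# Hypothetical stand-ins for pd.read_excel(f, sheet_name=None):
# each workbook is a dict mapping sheet name -> DataFrame.
workbook_a = {
    "Sheet1": pd.DataFrame({"id": [1, 2], "value": [10, 20]}),
    "Sheet2": pd.DataFrame({"id": [3], "value": [30]}),
}
workbook_b = {
    "Sheet1": pd.DataFrame({"id": [4, 5], "value": [40, 50]}),
}

frames = []
for sheets in (workbook_a, workbook_b):
    # First concat: stack all sheets of one workbook.
    frames.append(pd.concat(list(sheets.values()), ignore_index=True))

# Second concat: stack the per-workbook frames into one table.
all_data = pd.concat(frames, ignore_index=True)
print(all_data)
```

Collecting the frames in a list and concatenating once at the end also avoids growing a DataFrame inside the loop.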
I just tested the code below. It merges data from all Excel files in a folder into a single Excel file.
import glob
import pandas as pd

frames = []
for f in glob.glob("C:\\your_path\\*.xlsx"):
    # with default arguments, read_excel reads only the first sheet of each file
    frames.append(pd.read_excel(f))
# DataFrame.append was removed in pandas 2.0; pd.concat replaces it
all_data = pd.concat(frames, ignore_index=True)
print(all_data)
print(all_data.shape)
all_data.to_excel("C:\\your_path\\final.xlsx", sheet_name='Sheet1')
The idea is to create a list of DataFrames in a list comprehension. Because read_excel with sheet_name=None returns an ordered dict of DataFrames for each file, you need to concat the sheets of each file first and then concat again for one big final DataFrame:
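import glob
import pandas as pd

# one collection of per-sheet DataFrames for each file
# (ignore_index is not a read_excel argument, so it is only passed to concat)
cdf = [pd.read_excel(excel_names, sheet_name=None).values()
       for excel_names in glob.glob('files/*.xlsx')]
df = pd.concat([pd.concat(list(x)) for x in cdf], ignore_index=True)
#print (df)
df.to_excel("c.xlsx", index=False)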
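If you also want to know which sheet each row came from, one possible variation (not part of the answer above) is to pass the dict itself to `pd.concat`, which uses the dict keys as an outer index level. The sketch below works on an in-memory dict standing in for the result of `read_excel(..., sheet_name=None)`; the sheet names `Q1`/`Q2` and the `amount` column are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel(path, sheet_name=None)
sheets = {
    "Q1": pd.DataFrame({"amount": [1, 2]}),
    "Q2": pd.DataFrame({"amount": [3]}),
}

combined = (
    # dict keys become the outer index level, named "sheet"
    pd.concat(sheets, names=["sheet", "row"])
    # move the sheet name out of the index into a regular column
    .reset_index(level="sheet")
    # discard the per-sheet row numbers
    .reset_index(drop=True)
)
print(combined)
```

The same pattern could be applied one level up, keyed by file name, if the source workbook matters too.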