pandas

How do I use my first row in my spreadsheet for my Dataframe column names instead of 0 1 2…etc?

自古美人都是妖i, submitted on 2021-02-10 12:09:48
Question: I want my dataframe to use the names in the first row as its column names instead of numbering them from 0. How do I do this? I tried using the pandas and openpyxl modules to turn my Excel spreadsheet into a dataframe.
    import pandas as pd
    from openpyxl import load_workbook
    from openpyxl.utils.dataframe import dataframe_to_rows
    wb = load_workbook(filename='Budget1.xlsx')
    print(wb.sheetnames)
    sheet_ranges = wb['May 2019']
    print(sheet_ranges['A3'].value)
    ws = wb['May 2019']
    df=pd.DataFrame(ws.values
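
One way to approach this, as a minimal sketch: pull the first row off the row iterator and pass it as `columns`. The `rows` list below is a hypothetical stand-in for `ws.values` (which yields one tuple per worksheet row), and the item names and amounts are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for ws.values: an iterator of row tuples
# whose first row holds the header names.
rows = iter([
    ("Item", "Amount"),
    ("Rent", 1000),
    ("Food", 250),
])

header = list(next(rows))   # consume the first row as column names
data = list(rows)           # the remaining rows are the data
df = pd.DataFrame(data, columns=header)
print(df)
```

If the header really is the first row of the sheet, `pd.read_excel('Budget1.xlsx', sheet_name='May 2019', header=0)` may be a shorter route; that is an assumption about the file layout, not something the question confirms.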

Renaming selected columns in pandas [duplicate]

不问归期, submitted on 2021-02-10 12:07:49
Question: This question already has answers here: Changing multiple column names but not all of them - Pandas Python (4 answers). Closed 1 year ago. I am trying to rename selected columns (say, the last two columns) in my data frame using iloc and df.columns, but it does not seem to work for me and I can't figure out why. Here is a toy example of what I want to achieve:
    import pandas as pd
    d = {'one': list(range(5)), 'two': list(range(5)), 'three': list(range(5)), 'four': list(range(5)),
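
A likely cause is that `df.iloc[:, -2:]` returns a new object, so assigning to its `.columns` never touches the original frame. A minimal sketch of one alternative, building a rename mapping from the last two labels; the frame and the new names are hypothetical, since the toy example above is cut off.

```python
import pandas as pd

# Hypothetical frame modelled on the truncated toy example.
df = pd.DataFrame({name: list(range(5)) for name in ["one", "two", "three", "four"]})

# Hypothetical replacement names for the last two columns.
new_names = ["three_new", "four_new"]

# Map old label -> new label for the selected columns only.
mapping = dict(zip(df.columns[-2:], new_names))
df = df.rename(columns=mapping)
print(df.columns.tolist())
```

Columns that are not in the mapping keep their names, which is what makes `rename` convenient for partial renames.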

How to reshape dataframe with wide_to_long or pivot?

北城余情, submitted on 2021-02-10 11:55:33
Question: This should be fairly simple, but I have not been able to wrap my brain around it. I am trying to convert df1 to df2, where df1 and df2 are pandas dataframes:
    df1 = pd.DataFrame({'site': ['1', '2'],
                        'sat_open': ['0900', '0900'], 'sat_close': ['1900', '1900'],
                        'sun_open': ['1000', '1000'], 'sun_close': ['1800', '1800'],
                        'mon_open': ['0900', '0900'], 'mon_close': ['2100', '2100']})
    df2 = pd.DataFrame({'store': ['1', '1', '1', '2', '2','2'], 'day': ['sat', 'sun', 'mon','sat', 'sun', 'mon'], 'open':
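
A minimal sketch of one possible reshape, using `melt` followed by `unstack` rather than `wide_to_long`. The target `df2` is truncated above, so the `close` column and the final column order are assumptions, and the intermediate names `key`, `kind`, and `time` are invented for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "site": ["1", "2"],
    "sat_open": ["0900", "0900"], "sat_close": ["1900", "1900"],
    "sun_open": ["1000", "1000"], "sun_close": ["1800", "1800"],
    "mon_open": ["0900", "0900"], "mon_close": ["2100", "2100"],
})

# Go fully long first: one row per (site, original column name).
long = df1.melt(id_vars="site", var_name="key", value_name="time")

# Split "sat_open" into day="sat" and kind="open".
long[["day", "kind"]] = long["key"].str.split("_", expand=True)

# Spread open/close back out into columns, one row per (site, day).
wide = (
    long.set_index(["site", "day", "kind"])["time"]
    .unstack("kind")
    .reset_index()
    .rename_axis(columns=None)
    .rename(columns={"site": "store"})
)
df2 = wide[["store", "day", "open", "close"]]
print(df2)
```

Note that `unstack` sorts the index, so the days come out in alphabetical order rather than the sat, sun, mon order shown in the question.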

parsing text file into pandas dataframe

坚强是说给别人听的谎言, submitted on 2021-02-10 11:51:58
Question: I have a text file with continuous data. The following text file contains 2 lines. Example: 123@#{} 456@$% 1 23 Also, I have column lengths given as 2,3,4 for the 3 columns that I need in my data frame. I want to parse the file into a pandas data frame such that the first column gets the first 2 characters, the second column gets the next 3 characters, and so on, as per the column lengths given (2,3,4). The next set of characters should form the next row, and so on. So my pandas data frame should look like
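
A minimal sketch of the slicing idea, assuming the file is treated as one continuous character stream cut into rows of 2 + 3 + 4 = 9 characters. The expected output is truncated above, so this interpretation is a guess; the sample string and the column names `c1`, `c2`, `c3` are hypothetical.

```python
import pandas as pd

widths = [2, 3, 4]

# Hypothetical continuous text (newlines already stripped out).
text = "abcdefghiABCDEFGHI"

row_len = sum(widths)
rows = []
for start in range(0, len(text), row_len):
    chunk = text[start:start + row_len]
    fields, pos = [], 0
    for w in widths:
        fields.append(chunk[pos:pos + w])  # slice one field of width w
        pos += w
    rows.append(fields)

df = pd.DataFrame(rows, columns=["c1", "c2", "c3"])
print(df)
```

If each record instead sits on its own line, `pd.read_fwf(path, widths=[2, 3, 4], header=None)` covers that case directly.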

function returning pandas dataframe

非 Y 不嫁゛, submitted on 2021-02-10 11:49:42
Question: I was not clear about my issue, so I am revising the question. I have a function manipulating a generic dataframe (it removes and renames columns and records):
    def manipulate_df(df_local):
        df_local.rename(columns={'A': 'grouping_column'}, inplace=True)
        df_local.drop('B', axis=1, inplace=True)
        df_local.drop(df.query('grouping_column not in (\'1\', \'0\')').index, inplace=True)
        df_local = df_local.groupby(['grouping_column'])['C'].sum().to_frame().reset_index().copy()
        print("this is what I
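
Two things in the snippet above are worth noting: the `inplace=True` calls mutate the caller's frame, while the final `df_local = ...` only rebinds the local name, so the grouped result never reaches the caller unless it is returned. The body also refers to a global `df` inside `df.query(...)`. A minimal sketch of a version that avoids both, assuming columns `A`, `B`, and `C` as in the question; the sample data is hypothetical.

```python
import pandas as pd

def manipulate_df(df_local):
    """Return a new, aggregated frame and leave the input untouched."""
    out = df_local.rename(columns={"A": "grouping_column"}).drop(columns="B")
    # Keep only the rows whose group is '1' or '0'.
    out = out[out["grouping_column"].isin(["1", "0"])]
    return out.groupby("grouping_column", as_index=False)["C"].sum()

# Hypothetical input data.
df = pd.DataFrame({
    "A": ["1", "0", "2", "1"],
    "B": ["x", "y", "z", "w"],
    "C": [1, 2, 3, 4],
})

result = manipulate_df(df)
print(result)
```

The caller then decides whether to overwrite the original with `df = manipulate_df(df)` or keep both frames.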

How to change column values or create values in a new column based on values in existing column?

你说的曾经没有我的故事, submitted on 2021-02-10 10:56:30
Question: I am new to ML and Data Science (I recently graduated with a Master's in Business Analytics) and am learning as much as I can by myself while looking for positions in Data Science / Business Analytics. I am working on a personal project to build ML algorithms to predict if a customer will show up to their existing appointment. Upon initial data analysis, I notice that my "No-show" column contains values "Yes" and "No" (Metadata: if a customer scheduled an appointment and showed up for an
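
A minimal sketch of deriving a new column from an existing one with `Series.map`, assuming the goal is a numeric flag. The question is cut off before it states the desired encoding, so the 1/0 mapping and the `no_show_flag` name are hypothetical, as is the sample data.

```python
import pandas as pd

# Hypothetical sample of the "No-show" column.
df = pd.DataFrame({"No-show": ["No", "Yes", "No"]})

# Translate each label through a dict; unmapped labels would become NaN.
df["no_show_flag"] = df["No-show"].map({"Yes": 1, "No": 0})
print(df)
```

Writing to a new column keeps the original labels available for checking the mapping.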

How to set frequency with pd.to_datetime()?

对着背影说爱祢, submitted on 2021-02-10 10:55:12
Question: When fitting a statsmodels model, I'm receiving a warning about the date frequency. First, I import a dataset:
    import statsmodels as sm
    df = sm.datasets.get_rdataset(package='datasets', dataname='airquality').data
    df['Year'] = 1973
    df['Date'] = pd.to_datetime(df[['Year', 'Month', 'Day']])
    df.drop(columns=['Year', 'Month', 'Day'], inplace=True)
    df.set_index('Date', inplace=True, drop=True)
Next I try to fit a SES model:
    fit = sm.tsa.api.SimpleExpSmoothing(df['Wind']).fit()
Which returns this warning:
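
`pd.to_datetime` itself takes no frequency argument; the frequency lives on the resulting `DatetimeIndex`. A minimal sketch of attaching one afterwards with `asfreq`, assuming the data is daily. The three sample values are hypothetical, and whether this silences the warning depends on the text that is truncated above.

```python
import pandas as pd

# Hypothetical daily observations; an index built from parsed strings
# carries no frequency.
dates = pd.to_datetime(["1973-05-01", "1973-05-02", "1973-05-03"])
s = pd.Series([7.4, 8.0, 12.6], index=dates)
print(s.index.freq)

# asfreq conforms the series to a daily grid and records the frequency.
s_daily = s.asfreq("D")
print(s_daily.index.freq)
```

`asfreq` inserts NaN for any missing days, so gaps in the data become visible rather than being hidden.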