dataframe

Merge 2 columns into 1 column

孤街醉人 submitted on 2021-02-10 07:33:35
Question: I would like to merge 2 columns into 1 column and remove NaN. I have this data:

    Name     A     B
    Pikachu  2007  nan
    Pikachu  nan   2008
    Raichu   2007  nan
    Mew      nan   2018

Expected result:

    Name     Year
    Pikachu  2007
    Pikachu  2008
    Raichu   2007
    Mew      2008

Code I tried:

    df['Year'] = df[['A', 'B']].astype(str).apply(''.join, 1)

But my result is this:

    Name     Year
    Pikachu  2007nan
    Pikachu  nan2008
    Raichu   2007nan
    Mew      nan2008

Answer 1: Use Series.fillna with DataFrame.pop to extract the columns, and finally convert to integers: df['Year']= df.pop('A
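
The answer excerpt above is cut off, so the following is only a minimal sketch of how the described combination of Series.fillna and DataFrame.pop might look, not necessarily the answer's exact code. It assumes every row has a value in exactly one of A or B; the sample frame is rebuilt from the question's data.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew"],
    "A": [2007, np.nan, 2007, np.nan],
    "B": [np.nan, 2008, np.nan, 2018],
})

# pop() removes a column from df and returns it as a Series.
# fillna() keeps A where it is present and falls back to B on the same row.
# astype(int) assumes no row is missing both values; it raises otherwise.
df["Year"] = df.pop("A").fillna(df.pop("B")).astype(int)

print(df)
```

Because pop() drops A and B as a side effect, the frame ends up with only the Name and Year columns.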

Iterate each row by updating values from 1st dataframe to 2nd dataframe based on unique value w/ different index, otherwise append and assign new ID

六月ゝ 毕业季﹏ submitted on 2021-02-10 07:33:17
Question: I am trying to update each row in df2 from df1 when the unique value matches. If there is no match, append the row to df2 and assign a new value in the ID column.

df1 (no ID column):

       unique_value  Status  Price
    0  xyz123        bad     6.67
    1  eff987        bad     1.75
    2  efg125        okay    5.77

df2:

       unique_value  Status     Price  ID
    0  xyz123        good       1.25   1000
    1  xyz123        good       1.25   1000
    2  xyz123        good       1.25   1000
    3  xyz123        good       1.25   1000
    4  xyz985        bad        1.31   1001
    5  abc987        okay       4.56   1002
    6  eff987        good       9.85   1003
    7  asd541        excellent  8.85   1004

Desired output for updated df2:
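
The desired output is cut off in this excerpt, so the sketch below rests on assumptions: matching rows in df2 take Status and Price from df1 and keep their ID, and unmatched df1 rows are appended with IDs continuing from the current maximum. The names lookup, matched, new_rows, and result are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({
    "unique_value": ["xyz123", "eff987", "efg125"],
    "Status": ["bad", "bad", "okay"],
    "Price": [6.67, 1.75, 5.77],
})
df2 = pd.DataFrame({
    "unique_value": ["xyz123"] * 4 + ["xyz985", "abc987", "eff987", "asd541"],
    "Status": ["good"] * 4 + ["bad", "okay", "good", "excellent"],
    "Price": [1.25] * 4 + [1.31, 4.56, 9.85, 8.85],
    "ID": [1000] * 4 + [1001, 1002, 1003, 1004],
})

# Look up df1 rows by unique_value (assumes unique_value is unique in df1).
lookup = df1.set_index("unique_value")
matched = df2["unique_value"].isin(lookup.index)

# Overwrite Status and Price on every df2 row whose unique_value is in df1.
for col in ["Status", "Price"]:
    df2.loc[matched, col] = df2.loc[matched, "unique_value"].map(lookup[col])

# Rows of df1 that are absent from df2 get fresh IDs and are appended.
new_rows = df1[~df1["unique_value"].isin(df2["unique_value"])].copy()
next_id = int(df2["ID"].max()) + 1
new_rows["ID"] = range(next_id, next_id + len(new_rows))

result = pd.concat([df2, new_rows], ignore_index=True)
print(result)
```

This avoids an explicit row-by-row loop; whether new IDs should continue from the maximum is a guess that the truncated question does not confirm.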

how to plot a dataframe grouped by two columns in matplotlib and pandas

梦想的初衷 submitted on 2021-02-10 06:52:30
Question: I have the following dataframe:

                         total_gross_profit
    first_day_week  var
    Feb-06          1              45293.09
                    2              61949.54
    Feb-13          1              44634.72
                    2              34584.15
    Feb-20          1              43796.89
                    2              37308.57
    Feb-27          1              44136.21
                    2              38237.67
    Jan-16          1              74695.91
                    2              75702.02
    Jan-23          1              86101.05
                    2              69518.39
    Jan-30          1              65913.56
                    2              74823.94
    Mar-06          1              34256.47
                    2              31953.00

It is grouped by the first_day_week and var columns. I need to make a bar plot with first_day_week on the x axis and, for each entry in first_day_week, two bars, one for each value in var, in different
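
No answer is included in this excerpt. One common approach, sketched here under the assumption that first_day_week and var form a MultiIndex as the printout suggests, is to unstack var into columns so that pandas draws one bar per var value for each week. Only the first rows of the question's data are used, and the Agg backend is chosen just so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import pandas as pd

# A subset of the question's data, indexed by (first_day_week, var).
weeks = ["Feb-06", "Feb-13", "Feb-20", "Feb-27"]
index = pd.MultiIndex.from_product([weeks, [1, 2]], names=["first_day_week", "var"])
df = pd.DataFrame(
    {"total_gross_profit": [45293.09, 61949.54, 44634.72, 34584.15,
                            43796.89, 37308.57, 44136.21, 38237.67]},
    index=index,
)

# Move the var level into columns: one row per week, one column per var value.
wide = df["total_gross_profit"].unstack("var")

# Each column becomes its own bar series, drawn side by side per week.
ax = wide.plot(kind="bar", rot=0)
ax.set_xlabel("first_day_week")
ax.set_ylabel("total_gross_profit")
```

Note that the week labels are strings, so they sort alphabetically (Feb before Jan) rather than chronologically unless they are converted to dates first.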

How to subtract the mean of a month from each day in that month?

為{幸葍}努か submitted on 2021-02-10 06:46:28
Question: I have a time series of 55 years at a daily scale. I have found the monthly mean of each month for each year. Now I want to subtract this monthly mean from the corresponding days of that month and year. My pandas data frame looks like this:

                     0    1       2       3  ...       5       6         7         8
    Date                                     ...
    1951-01-01  28.361  0.0  131.24  405.39  ...  405.39  38.284  0.187010  -1.23550
    1951-01-02  27.874  0.0  113.74  409.56  ...  409.56  49.834  0.066903  -1.44770
    ...            ...  ...     ...     ...  ...     ...     ...       ...       ...
    2005-12-16  27.921  0.0  104.99  429.78  ...  429.78
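
No answer is included in this excerpt. A minimal sketch of one way to do this, assuming the frame has a DatetimeIndex as shown, is to group by year and month and use transform so the monthly mean is broadcast back to every day. The data below is synthetic and the name anomalies is illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic daily data standing in for the question's frame.
idx = pd.date_range("1951-01-01", "1951-03-31", freq="D", name="Date")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(len(idx), 3)), index=idx)

# transform("mean") returns a frame shaped like df in which every day holds
# the mean of its own (year, month) group.
monthly_mean = df.groupby([df.index.year, df.index.month]).transform("mean")

# Daily values minus the mean of the month and year they belong to.
anomalies = df - monthly_mean
```

Grouping by both year and month keeps January 1951 separate from January 1952, which matches the "that month and year" requirement in the question.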

Cosine Similarity rows in a dataframe of pandas

て烟熏妆下的殇ゞ submitted on 2021-02-10 06:45:09
Question: I have a CSV file with the content below, and I want to calculate the cosine similarity between one ID and the remaining IDs in the CSV file. I have loaded it into a pandas dataframe as follows:

    old_df['Vector'] = old_df.apply(lambda row: np.array(np.matrix(row.Vector)).ravel(), axis=1)
    l = []
    for a in old_df['Vector']:
        l.append(a)
    A = np.array(l)
    similarities = cosine_similarity(A)

The output looks fine. However, I do not know how to find which GUID (or ID) is similar to which other GUID (or ID), and I
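
No answer is included in this excerpt. One way to map similarities back to IDs, sketched here with made-up vectors and a hypothetical GUID column name, is to label the similarity matrix with the IDs and take the best off-diagonal match per row. To stay self-contained, the cosine similarity is computed directly with NumPy (normalized dot products) instead of the cosine_similarity call used in the question.

```python
import numpy as np
import pandas as pd

# Made-up data; "GUID" is an assumed column name.
old_df = pd.DataFrame({
    "GUID": ["a", "b", "c"],
    "Vector": [np.array([1.0, 0.0]), np.array([0.9, 0.1]), np.array([0.0, 1.0])],
})

# Stack the per-row vectors into one 2-D array, normalize each row to unit
# length, and take dot products: entry (i, j) is the cosine similarity.
A = np.vstack(old_df["Vector"].tolist())
unit = A / np.linalg.norm(A, axis=1, keepdims=True)
similarities = unit @ unit.T

# Blank out self-similarity so a row cannot pick itself as its best match.
np.fill_diagonal(similarities, np.nan)

# Label rows and columns with the IDs, then pick the most similar other ID.
sim_df = pd.DataFrame(similarities, index=old_df["GUID"], columns=old_df["GUID"])
most_similar = sim_df.idxmax(axis=1)
print(most_similar)
```

The labeled sim_df can also be queried directly, for example sim_df.loc["a"].sort_values(ascending=False) to rank every other ID against "a".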

How to left align a dataframe column in python?

纵饮孤独 submitted on 2021-02-10 06:31:41
Question: I have to left-align a description column in a pandas dataframe in Python, similar to left- or right-aligning a cell in an Excel sheet. Is there any solution for this? Image attached for reference.

Answer 1: Try this:

    df.style.set_properties(subset=["col1", "col2"], **{'text-align': 'right'})

Answer 2: I think you can just remove the leading spaces.

    df.Description = df.Description.apply(lambda row: row.lstrip(' '))

Source: https://stackoverflow.com/questions/53460941/how-to-left-align-a-dataframe-column

Element-wise mean of a list of pandas DataFrames

感情迁移 submitted on 2021-02-10 06:29:06
Question: Is there a canonical way to compute the element-wise mean of a list of DataFrames with identical columns and indices? The best way I can think of is

    from functools import reduce
    dfs = [df1, df2, df3, df4, df5]
    reduce(lambda x, y: x.add(y), dfs) / len(dfs)

Answer 1: Use concat with mean per index values:

    df1 = pd.DataFrame({
        'C': [7, 8, 9],
        'D': [1, 3, 5],
    })
    df2 = pd.DataFrame({
        'C': [4, 2, 3],
        'D': [7, 1, 0],
    })
    df3 = pd.DataFrame({
        'C': [9, 4, 2],
        'D': [1, 7, 1],
    })
    from functools import reduce
    dfs = [df1, df2,
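
The answer excerpt is cut off before the concat step appears. The following is a minimal sketch of what "concat with mean per index values" could look like, using the three sample frames from the answer; it is an interpretation of that sentence, not the answer's verbatim code.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({"C": [7, 8, 9], "D": [1, 3, 5]})
df2 = pd.DataFrame({"C": [4, 2, 3], "D": [7, 1, 0]})
df3 = pd.DataFrame({"C": [9, 4, 2], "D": [1, 7, 1]})
dfs = [df1, df2, df3]

# Stack the frames vertically; rows from different frames share index labels,
# so grouping by the index (level 0) and averaging gives the element-wise mean.
mean_df = pd.concat(dfs).groupby(level=0).mean()

# The reduce-based version from the question, for comparison.
reduced = reduce(lambda x, y: x.add(y), dfs) / len(dfs)
print(mean_df)
```

One difference between the two: mean() skips NaN values by default, whereas x.add(y) propagates them, so the results can diverge when the frames contain missing data.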

Pandas Time Series DataFrame Missing Values

半腔热情 submitted on 2021-02-10 06:09:06
Question: I have a dataset of Total Sales from 2008-2015. I have an entry for each day, so I have created a pandas DataFrame with a DatetimeIndex and a column for sales. So it looks like this: [image]

The problem is that I am missing data for most of 2010. These missing values are currently represented by 0.0, so if I plot the DataFrame I get: [image]

I want to try to forecast values for 2016, possibly using an ARIMA model, so the first step I took was to perform a decomposition of this time series: [image]

Obviously if I
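
The question is cut off and no answer is included. As a hedged sketch of a possible first step, the 0.0 placeholders can be converted to NaN so that plots and later modelling do not treat them as real sales. The column name sales and the dates are assumptions, and the time-based interpolation at the end is only one possible fill strategy, not a recommendation from the source.

```python
import numpy as np
import pandas as pd

# Small synthetic series with a block of 0.0 placeholders.
idx = pd.date_range("2009-12-25", "2010-01-10", freq="D")
sales = pd.DataFrame({"sales": 100.0}, index=idx)
sales.loc["2010-01-01":"2010-01-05", "sales"] = 0.0

# Mark the placeholder zeros as missing. This assumes a true zero-sales day
# never occurs; otherwise restrict the replacement to the known gap.
cleaned = sales["sales"].replace(0.0, np.nan)

# One possible fill: interpolate linearly along the time axis.
filled = cleaned.interpolate(method="time")
```

For a gap spanning most of a year, interpolation will flatten any seasonality inside the gap, so the choice of fill method deserves care before fitting a seasonal model.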