dataframe | 易学教程

Merge 2 columns into 1 column

Question: I would like to merge two columns into one column and remove the NaN values. I have this data:

    Name     A     B
    Pikachu  2007  nan
    Pikachu  nan   2008
    Raichu   2007  nan
    Mew      nan   2018

Expected result:

    Name     Year
    Pikachu  2007
    Pikachu  2008
    Raichu   2007
    Mew      2008

Code I tried:

    df['Year'] = df[['A', 'B']].astype(str).apply(''.join, axis=1)

But my result is this:

    Name     Year
    Pikachu  2007nan
    Pikachu  nan2008
    Raichu   2007nan
    Mew      nan2008

Answer 1: Use Series.fillna with DataFrame.pop to extract the columns, then convert to integers: df['Year']= df.pop('A
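The answer excerpt is cut off mid-expression. As a minimal sketch of how Series.fillna and DataFrame.pop can be combined, assuming the sample frame from the question (with Mew's B value taken as 2018, as in the input table); this is an illustration and not necessarily the answerer's exact code:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; NaN marks the empty cells.
df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew"],
    "A": [2007, np.nan, 2007, np.nan],
    "B": [np.nan, 2008, np.nan, 2018],
})

# pop() removes each column from df and returns it as a Series;
# fillna() fills the gaps in A with the aligned values from B.
df["Year"] = df.pop("A").fillna(df.pop("B")).astype(int)

print(df)
```

The final astype(int) only works if every row has a value in at least one of the two columns; otherwise the remaining NaN would have to be handled first.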

Iterate each row by updating values from 1st dataframe to 2nd dataframe based on unique value w/ different index, otherwise append and assign new ID

Question: I am trying to update each row of df2 from df1 when a unique value matches. If there is no match, append the row to df2 and assign it a new ID.

df1 (no ID column):

       unique_value  Status  Price
    0  xyz123        bad     6.67
    1  eff987        bad     1.75
    2  efg125        okay    5.77

df2:

       unique_value  Status     Price  ID
    0  xyz123        good       1.25   1000
    1  xyz123        good       1.25   1000
    2  xyz123        good       1.25   1000
    3  xyz123        good       1.25   1000
    4  xyz985        bad        1.31   1001
    5  abc987        okay       4.56   1002
    6  eff987        good       9.85   1003
    7  asd541        excellent  8.85   1004

Desired output for updated df2:
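The desired output is not included in the excerpt, so the following is only a sketch under stated assumptions: matching rows in df2 get Status and Price overwritten from df1, unmatched df1 rows are appended, and new IDs continue from the current maximum ID. It also assumes unique_value is unique within df1.

```python
import pandas as pd

df1 = pd.DataFrame({
    "unique_value": ["xyz123", "eff987", "efg125"],
    "Status": ["bad", "bad", "okay"],
    "Price": [6.67, 1.75, 5.77],
})
df2 = pd.DataFrame({
    "unique_value": ["xyz123"] * 4 + ["xyz985", "abc987", "eff987", "asd541"],
    "Status": ["good"] * 4 + ["bad", "okay", "good", "excellent"],
    "Price": [1.25] * 4 + [1.31, 4.56, 9.85, 8.85],
    "ID": [1000] * 4 + [1001, 1002, 1003, 1004],
})

# Lookup table keyed by unique_value (assumed unique in df1).
lookup = df1.set_index("unique_value")
matched = df2["unique_value"].isin(lookup.index)

# Overwrite Status and Price on every df2 row whose key exists in df1.
updated = df2.copy()
for col in ["Status", "Price"]:
    updated.loc[matched, col] = updated.loc[matched, "unique_value"].map(lookup[col])

# Append df1 rows that have no match, numbering IDs from max(ID) + 1 (assumption).
new_rows = df1[~df1["unique_value"].isin(df2["unique_value"])].copy()
next_id = int(df2["ID"].max()) + 1
new_rows["ID"] = list(range(next_id, next_id + len(new_rows)))
updated = pd.concat([updated, new_rows], ignore_index=True)

print(updated)
```

This avoids an explicit row-by-row loop; the only loop is over the two columns being updated.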

how to plot a dataframe grouped by two columns in matplotlib and pandas

Question: I have the following dataframe:

                        total_gross_profit
    first_day_week var
    Feb-06         1              45293.09
                   2              61949.54
    Feb-13         1              44634.72
                   2              34584.15
    Feb-20         1              43796.89
                   2              37308.57
    Feb-27         1              44136.21
                   2              38237.67
    Jan-16         1              74695.91
                   2              75702.02
    Jan-23         1              86101.05
                   2              69518.39
    Jan-30         1              65913.56
                   2              74823.94
    Mar-06         1              34256.47
                   2              31953.00

It is grouped by the first_day_week and var columns. I need to make a bar plot with first_day_week on the x axis and, for each entry in first_day_week, two bars, one for each value in var, in different
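No answer is included in the excerpt. One common approach, sketched here as an assumption rather than the accepted solution, is to unstack the var level into columns so that pandas draws one bar per var value for each week:

```python
import pandas as pd

weeks = ["Feb-06", "Feb-13", "Feb-20", "Feb-27",
         "Jan-16", "Jan-23", "Jan-30", "Mar-06"]
values = [45293.09, 61949.54, 44634.72, 34584.15,
          43796.89, 37308.57, 44136.21, 38237.67,
          74695.91, 75702.02, 86101.05, 69518.39,
          65913.56, 74823.94, 34256.47, 31953.00]

# Rebuild the two-level index shown in the question.
index = pd.MultiIndex.from_product([weeks, [1, 2]],
                                   names=["first_day_week", "var"])
df = pd.DataFrame({"total_gross_profit": values}, index=index)

# Move 'var' from the index into columns: one row per week, one column per var.
wide = df["total_gross_profit"].unstack("var")

# Each column becomes its own bar series; requires matplotlib to be installed.
ax = wide.plot(kind="bar")
ax.set_ylabel("total_gross_profit")
```

Calling matplotlib.pyplot.show() afterwards displays the figure. The weeks appear in the index's sorted (alphabetical) order, so chronological ordering would need a real date index.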

How to subtract the mean of a month from each day in that month?

Question: I have a time series of 55 years at a daily scale. I have found the monthly mean of each month for each year. Now I want to subtract this monthly mean from the corresponding days of that month and year. My pandas data frame looks like this:

                0       1    2       3       ...  5       6       7         8
    Date                                     ...
    1951-01-01  28.361  0.0  131.24  405.39  ...  405.39  38.284  0.187010  -1.23550
    1951-01-02  27.874  0.0  113.74  409.56  ...  409.56  49.834  0.066903  -1.44770
    ...         ...     ...  ...     ...     ...  ...     ...     ...
    2005-12-16  27.921  0.0  104.99  429.78  ...  429.78
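No answer is included in the excerpt. A minimal sketch of one way to do this, assuming a DataFrame with a DatetimeIndex like the one in the question (the data below is synthetic): group by calendar month and use transform so the monthly mean is broadcast back onto every day.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the daily data; only the structure matters here.
dates = pd.date_range("1951-01-01", "1951-03-31", freq="D", name="Date")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(len(dates), 3)), index=dates)

# One group per (year, month) pair; transform("mean") returns a frame with the
# same shape as df in which each day holds the mean of its own month.
month_key = df.index.to_period("M")
monthly_mean = df.groupby(month_key).transform("mean")

# Daily values minus the mean of the month and year they belong to.
anomalies = df - monthly_mean
```

Because the key is a year-month period, January 1951 and January 1952 are treated as separate groups, which matches the "that month and year" requirement.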

Cosine Similarity rows in a dataframe of pandas

Question: I have a CSV file with the content below, and I want to calculate the cosine similarity between one ID and the remaining IDs in the CSV file. I have loaded it into a pandas DataFrame as follows:

    old_df['Vector'] = old_df.apply(lambda row: np.array(np.matrix(row.Vector)).ravel(), axis=1)
    l = []
    for a in old_df['Vector']:
        l.append(a)
    A = np.array(l)
    similarities = cosine_similarity(A)

The output looks fine. However, I do not know how to find which GUID (or ID) is similar to which other GUID (or ID), and I
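No answer is included in the excerpt. One way to keep track of which row belongs to which ID is to wrap the similarity matrix in a DataFrame labelled by GUID on both axes. The sketch below uses a made-up three-row frame and computes cosine similarity with plain NumPy (normalised dot products) in place of the cosine_similarity call from the question; the column name GUID is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one vector per GUID.
old_df = pd.DataFrame({
    "GUID": ["a", "b", "c"],
    "Vector": [np.array([1.0, 0.0]), np.array([2.0, 1.0]), np.array([0.0, 1.0])],
})

# Stack the vectors into a 2-D array and normalise each row to unit length.
A = np.vstack(old_df["Vector"].tolist())
unit = A / np.linalg.norm(A, axis=1, keepdims=True)
similarities = unit @ unit.T  # cosine similarity matrix

# Label rows and columns with the GUIDs so entries can be looked up by ID.
sim_df = pd.DataFrame(similarities, index=old_df["GUID"], columns=old_df["GUID"])

# Hide the diagonal (each ID compared with itself), then take the best match per row.
off_diagonal = sim_df.mask(np.eye(len(sim_df), dtype=bool))
most_similar = off_diagonal.idxmax(axis=1)
print(most_similar)
```

sim_df.loc["a", "b"] then gives the similarity between two specific IDs, and sorting a single row ranks all other IDs against it.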

How to left align a dataframe column in python?

Question: I have to left-align a description column in a pandas DataFrame in Python, similar to left- or right-aligning a cell in an Excel sheet. Is there any solution for this? Image attached for reference. [image: Dataset]

Answer 1: Try this:

    df.style.set_properties(subset=["col1", "col2"], **{'text-align': 'right'})

Answer 2: I think you can just remove the leading spaces:

    df.Description = df.Description.apply(lambda row: row.lstrip(' '))

Source: https://stackoverflow.com/questions/53460941/how-to-left-align-a-dataframe-column
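For plain-text output (as opposed to the styled HTML that df.style produces), a rough sketch along the lines of Answer 2 is to strip leading spaces and then pad every value on the right to a common width. The sample frame is made up; only the Description column name comes from the answer.

```python
import pandas as pd

# Hypothetical data with inconsistent leading spaces.
df = pd.DataFrame({
    "Description": ["  apple", " kiwi", "watermelon"],
    "Qty": [1, 2, 3],
})

# Vectorised equivalent of the lambda in Answer 2.
stripped = df["Description"].str.lstrip()

# Pad on the right so every string has the same length; the text then lines up
# on the left even though pandas right-justifies strings when printing.
width = int(stripped.str.len().max())
df["Description"] = stripped.str.ljust(width)

print(df.to_string(index=False, justify="left"))
```

The padding changes the stored values, so this is better suited to display copies of the data than to frames that will be processed further.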

Element-wise mean of a list of pandas DataFrames

Question: Is there a canonical way to compute the element-wise mean of a list of DataFrames with identical columns and indices? The best way I can think of is:

    from functools import reduce
    dfs = [df1, df2, df3, df4, df5]
    reduce(lambda x, y: x.add(y), dfs) / len(dfs)

Answer 1: Use concat with mean per index values:

    df1 = pd.DataFrame({
        'C': [7, 8, 9],
        'D': [1, 3, 5],
    })
    df2 = pd.DataFrame({
        'C': [4, 2, 3],
        'D': [7, 1, 0],
    })
    df3 = pd.DataFrame({
        'C': [9, 4, 2],
        'D': [1, 7, 1],
    })
    from functools import reduce
    dfs = [df1, df2,
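The answer excerpt stops before the concat call. A minimal sketch of "concat with mean per index values", using the three sample frames from the answer; the exact expression the answerer used is not visible, so the groupby(level=0) form below is an assumption:

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({"C": [7, 8, 9], "D": [1, 3, 5]})
df2 = pd.DataFrame({"C": [4, 2, 3], "D": [7, 1, 0]})
df3 = pd.DataFrame({"C": [9, 4, 2], "D": [1, 7, 1]})
dfs = [df1, df2, df3]

# Stack the frames vertically, then average all rows that share an index label.
mean_df = pd.concat(dfs).groupby(level=0).mean()

# The reduce-based version from the question, for comparison.
reduce_df = reduce(lambda x, y: x.add(y), dfs) / len(dfs)

print(mean_df)
```

Both versions give the same values when the frames share identical indices and columns; the concat form skips NaN cells when averaging, while the reduce form propagates them.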

Pandas Time Series DataFrame Missing Values

Question: I have a dataset of total sales from 2008 to 2015. I have an entry for each day, so I have created a pandas DataFrame with a DatetimeIndex and a column for sales. It looks like this: [image]

The problem is that I am missing data for most of 2010. These missing values are currently represented by 0.0, so if I plot the DataFrame I get: [image]

I want to try to forecast values for 2016, possibly using an ARIMA model, so the first step I took was to perform a decomposition of this time series: [image]

Obviously if I
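The rest of the question and its answers are not included in the excerpt. As a small illustration of the situation described, the sketch below marks the 0.0 placeholders as NaN so pandas treats them as missing, then fills the gap by time-based interpolation. The data is made up, the column name sales is an assumption, and interpolation is only one possible way to fill such a gap; it also assumes that 0.0 never represents a genuine day without sales.

```python
import numpy as np
import pandas as pd

# Made-up daily series in which 0.0 stands in for missing entries.
idx = pd.date_range("2009-12-29", periods=8, freq="D")
sales = pd.DataFrame(
    {"sales": [120.0, 130.0, 0.0, 0.0, 0.0, 150.0, 160.0, 170.0]},
    index=idx,
)

# Replace the placeholder with NaN so plots and statistics skip those days.
clean = sales.replace(0.0, np.nan)
n_missing = int(clean["sales"].isna().sum())

# One possible fill: linear interpolation weighted by the time between points.
filled = clean.interpolate(method="time")

print(n_missing)
print(filled)
```

For a gap spanning most of a year, a straight-line fill removes any seasonality in that period, so the choice of fill method would need to be weighed against the decomposition and forecasting steps mentioned in the question.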