pandas

Merge 2 columns into 1 column

孤街醉人 submitted on 2021-02-10 07:33:35
Question: I would like to merge 2 columns into 1 column and remove nan. I have this data:

    Name     A     B
    Pikachu  2007  nan
    Pikachu  nan   2008
    Raichu   2007  nan
    Mew      nan   2008

Expected result:

    Name     Year
    Pikachu  2007
    Pikachu  2008
    Raichu   2007
    Mew      2008

Code I tried:

    df['Year'] = df[['A', 'B']].astype(str).apply(''.join, 1)

But my result is this:

    Name     Year
    Pikachu  2007nan
    Pikachu  nan2008
    Raichu   2007nan
    Mew      nan2008

Answer 1: Use Series.fillna with DataFrame.pop to extract the columns, then convert to integers:

    df['Year']= df.pop('A
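
The answer above is cut off, so the following is only a sketch of how the approach it names (Series.fillna combined with DataFrame.pop, then a cast to integers) might look. The sample frame is rebuilt by hand from the question, using the years shown in the expected result; it is an assumption, not the asker's actual data.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed, using the expected years).
df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew"],
    "A": [2007, np.nan, 2007, np.nan],
    "B": [np.nan, 2008, np.nan, 2008],
})

# pop() removes each column from df and returns it as a Series;
# fillna() fills the gaps in A with the values of B at the same index.
# The cast to int only works if no NaN is left after filling.
df["Year"] = df.pop("A").fillna(df.pop("B")).astype(int)

print(df)
```

If a row had nan in both A and B, the final astype(int) would raise, so that case would need separate handling.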

Iterate each row by updating values from 1st dataframe to 2nd dataframe based on unique value w/ different index, otherwise append and assign new ID

六月ゝ 毕业季﹏ submitted on 2021-02-10 07:33:17
Question: I am trying to update each row of df2 from df1 when a unique value matches. If there is no match, append the row to df2 and assign it a new ID.

df1 (no ID column):

       unique_value  Status  Price
    0  xyz123        bad     6.67
    1  eff987        bad     1.75
    2  efg125        okay    5.77

df2:

       unique_value  Status     Price  ID
    0  xyz123        good       1.25   1000
    1  xyz123        good       1.25   1000
    2  xyz123        good       1.25   1000
    3  xyz123        good       1.25   1000
    4  xyz985        bad        1.31   1001
    5  abc987        okay       4.56   1002
    6  eff987        good       9.85   1003
    7  asd541        excellent  8.85   1004

Desired output for updated df2:
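
The desired output is cut off in the excerpt, so this is a hedged sketch of one way the described update could be done without iterating row by row. It assumes that unique_value is unique within df1 and that a new ID is simply the largest existing ID plus one; neither is confirmed by the question.

```python
import pandas as pd

df1 = pd.DataFrame({
    "unique_value": ["xyz123", "eff987", "efg125"],
    "Status": ["bad", "bad", "okay"],
    "Price": [6.67, 1.75, 5.77],
})
df2 = pd.DataFrame({
    "unique_value": ["xyz123"] * 4 + ["xyz985", "abc987", "eff987", "asd541"],
    "Status": ["good"] * 4 + ["bad", "okay", "good", "excellent"],
    "Price": [1.25] * 4 + [1.31, 4.56, 9.85, 8.85],
    "ID": [1000] * 4 + [1001, 1002, 1003, 1004],
})

# Look up df1 rows by their unique value (assumed unique in df1).
lookup = df1.set_index("unique_value")
matched = df2["unique_value"].isin(lookup.index)

# Overwrite Status and Price in every df2 row whose unique value is in df1.
for column in ["Status", "Price"]:
    df2.loc[matched, column] = df2.loc[matched, "unique_value"].map(lookup[column])

# Rows of df1 with no counterpart in df2 are appended with fresh IDs
# (assumption: new IDs continue from the current maximum).
new_rows = df1[~df1["unique_value"].isin(df2["unique_value"])].copy()
first_new_id = int(df2["ID"].max()) + 1
new_rows["ID"] = range(first_new_id, first_new_id + len(new_rows))

updated = pd.concat([df2, new_rows], ignore_index=True)
print(updated)
```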

Difference between numpy var() and pandas var()

旧城冷巷雨未停 submitted on 2021-02-10 07:32:51
Question: I recently noticed that numpy.var() and pandas.DataFrame.var() or pandas.Series.var() give different values. Is there any difference between them? Here is my dataset:

       Country  GDP    Area   Continent
    0  India    2.79   3.287  Asia
    1  USA      20.54  9.840  North America
    2  China    13.61  9.590  Asia

Here is my code:

    from sklearn.preprocessing import StandardScaler
    ss = StandardScaler()
    catDf.iloc[:,1:-1] = ss.fit_transform(catDf.iloc[:,1:-1])

Now checking
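
The excerpt ends before any answer, but one documented difference is the default ddof: numpy.var uses ddof=0 (population variance) while pandas var uses ddof=1 (sample variance). The sketch below illustrates that on the GDP column from the question; whether this fully explains the asker's numbers is an assumption.

```python
import numpy as np
import pandas as pd

gdp = pd.Series([2.79, 20.54, 13.61], name="GDP")

# numpy divides by n by default (ddof=0).
population_var = np.var(gdp.to_numpy())

# pandas divides by n - 1 by default (ddof=1).
sample_var = gdp.var()

# Passing the same ddof to both makes the results agree.
assert np.isclose(np.var(gdp.to_numpy(), ddof=1), sample_var)
assert np.isclose(gdp.var(ddof=0), population_var)

print(population_var, sample_var)
```

With three rows the two values differ by a factor of n / (n - 1) = 1.5, which is the kind of gap that would show up when checking variances after scaling.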

Increasing performance of nearest neighbors of rows in Pandas

白昼怎懂夜的黑 submitted on 2021-02-10 07:26:29
Question: I am given an 8000x3 data set similar to this one:

    import pandas as pd
    import numpy as np
    df = pd.DataFrame(np.random.rand(8000,3), columns=list('XYZ'))

So for a visual reference, df.head(5) looks like this:

       X         Y         Z
    0  0.462433  0.559442  0.016778
    1  0.663771  0.092044  0.636519
    2  0.111489  0.676621  0.839845
    3  0.244361  0.599264  0.505175
    4  0.115844  0.888622  0.766014

I'm trying to implement a method that, when given an index from the dataset, will return similar items from the dataset (in some reasonable
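
The question is truncated before it defines "similar", so the following is only a sketch under the assumption that similarity means small Euclidean distance across X, Y and Z. The function name nearest_rows and the parameter k are hypothetical. It computes all distances in one vectorized NumPy step instead of looping over rows.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((8000, 3)), columns=list("XYZ"))

def nearest_rows(frame, idx, k=5):
    """Return the k rows closest (Euclidean) to the row labelled idx."""
    points = frame.to_numpy()
    target = frame.loc[idx].to_numpy()
    # One distance per row, computed by broadcasting.
    dists = np.linalg.norm(points - target, axis=1)
    # argpartition finds the k + 1 smallest distances without a full sort;
    # k + 1 because the row itself is included at distance 0.
    candidates = np.argpartition(dists, k)[: k + 1]
    candidates = candidates[np.argsort(dists[candidates])]
    labels = frame.index[candidates]
    # Drop the query row and keep the k nearest of the rest.
    return frame.loc[labels[labels != idx][:k]]

neighbours = nearest_rows(df, 0, k=5)
print(neighbours)
```

For repeated queries against the same data, a spatial index such as a k-d tree is a common alternative, at the cost of an extra dependency.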

Python combining all csv files in a directory and order by date time

我怕爱的太早我们不能终老 submitted on 2021-02-10 07:25:48
Question: I have 2 years' worth of daily data split into monthly files. I would like to combine all of this data into one file ordered by date and time. The code I am using combines all the files, but not in order.

Code I am using:

    import pandas as pd
    import glob, os
    import csv

    inputdirectory = input('Enter the directory: ')
    df_list = []
    for filename in sorted(glob.glob(os.path.join(inputdirectory,"*.csv*"))):
        df_list.append(pd.read_csv(filename))
    full_df = pd.concat(df_list)
    full_df.to_csv('totalsum.csv
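
Sorting the file names only orders the files, not the rows, so one possible fix is to parse the date column and sort the combined frame on it. The sketch below assumes each file has a single date-time column, here given the hypothetical name timestamp; the question does not show the real column names. The two small files are made-up demo data written to a temporary directory.

```python
import glob
import os
import tempfile

import pandas as pd

def combine_csvs(input_directory, time_column="timestamp"):
    """Concatenate every CSV in a directory and order rows by time_column."""
    paths = glob.glob(os.path.join(input_directory, "*.csv"))
    frames = [pd.read_csv(path, parse_dates=[time_column]) for path in paths]
    full_df = pd.concat(frames, ignore_index=True)
    # Sort on the parsed datetimes rather than on file names.
    return full_df.sort_values(time_column, kind="stable").reset_index(drop=True)

with tempfile.TemporaryDirectory() as tmp:
    # File names deliberately sort in the wrong chronological order.
    pd.DataFrame({
        "timestamp": ["2020-02-01 00:00:00", "2020-02-02 00:00:00"],
        "value": [3, 4],
    }).to_csv(os.path.join(tmp, "a_february.csv"), index=False)
    pd.DataFrame({
        "timestamp": ["2020-01-01 00:00:00", "2020-01-02 00:00:00"],
        "value": [1, 2],
    }).to_csv(os.path.join(tmp, "b_january.csv"), index=False)

    combined = combine_csvs(tmp)

print(combined)
```

If date and time live in separate columns, they would need to be combined into one datetime column before sorting.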

Rename columns regex, keep name if no match

早过忘川 submitted on 2021-02-10 07:15:23
Question:

    data = {'First_Column': [1,2,3],
            'Second_Column': [1,2,3],
            r'\First\Mid\LAST.Ending': [1,2,3],
            r'First1\Mid1\LAST1.Ending': [1,2,3]}
    df = pd.DataFrame(data)

       First_Column  Second_Column  \First\Mid\LAST.Ending  First1\Mid1\LAST1.Ending
    0  1             1              1                       1
    1  2             2              2                       2
    2  3             3              3                       3

I want to rename the columns as follows:

       First_Column  Second_Column  LAST  LAST1
    0  1             1              1     1
    1  2             2              2     2
    2  3             3              3     3

So I tried:

    df.columns.str.extract(r'([^\\]+)\.Ending')

           0
    0    NaN
    1    NaN
    2   LAST
    3  LAST1

and

    col = df.columns.tolist()
    for i in col
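
The excerpt stops before any answer. One way to keep a name when the pattern does not match is to substitute instead of extract, since a regex replacement leaves non-matching strings untouched. The pattern below is an assumption built from the asker's own expression, widened to consume the whole column name.

```python
import pandas as pd

data = {
    "First_Column": [1, 2, 3],
    "Second_Column": [1, 2, 3],
    r"\First\Mid\LAST.Ending": [1, 2, 3],
    r"First1\Mid1\LAST1.Ending": [1, 2, 3],
}
df = pd.DataFrame(data)

# Replace the whole name with the last backslash-separated part before
# ".Ending". Names without that pattern do not match and stay as they are.
df.columns = df.columns.str.replace(
    r"^.*\\([^\\]+)\.Ending$", r"\1", regex=True
)

print(df.columns.tolist())
```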

Read .txt file with Python Pandas - strings and floats

怎甘沉沦 submitted on 2021-02-10 07:15:20
Question: I would like to read a .txt file in Python (3.6.0) using Pandas. The first lines of the .txt file are shown below:

Text file to read

    Location: XXX
    Campaign Name: XXX
    Date of log start: 2016_10_09
    Time of log start: 04:27:28
    Sampling Frequency: 1Hz
    Config file: XXX
    Logger Serial: XXX
    CH Mapping;;XXXC1;XXXC2;XXXC3;XXXC4
    CH Offsets in ms;;X;X,X;X;X,X
    CH Units;;mA;mA;mA;mA
    Time;msec;Channel1;Channel2;Channel3;Channel4
    04:30:00;000; 0.01526;10.67903;10.58366; 0.00000
    04:30:01;000; 0.17090;10.68666
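
The excerpt ends before any answer, so this is a sketch of one possible approach: find the line that starts the semicolon-separated table and hand only that part to pandas.read_csv. The helper name read_logger_text is hypothetical, the sample text is copied from the question with only the first complete data row, and treating the "Time;msec" line as the column header is an assumption.

```python
import io

import pandas as pd

SAMPLE = """Location: XXX
Campaign Name: XXX
Date of log start: 2016_10_09
Time of log start: 04:27:28
Sampling Frequency: 1Hz
Config file: XXX
Logger Serial: XXX
CH Mapping;;XXXC1;XXXC2;XXXC3;XXXC4
CH Offsets in ms;;X;X,X;X;X,X
CH Units;;mA;mA;mA;mA
Time;msec;Channel1;Channel2;Channel3;Channel4
04:30:00;000; 0.01526;10.67903;10.58366; 0.00000
"""

def read_logger_text(text):
    """Parse the measurement table, skipping the free-form preamble."""
    lines = text.splitlines()
    # Locate the header row instead of hard-coding how many lines to skip.
    header_at = next(
        i for i, line in enumerate(lines) if line.startswith("Time;msec")
    )
    table = "\n".join(lines[header_at:])
    return pd.read_csv(
        io.StringIO(table),
        sep=";",
        skipinitialspace=True,               # values are padded with spaces
        dtype={"Time": str, "msec": str},    # keep "04:30:00" and "000" as text
    )

log = read_logger_text(SAMPLE)
print(log.dtypes)
```

For a file on disk, the same function could be fed the result of reading the file as text; the channel columns come back as floats while Time and msec stay strings.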
