pandas | 易学教程

Merge 2 columns into 1 column

Question: I would like to merge two columns into one column and remove the NaN values. I have this data:

    Name     A     B
    Pikachu  2007  nan
    Pikachu  nan   2008
    Raichu   2007  nan
    Mew      nan   2018

Expected result:

    Name     Year
    Pikachu  2007
    Pikachu  2008
    Raichu   2007
    Mew      2008

Code I tried:

    df['Year'] = df[['A','B']].astype(str).apply(''.join, 1)

But my result is this:

    Name     Year
    Pikachu  2007nan
    Pikachu  nan2008
    Raichu   2007nan
    Mew      nan2008

Answer 1: Use Series.fillna with DataFrame.pop to extract the columns, and finally convert to integers:

    df['Year']= df.pop('A…
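The answer's code is cut off in this excerpt, so the following is only a minimal sketch of the approach it names (Series.fillna combined with DataFrame.pop, then a cast to integers). The sample frame is rebuilt from the question, using 2008 for Mew as in the expected result; the exact chaining is an assumption, not the answerer's verbatim code.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (Mew's year taken from the expected result).
df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew"],
    "A": [2007, np.nan, 2007, np.nan],
    "B": [np.nan, 2008, np.nan, 2008],
})

# pop() removes a column from df and returns it as a Series.
# fillna() fills the gaps in A with the aligned values from B,
# and astype(int) drops the float representation that NaN forced.
df["Year"] = df.pop("A").fillna(df.pop("B")).astype(int)

print(df)
```

The cast to int only works here because every row has a value in either A or B; a row with NaN in both columns would make astype(int) raise.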

Iterate each row by updating values from 1st dataframe to 2nd dataframe based on unique value w/ different index, otherwise append and assign new ID

Question: I am trying to update each row of df2 from df1 wherever a unique value matches. If there is no match, the row should be appended to df2 and given a new ID.

df1 (no ID column):

       unique_value  Status  Price
    0  xyz123        bad     6.67
    1  eff987        bad     1.75
    2  efg125        okay    5.77

df2:

       unique_value  Status     Price  ID
    0  xyz123        good       1.25   1000
    1  xyz123        good       1.25   1000
    2  xyz123        good       1.25   1000
    3  xyz123        good       1.25   1000
    4  xyz985        bad        1.31   1001
    5  abc987        okay       4.56   1002
    6  eff987        good       9.85   1003
    7  asd541        excellent  8.85   1004

Desired output for updated df2: …
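The desired output is cut off in this excerpt, so the sketch below is one possible reading of the request rather than a confirmed solution. It assumes that every df2 row sharing a unique_value with df1 takes df1's Status and Price, and that unmatched df1 rows are appended with IDs continuing from the current maximum; both rules are assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({
    "unique_value": ["xyz123", "eff987", "efg125"],
    "Status": ["bad", "bad", "okay"],
    "Price": [6.67, 1.75, 5.77],
})
df2 = pd.DataFrame({
    "unique_value": ["xyz123"] * 4 + ["xyz985", "abc987", "eff987", "asd541"],
    "Status": ["good"] * 4 + ["bad", "okay", "good", "excellent"],
    "Price": [1.25] * 4 + [1.31, 4.56, 9.85, 8.85],
    "ID": [1000] * 4 + [1001, 1002, 1003, 1004],
})

# Look up df1's values by unique_value, ignoring the differing row indexes.
lookup = df1.set_index("unique_value")
matched = df2["unique_value"].isin(lookup.index)
for col in ["Status", "Price"]:
    df2.loc[matched, col] = df2.loc[matched, "unique_value"].map(lookup[col])

# Rows of df1 with no counterpart in df2 are appended with fresh IDs
# (assumed rule: continue counting from the current maximum ID).
new_rows = df1[~df1["unique_value"].isin(df2["unique_value"])].copy()
next_id = int(df2["ID"].max()) + 1
new_rows["ID"] = list(range(next_id, next_id + len(new_rows)))

result = pd.concat([df2, new_rows], ignore_index=True)
print(result)
```

Using a lookup keyed on unique_value avoids iterating row by row, which is what the title asks about but is rarely needed in pandas.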

Difference between numpy var() and pandas var()

Question: I recently noticed that numpy.var() and pandas.DataFrame.var() (or pandas.Series.var()) give different values. Is there any difference between them? Here is my dataset:

       Country  GDP    Area   Continent
    0  India    2.79   3.287  Asia
    1  USA      20.54  9.840  North America
    2  China    13.61  9.590  Asia

Here is my code:

    from sklearn.preprocessing import StandardScaler
    ss = StandardScaler()
    catDf.iloc[:,1:-1] = ss.fit_transform(catDf.iloc[:,1:-1])

Now checking…
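The excerpt stops before any answer, but the usual cause of this discrepancy is the default ddof: numpy.var divides by N (ddof=0), while pandas var divides by N - 1 (ddof=1). A short sketch on the GDP column from the question, leaving out the scaling step:

```python
import numpy as np
import pandas as pd

gdp = pd.Series([2.79, 20.54, 13.61], name="GDP")
values = gdp.to_numpy()

population_var = np.var(values)   # NumPy default: ddof=0, divide by N
sample_var = gdp.var()            # pandas default: ddof=1, divide by N - 1

print(population_var, sample_var)

# Passing the same ddof to both libraries makes the results agree.
print(np.isclose(np.var(values, ddof=1), sample_var))
print(np.isclose(gdp.var(ddof=0), population_var))
```

With three rows the two defaults differ by a factor of N / (N - 1) = 1.5, which is why the gap is so visible on a small dataset.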

Increasing performance of nearest neighbors of rows in Pandas

Question: I am given an 8000x3 data set similar to this one:

    import pandas as pd
    import numpy as np
    df = pd.DataFrame(np.random.rand(8000,3), columns=list('XYZ'))

For a visual reference, df.head(5) looks like this:

       X         Y         Z
    0  0.462433  0.559442  0.016778
    1  0.663771  0.092044  0.636519
    2  0.111489  0.676621  0.839845
    3  0.244361  0.599264  0.505175
    4  0.115844  0.888622  0.766014

I'm trying to implement a method that, when given an index from the dataset, returns similar items from the dataset (in some reasonable…
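The question is cut off before it defines "similar", so this is only a sketch under the assumption that similarity means Euclidean distance over the X, Y and Z columns. The function name nearest_rows and its parameter k are hypothetical, introduced for illustration.

```python
import numpy as np
import pandas as pd


def nearest_rows(df, idx, k=5):
    """Return the k rows closest to row `idx` (Euclidean distance, assumed metric)."""
    points = df.to_numpy()
    pos = df.index.get_loc(idx)
    # One vectorized pass over all rows instead of a Python loop.
    dists = np.linalg.norm(points - points[pos], axis=1)
    dists[pos] = np.inf  # exclude the query row itself
    # argpartition finds the k smallest distances without a full sort.
    nearest = np.argpartition(dists, k)[:k]
    nearest = nearest[np.argsort(dists[nearest])]
    return df.iloc[nearest].assign(distance=dists[nearest])


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((8000, 3)), columns=list("XYZ"))
similar = nearest_rows(df, idx=0, k=5)
print(similar)
```

For repeated queries on the same data, a spatial index such as a k-d tree is the more common choice; the brute-force version above is kept because it needs only NumPy.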

Python combining all csv files in a directory and order by date time

Question: I have two years' worth of daily data split into monthly files. I would like to combine all of this data into one file ordered by date and time. The code I am using combines all the files, but not in order.

Code I am using:

    import pandas as pd
    import glob, os
    import csv

    inputdirectory = input('Enter the directory: ')
    df_list = []
    for filename in sorted(glob.glob(os.path.join(inputdirectory,"*.csv*"))):
        df_list.append(pd.read_csv(filename))
    full_df = pd.concat(df_list)
    full_df.to_csv('totalsum.csv…
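Sorting the file names only orders the files alphabetically, so one way to get chronological output is to sort the combined rows on a parsed datetime column instead. The sketch below assumes each file has a single column named timestamp; that column name, the helper combine_csvs and the demo files are all hypothetical, since the excerpt does not show the real layout.

```python
import glob
import os
import tempfile

import pandas as pd


def combine_csvs(directory, time_col="timestamp"):
    """Concatenate every CSV in `directory` and order the rows by `time_col`."""
    paths = sorted(glob.glob(os.path.join(directory, "*.csv")))
    # parse_dates turns the column into real datetimes so sorting is chronological.
    frames = [pd.read_csv(path, parse_dates=[time_col]) for path in paths]
    combined = pd.concat(frames, ignore_index=True)
    return combined.sort_values(time_col, kind="stable").reset_index(drop=True)


# Demo with two made-up monthly files whose names sort out of date order.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({
        "timestamp": ["2016-02-01 00:00:00", "2016-02-02 00:00:00"],
        "value": [3, 4],
    }).to_csv(os.path.join(tmp, "a_february.csv"), index=False)
    pd.DataFrame({
        "timestamp": ["2016-01-01 00:00:00", "2016-01-02 00:00:00"],
        "value": [1, 2],
    }).to_csv(os.path.join(tmp, "b_january.csv"), index=False)

    full_df = combine_csvs(tmp)

print(full_df)
```

If the date and time live in separate columns, they would need to be combined into one datetime column before sorting.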

Rename columns regex, keep name if no match

Question:

    data = {'First_Column': [1,2,3],
            'Second_Column': [1,2,3],
            r'\First\Mid\LAST.Ending': [1,2,3],
            r'First1\Mid1\LAST1.Ending': [1,2,3]}
    df = pd.DataFrame(data)

       First_Column  Second_Column  \First\Mid\LAST.Ending  First1\Mid1\LAST1.Ending
    0  1             1              1                       1
    1  2             2              2                       2
    2  3             3              3                       3

I want to rename the columns as follows:

       First_Column  Second_Column  LAST  LAST1
    0  1             1              1     1
    1  2             2              2     2
    2  3             3              3     3

So I tried:

    df.columns.str.extract(r'([^\\]+)\.Ending')

           0
    0    NaN
    1    NaN
    2   LAST
    3  LAST1

and

    col = df.columns.tolist()
    for i in col…
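The excerpt ends before any answer, so here is one possible way to keep the original name when the pattern does not match: pass a function to DataFrame.rename and fall back to the unchanged column name. The helper short_name is hypothetical; the regular expression is the one from the question with an added end-of-string anchor.

```python
import re

import pandas as pd

data = {
    "First_Column": [1, 2, 3],
    "Second_Column": [1, 2, 3],
    r"\First\Mid\LAST.Ending": [1, 2, 3],
    r"First1\Mid1\LAST1.Ending": [1, 2, 3],
}
df = pd.DataFrame(data)

# Capture the last backslash-free segment that sits right before ".Ending".
pattern = re.compile(r"([^\\]+)\.Ending$")


def short_name(col):
    """Return the captured segment, or the original name if there is no match."""
    match = pattern.search(col)
    return match.group(1) if match else col


renamed = df.rename(columns=short_name)
print(renamed.columns.tolist())
```

Because the fallback lives inside the function, non-matching columns never turn into NaN the way they do with str.extract.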

Read .txt file with Python Pandas - strings and floats

Question: I would like to read a .txt file in Python (3.6.0) using Pandas. The first lines of the .txt file are shown below.

Text file to read:

    Location: XXX
    Campaign Name: XXX
    Date of log start: 2016_10_09
    Time of log start: 04:27:28
    Sampling Frequency: 1Hz
    Config file: XXX
    Logger Serial: XXX
    CH Mapping;;XXXC1;XXXC2;XXXC3;XXXC4
    CH Offsets in ms;;X;X,X;X;X,X
    CH Units;;mA;mA;mA;mA
    Time;msec;Channel1;Channel2;Channel3;Channel4
    04:30:00;000; 0.01526;10.67903;10.58366; 0.00000
    04:30:01;000; 0.17090;10.68666…
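No answer survives in this excerpt, so the following is only a sketch of one way to read such a file with pandas.read_csv: locate the "Time;msec;..." header line, skip everything above it, and split on semicolons. The sample text reuses the lines quoted in the question up to the first complete data row; the line breaks are an assumption about how the original file is laid out.

```python
import io

import pandas as pd

raw = """Location: XXX
Campaign Name: XXX
Date of log start: 2016_10_09
Time of log start: 04:27:28
Sampling Frequency: 1Hz
Config file: XXX
Logger Serial: XXX
CH Mapping;;XXXC1;XXXC2;XXXC3;XXXC4
CH Offsets in ms;;X;X,X;X;X,X
CH Units;;mA;mA;mA;mA
Time;msec;Channel1;Channel2;Channel3;Channel4
04:30:00;000; 0.01526;10.67903;10.58366; 0.00000
"""

# Find the column-header line instead of hard-coding how many lines to skip.
lines = raw.splitlines()
header_idx = next(i for i, line in enumerate(lines) if line.startswith("Time;"))

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",
    skiprows=header_idx,               # drop the metadata block above the header
    skipinitialspace=True,             # values such as " 0.01526" are space-padded
    dtype={"Time": str, "msec": str},  # keep "000" and the clock time as text
)
print(df)
print(df.dtypes)
```

With a real file, the same header search can be run on the opened file's lines and the path passed to read_csv in place of the StringIO object.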
