dataframe | 易学教程

split columns into multiple columns

Question: I have been struggling to split multiple columns into multiple columns in R, with no result. I have tried many tricks from Stack Overflow and none of them work. Here is my problem:

    reactions__001  reactions__002  reactions__003
    25 Like         23 Love
    15 Like         5 Love          3 Haha
    20 Haha         3 Sad           2 Angry

What I am looking for is to split this data frame like this:

    Like  Love  Haha  Sad  Angry
    25    23    0     0    0
    15    5     3     0    0
    0     0     20    3    2

I have tried str_split_fixed(df$reactions__001, " ", 2) but it gives

Import multiple sheets from Excel spreadsheet into R

Question: I want to import multiple sheets, selected by a common string in the sheet name, from a single .xlsx file and concatenate them into a single data frame. For example, suppose I have an Excel file ('data.xlsx') with worksheets named samples1, samples2, samples3, controls1, controls2, controls3. I want to make a list of the worksheet names, such as:

    sheet_list <- lapply(excel_sheets('data.xlsx'), read_excel, path = 'data.xlsx')

Then, I want to import all sheets that contain 'samples' in the name and

How to cut a vector or column into intervals in R [duplicate]

Question: This question already has answers here: Convert continuous numeric values to discrete categories defined by intervals (2 answers). Closed 1 year ago.

I have the following column in a data frame, where consecutive rows differ by 0.012 s:

    Time
    0
    0.012
    0.024
    0.036
    0.048
    0.060
    0.072
    0.084
    0.096
    0.108

I want to create intervals that start at the beginning and increase by 0.030, that is, a time window every 0.03, to be used later in a group-by.

Answer 1: You can try findInterval like

How to delete rows where values are in parentheses?

Question: I am working with the following data frame:

    Name    Height
    Eric    64
    (Joe)   67
    Mike    66
    Nick    72
    (Dave)  69
    Steve   73

I would like to delete all rows where the 'Name' column starts with an open parenthesis "(". So the final data frame would look like:

    Name    Height
    Eric    64
    Mike    66
    Nick    72
    Steve   73

Answer 1: In the question, the names to be excluded always start with a left parenthesis, so if that is the general case, use subset and startsWith like this:

    subset(DF, !startsWith(Name, "("))
    ## Name Height
    ## 1 Eric

Mean, Median, and mode of a list of values (SCORE) given a certain zip code for every year

Question: I want to find the mean, median, and mode for each year, given a specific ZIP code. How can I achieve this? I have already read the data from a CSV file, converted it to a JSON file, and defined it as a DataFrame. My data sample is not limited to the following table; it is larger.

Answer 1: Use scipy.stats.mstats:

    In [2295]: df.DATE = pd.to_datetime(df.DATE).dt.year

    In [2291]: import scipy.stats.mstats as mstats

    In [2313]: def mode(x):
          ...:     return mstats.mode(x, axis=None)[0]
          ...:

    In [2314]: df.groupby(['DATE',
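The per-year statistics asked about here can also be sketched with pandas alone. This is only an illustration under stated assumptions: the ZIP and SCORE column names, the sample values, and the helper first_mode are hypothetical (only DATE appears in the excerpt), and it uses pandas' own Series.mode rather than the SciPy function from the answer.

```python
import pandas as pd

# Hypothetical sample data; the ZIP and SCORE column names are assumed.
df = pd.DataFrame({
    "DATE": ["2019-01-05", "2019-06-10", "2019-09-01", "2020-02-01", "2020-03-15"],
    "ZIP": ["10001"] * 5,
    "SCORE": [10, 12, 12, 7, 9],
})

# Reduce each date to its year, as the answer above does.
df["DATE"] = pd.to_datetime(df["DATE"]).dt.year


def first_mode(values: pd.Series):
    """Return the smallest of the most frequent values (ties are possible)."""
    return values.mode().iloc[0]


# Keep one ZIP code, then compute the three statistics per year.
stats = (
    df[df["ZIP"] == "10001"]
    .groupby("DATE")["SCORE"]
    .agg(["mean", "median", first_mode])
    .rename(columns={"first_mode": "mode"})
)
print(stats)
```

When several values tie for most frequent, Series.mode returns all of them in sorted order, so this sketch keeps only the smallest; a different tie rule may suit real data better.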

Saving a Pandas dataframe in fixed format with different column widths

Question: I have a pandas dataframe (df) that looks like this:

       A   B     C
    0  1  10  1234
    1  2  20     0

I want to save this dataframe in a fixed format. The fixed format I have in mind has different column widths and is as follows: "one space for column A's value, then a comma, then four spaces for column B's values and a comma, and then five spaces for column C's values". Or symbolically:

    -,----,-----

My dataframe above (df) would look like the following in my desired fixed format:

    1,  10, 1234
    2,  20,    0

How can I write a
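A minimal sketch of one way to produce that layout, assuming the fields are right-aligned to widths 1, 4, and 5. The helper name to_fixed_lines and the widths mapping are hypothetical, introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [10, 20], "C": [1234, 0]})

# Assumed layout from the question: right-aligned fields of width 1, 4 and 5.
widths = {"A": 1, "B": 4, "C": 5}


def to_fixed_lines(frame: pd.DataFrame, widths: dict) -> list:
    """Format each row as comma-separated, right-aligned fixed-width fields."""
    lines = []
    for row in frame.itertuples(index=False):
        fields = [f"{value:>{width}}" for value, width in zip(row, widths.values())]
        lines.append(",".join(fields))
    return lines


# Select the columns in the same order as the widths mapping.
lines = to_fixed_lines(df[list(widths)], widths)
print("\n".join(lines))
```

The resulting lines can then be written to a file with ordinary file I/O. Values wider than their field are not truncated by this formatting, so they would push the following columns to the right.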

Combining columns of dataframe [duplicate]

Question: This question already has answers here: how to collapse columns in pandas on null values? (6 answers). Closed 7 months ago.

I have a dataframe like this:

        c1   c2   c3
    0    a  NaN  NaN
    1  NaN    b  NaN
    2  NaN  NaN    c
    3  NaN    b  NaN
    4    a  NaN  NaN

I want to combine these three columns like this:

      c4
    0  a
    1  b
    2  c
    3  b
    4  a

Here is the code to make the above data frame:

    import numpy as np
    import pandas as pd

    # np.nan is used because the np.NaN alias was removed in NumPy 2.0
    a = pd.DataFrame({
        'c1': ['a', np.nan, np.nan, np.nan, 'a'],
        'c2': [np.nan, 'b', np.nan, 'b', np.nan],
        'c3': [np.nan, np.nan, 'c', np.nan, np.nan]
    })

Answer 1: bfilling is
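The answer is cut off, so the following is only one plausible reading of it, assuming it refers to back-filling along each row with DataFrame.bfill and then keeping the first column. The name c4 comes from the question.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({
    "c1": ["a", np.nan, np.nan, np.nan, "a"],
    "c2": [np.nan, "b", np.nan, "b", np.nan],
    "c3": [np.nan, np.nan, "c", np.nan, np.nan],
})

# Back-fill along each row so the first column holds the first non-null value.
c4 = a.bfill(axis=1).iloc[:, 0].rename("c4")
print(c4.tolist())
```

If a row had more than one non-null value, this sketch would keep the leftmost one.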

How to loop through pandas df column, finding if string contains any string from a separate pandas df column?

Question: I have two pandas DataFrames in Python. DF A contains a column of sentence-length strings.

    |---------------------|------------------|
    |     sentenceCol     |   other column   |
    |---------------------|------------------|
    |'this is from france'|        15        |
    |---------------------|------------------|

DF B contains a column that is a list of countries.

    |---------------------|------------------|
    |       country       |   other column   |
    |---------------------|------------------|
    |'france'             |        33        |
    |------------------
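A hedged sketch of the check named in the title, using a vectorized Series.str.contains call instead of an explicit loop. The second row of each frame and the has_country column name are hypothetical additions for illustration; the excerpt shows only the first rows.

```python
import re

import pandas as pd

# First rows come from the question; second rows are made up for the demo.
df_a = pd.DataFrame({
    "sentenceCol": ["this is from france", "nothing to see here"],
    "other column": [15, 7],
})
df_b = pd.DataFrame({"country": ["france", "spain"], "other column": [33, 12]})

# Build one alternation pattern; re.escape guards against regex metacharacters.
pattern = "|".join(re.escape(c) for c in df_b["country"])

# True where the sentence contains at least one country name as a substring.
df_a["has_country"] = df_a["sentenceCol"].str.contains(pattern, regex=True)
print(df_a)
```

This matches plain substrings, so a short country name could also match inside a longer word; wrapping each alternative in word boundaries would be one way to tighten that.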

converting columns to factor over list of dataframes

Question: I'm trying to convert several columns in a list of dataframes into factors. I've tried this, but it doesn't seem to convert the columns into factors:

    factor_cols_REx <- c('GESLACHT','GEVKL','BEROEP')
    for (i in (1:9)) {
      dataset_RE10_2014[[i]] <- lapply(dataset_RE10_2014[[i]][factor_cols_REx], factor)
      dataset_RE10_2015[[i]] <- lapply(dataset_RE10_2015[[i]][factor_cols_REx], factor)
    }

Any ideas on how to fix this?

Answer 1: Let me know if I understood correctly

    #DATA
    dat = list(A = mtcars, B = mtcars)