dataframe

split columns into multiple columns

故事扮演 submitted on 2021-02-05 12:18:41
Question: I have been struggling to split multiple columns into multiple columns in R, without success. I have tried many tricks from Stack Overflow and none of them work. Here is my problem:

    reactions__001  reactions__002  reactions__003
    25 Like         23 Love
    15 Like         5 Love          3 Haha
    20 Haha         3 Sad           2 Angry

What I am looking for is to split this data frame into one like this:

    Like  Love  Haha  Sad  Angry
    25    23    0     0    0
    15    5     3     0    0
    0     0     20    3    2

I have tried str_split_fixed(df$reactions__001, " ", 2) but it gives

Import multiple sheets from excel spreadsheet into r

谁说我不能喝 submitted on 2021-02-05 11:49:58
Question: I want to import multiple sheets, selected by a common string in the sheet name, from a single .xlsx file and concatenate them into a single data frame. For example, suppose I have an Excel file ('data.xlsx') with worksheets named samples1, samples2, samples3, controls1, controls2, controls3. I want to make a list of the worksheet names, such as:

    sheet_list <- lapply(excel_sheets('data.xlsx'), read_excel, path = 'data.xlsx')

Then, I want to import all sheets that contain 'samples' in the name and

How to cut a vector or column into intervals in R [duplicate]

放肆的年华 submitted on 2021-02-05 11:46:39
Question: This question already has answers here: Convert continuous numeric values to discrete categories defined by intervals (2 answers). Closed 1 year ago. I have the following column in a data frame, in which the difference between consecutive rows is 0.012 s:

    Time: 0, 0.012, 0.024, 0.036, 0.048, 0.060, 0.072, 0.084, 0.096, 0.108

I want to create intervals starting from the beginning and increasing by 0.030, that is, a time window every 0.03 s, to be used later in a group by. Answer 1: You can try findInterval like

How to delete rows where values are in parentheses?

笑着哭i submitted on 2021-02-05 11:44:24
Question: I am working with the following data frame:

    Name    Height
    Eric    64
    (Joe)   67
    Mike    66
    Nick    72
    (Dave)  69
    Steve   73

I would like to delete all rows where the 'Name' column starts with an open parenthesis "(". So the final data frame would look like:

    Name    Height
    Eric    64
    Mike    66
    Nick    72
    Steve   73

Answer 1: In the question, the names to be excluded always start with a left parenthesis, so if that is the general case, use subset and startsWith like this:

    subset(DF, !startsWith(Name, "("))
    ##    Name Height
    ## 1  Eric

Mean, Median, and mode of a list of values (SCORE) given a certain zip code for every year

我们两清 submitted on 2021-02-05 11:29:28
Question: I want to find the mean, median, and mode for each year given a specific ZIP code. How can I achieve this? I have already read the data from a CSV file, converted it to a JSON file, and defined it as a DataFrame. My data sample is not limited to the following table; it is larger. Answer 1: Use scipy.stats.mstats:

    In [2295]: df.DATE = pd.to_datetime(df.DATE).dt.year
    In [2291]: import scipy.stats.mstats as mstats
    In [2313]: def mode(x):
          ...:     return mstats.mode(x, axis=None)[0]
          ...:
    In [2314]: df.groupby(['DATE',
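The answer above is cut off at the groupby call. As a rough sketch of the same idea using pandas alone (not the answer's scipy-based code), the snippet below groups by year and ZIP code and computes all three statistics. The sample rows and the column names ZIP and SCORE are assumptions for illustration; only DATE appears in the excerpt.

```python
import pandas as pd

# Hypothetical sample data; the column names ZIP and SCORE are assumptions.
df = pd.DataFrame({
    "DATE": ["2019-01-15", "2019-06-01", "2019-09-30", "2020-02-10", "2020-03-05"],
    "ZIP": ["10001", "10001", "10001", "10001", "10001"],
    "SCORE": [10, 10, 25, 7, 9],
})

# Reduce each date to its year so rows can be grouped per year.
df["YEAR"] = pd.to_datetime(df["DATE"]).dt.year


def first_mode(values: pd.Series):
    """Return the smallest of the most frequent values (Series.mode is sorted)."""
    return values.mode().iloc[0]


# One row per (year, ZIP) pair with the three statistics side by side.
stats = (
    df.groupby(["YEAR", "ZIP"])["SCORE"]
    .agg(mean="mean", median="median", mode=first_mode)
    .reset_index()
)
print(stats)
```

When several values tie for most frequent, this sketch keeps the smallest one; whether that is the right tie-break depends on the data.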

Saving a Pandas dataframe in fixed format with different column widths

佐手、 submitted on 2021-02-05 11:29:06
Question: I have a pandas dataframe (df) that looks like this:

       A   B     C
    0  1  10  1234
    1  2  20     0

I want to save this dataframe in a fixed format. The fixed format I have in mind has different column widths: "one space for column A's value, then a comma, then four spaces for column B's values and a comma, and then five spaces for column C's values", or symbolically: -,----,-----

My dataframe above (df) would look like the following in my desired fixed format:

    1,  10, 1234
    2,  20,    0

How can I write a
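One possible approach, sketched here under the assumption that every value should be right-aligned within its column width, is to pad each cell with str.rjust and join the cells with commas. The helper name to_fixed_lines and the widths dictionary are hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [10, 20], "C": [1234, 0]})

# Column widths taken from the question: A -> 1, B -> 4, C -> 5 characters.
widths = {"A": 1, "B": 4, "C": 5}


def to_fixed_lines(frame, widths):
    """Hypothetical helper: format each row as right-aligned, comma-separated cells."""
    lines = []
    for row in frame.itertuples(index=False):
        cells = [str(value).rjust(widths[col]) for col, value in zip(frame.columns, row)]
        lines.append(",".join(cells))
    return lines


lines = to_fixed_lines(df, widths)
print("\n".join(lines))
```

The resulting lines can be written out with an ordinary file handle. Note that rjust does not truncate, so a value longer than its column width would push the rest of the row to the right.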

Combining columns of dataframe [duplicate]

南楼画角 submitted on 2021-02-05 11:26:05
Question: This question already has answers here: how to collapse columns in pandas on null values? (6 answers). Closed 7 months ago. I have a dataframe like this:

        c1   c2   c3
    0    a  NaN  NaN
    1  NaN    b  NaN
    2  NaN  NaN    c
    3  NaN    b  NaN
    4    a  NaN  NaN

I want to combine these three columns like this:

      c4
    0  a
    1  b
    2  c
    3  b
    4  a

Here is the code to make the above data frame:

    import numpy as np
    import pandas as pd

    a = pd.DataFrame({
        'c1': ['a', np.nan, np.nan, np.nan, 'a'],
        'c2': [np.nan, 'b', np.nan, 'b', np.nan],
        'c3': [np.nan, np.nan, 'c', np.nan, np.nan]
    })

Answer 1: bfilling is
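The answer is cut off right after it mentions back-filling. A minimal sketch of that idea, assuming each row holds at most one non-null value as in the question's data: back-fill along each row so the first column picks up the row's first non-null value, then keep only that column.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
a = pd.DataFrame({
    "c1": ["a", np.nan, np.nan, np.nan, "a"],
    "c2": [np.nan, "b", np.nan, "b", np.nan],
    "c3": [np.nan, np.nan, "c", np.nan, np.nan],
})

# Back-fill across columns (axis=1), then take the first column of the result.
a["c4"] = a.bfill(axis=1).iloc[:, 0]
print(a[["c4"]])
```

If a row had several non-null values, this would keep the leftmost one.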

How to loop through pandas df column, finding if string contains any string from a separate pandas df column?

久未见 submitted on 2021-02-05 11:16:31
Question: I have two pandas DataFrames in Python. DF A contains a column of sentence-length strings.

    |---------------------|------------------|
    |     sentenceCol     |   other column   |
    |---------------------|------------------|
    |'this is from france'|        15        |
    |---------------------|------------------|

DF B contains a column that is a list of countries.

    |---------------------|------------------|
    |       country       |   other column   |
    |---------------------|------------------|
    |'france'             |        33        |
    |------------------
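The question is truncated before it states the desired output, so the following is only a sketch of one common way to avoid an explicit loop: build a single regular expression from the country column and apply it with the pandas string methods. The second row of each frame and the output column names has_country and matched_country are assumptions added for illustration.

```python
import re

import pandas as pd

# First rows follow the question; second rows are hypothetical.
df_a = pd.DataFrame({
    "sentenceCol": ["this is from france", "nothing to see here"],
    "other column": [15, 7],
})
df_b = pd.DataFrame({"country": ["france", "spain"], "other column": [33, 12]})

# Escape each country so it is matched literally, then join into one alternation.
pattern = "|".join(re.escape(country) for country in df_b["country"])

# True where the sentence contains any of the countries (case-insensitive).
df_a["has_country"] = df_a["sentenceCol"].str.contains(pattern, case=False, regex=True)

# First matching country per sentence, or NaN when there is none.
df_a["matched_country"] = df_a["sentenceCol"].str.extract(
    f"({pattern})", flags=re.IGNORECASE, expand=False
)
print(df_a)
```

This matches substrings, so a short country name embedded in a longer word would also count; adding word boundaries (\b) around the pattern is one way to tighten it.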

converting columns to factor over list of dataframes

你说的曾经没有我的故事 submitted on 2021-02-05 11:15:33
Question: I'm trying to convert several columns in a list of data frames into factors. I've tried this, but it doesn't seem to convert the columns into factors:

    factor_cols_REx <- c('GESLACHT','GEVKL','BEROEP')
    for (i in (1:9)) {
      dataset_RE10_2014[[i]] <- lapply(dataset_RE10_2014[[i]][factor_cols_REx], factor)
      dataset_RE10_2015[[i]] <- lapply(dataset_RE10_2015[[i]][factor_cols_REx], factor)
    }

Any ideas on how to fix this? Answer 1: Let me know if I understood correctly.

    # DATA
    dat = list(A = mtcars, B = mtcars)