data-manipulation

Replacing patterns in a string

半城伤御伤魂 submitted on 2021-01-29 02:10:19
Question: I have several strings in this format. The separator is a dash (-) and each "thing" in between is a marker. string <- "FA-I2-I2-I2-EX-I2-I3-FA-I1-I2-TR-I1-I2-FA-I3-I1-FAFANR-I3-I2-TR-I1-I2-I1-I2-FA-I2-I1-I3-FAQU-I1-I2-I2-I2-NR-I2-I2-NR-I1-I2-I1-NR-I3-QU-I2-I3-QUNR-I2-I1-NRQUQU-I2-I1-EX" I want to identify cases where markers containing the letter "I" (i.e. the markers I1, I2, and I3) occur in a row. Then I want to replace those with a description that has no separators. For example, the
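The excerpt is cut off before it shows the description the asker wants, so the following is only a minimal base-R sketch of the first half of the task: locating runs of two or more consecutive I-markers. As a stand-in for the unknown description, it simply deletes the dashes inside each run; that replacement rule, the shortened input string, and the names `run_pattern` and `collapse_run` are assumptions made for illustration.

```r
# Shortened prefix of the string from the question, to keep the example small
string <- "FA-I2-I2-I2-EX-I2-I3-FA-I1-I2-TR"

# A run is one I-marker followed by one or more "-I<digit>" markers
run_pattern <- "I[0-9](-I[0-9])+"
runs <- gregexpr(run_pattern, string)

# Placeholder replacement (an assumption): drop the dashes inside each run
collapse_run <- function(x) gsub("-", "", x, fixed = TRUE)

collapsed <- string
regmatches(collapsed, runs) <- lapply(regmatches(collapsed, runs), collapse_run)

print(regmatches(string, runs)[[1]])  # the runs that were found
print(collapsed)                      # the string with each run collapsed
```

An I-marker that stands alone between two other markers does not match the pattern and is left untouched. Swapping `collapse_run` for a function that builds the intended description would keep the rest of the sketch unchanged.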

how to subset rows in specific columns based on minimum values in individual columns in a dataframe using R

感情迁移 submitted on 2021-01-28 20:10:40
Question: We have a data frame with thousands of rows and multiple columns. A sample data frame is presented below: df1 <- data.frame(X = c(7.48, 7.82, 8.15, 8.47, 8.80, 9.20, 9.51, 9.83, 10.13, 10.59, 7.59, 8.06, 8.39, 8.87, 9.26, 9.64, 10.09, 10.48, 10.88, 11.45), Y = c(49.16, 48.78, 48.40, 48.03, 47.65, 47.24, 46.87, 46.51, 46.15, 45.73, 48.70, 48.18, 47.72, 47.20, 46.71, 46.23, 45.72, 45.24, 44.77, 44.23), ID = c("B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1_2", "B_1

How to remove duplicate comma separated character values from each cell of a column using R

大憨熊 submitted on 2021-01-28 19:16:37
Question: I have a data frame with two columns, ID and Product, as below:
ID  Product
A   Clothing, Clothing Food, Furniture, Furniture
B   Food,Food,Food, Clothing
C   Food, Clothing, Clothing
I need to have only unique products for each ID, for example:
ID  Product
A   Clothing, Food, Furniture
B   Food, Clothing
C   Food, Clothing
How do I do this using R?
Answer 1: If there are multiple delimiters in the dataset, one way would be to split the 'Product' column using all the delimiters, take the unique values, and then paste it
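The split, unique, paste approach described in the answer can be sketched in base R as follows. The answer's own code is cut off in this excerpt, so this is not necessarily what the answerer wrote; the helper name `dedupe_products` and the choice of comma and space as the delimiter set are assumptions based on the sample data.

```r
# Sample data from the question
df <- data.frame(
  ID = c("A", "B", "C"),
  Product = c("Clothing, Clothing Food, Furniture, Furniture",
              "Food,Food,Food, Clothing",
              "Food, Clothing, Clothing"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split one cell on any run of commas and/or spaces,
# keep the first occurrence of each product, and join with ", "
dedupe_products <- function(x) {
  parts <- strsplit(x, "[, ]+")[[1]]
  parts <- parts[nzchar(parts)]  # drop empty pieces from leading delimiters
  paste(unique(parts), collapse = ", ")
}

df$Product <- vapply(df$Product, dedupe_products, character(1), USE.NAMES = FALSE)
print(df)
```

Treating a space as a delimiter is what separates "Clothing Food" in row A, but it would also break up product names that contain spaces, so the pattern would need adjusting for such data.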

Is there a way to loop through data based on factor in a column and add up the number of rows?

▼魔方 西西 submitted on 2021-01-28 05:08:42
Question: I have some data in which I have multiple observations of the same event. Based on a time threshold, I want to condense the observations, but I also want to know how many I am condensing (i.e. how many observations become one). I'm not sure how to loop through my data frame in a way that does that. I've tried writing a for loop, if statements, and while statements, and have searched tirelessly on Google and on Stack Overflow. Nothing seems to relate to what I need to do. Here is a subset of my data

R - Add row index to a data frame but handle ties with minimum rank

心已入冬 submitted on 2021-01-28 03:54:57
Question: I successfully used the answer in this SO thread, r-how-to-add-row-index-to-a-data-frame-based-on-combination-of-factors, but I need to handle the situation where two (or more) rows can be tied. df <- data.frame( season = c(2014,2014,2014,2014,2014,2014, 2014, 2014), week = c(1,1,1,1,2,2,2,2), player.name = c("Matt Ryan","Peyton Manning","Cam Newton","Matthew Stafford","Carson Palmer","Andrew Luck", "Aaron Rodgers", "Chad Henne"), fant.pts.passing = c(28,19,29,28,18,22,29,22) ) df <- df[order(-df
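One way to get a per-group index in which tied rows share the lowest rank is `rank()` with `ties.method = "min"`, applied within each season and week via `ave()`. This is a sketch under the assumption that higher `fant.pts.passing` should rank first, which the truncated `order(-df...` call suggests; the column name `pass.rank` is hypothetical.

```r
# Data frame from the question
df <- data.frame(
  season = c(2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014),
  week = c(1, 1, 1, 1, 2, 2, 2, 2),
  player.name = c("Matt Ryan", "Peyton Manning", "Cam Newton", "Matthew Stafford",
                  "Carson Palmer", "Andrew Luck", "Aaron Rodgers", "Chad Henne"),
  fant.pts.passing = c(28, 19, 29, 28, 18, 22, 29, 22)
)

# Rank within each season/week; negating the points makes the highest score rank 1,
# and ties.method = "min" gives tied rows the same (lowest) rank
df$pass.rank <- ave(-df$fant.pts.passing, df$season, df$week,
                    FUN = function(x) rank(x, ties.method = "min"))

# Sort for display: by season, week, then rank
print(df[order(df$season, df$week, df$pass.rank), ])
```

With minimum ranking, the rank after a tie is skipped: in week 1 the two players on 28 points both get rank 2 and the next player gets rank 4.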

Find value within a range in R

大兔子大兔子 submitted on 2020-12-27 05:54:48
Question: My data look like below. I want to select values greater than or equal to 35 and less than or equal to 350. I also want to replace those values with withinrange.
value
1
35
36
37
350
355
3555
35555
Answer 1: To select the values:
value[value >= 35 & value <= 350]
To replace them with withinrange:
value[value >= 35 & value <= 350] <- withinrange
Source: https://stackoverflow.com/questions/44954507/find-value-within-a-range-in-r
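As a self-contained illustration of the answer's logical indexing, the sketch below builds the sample vector and applies both steps. The question does not say whether withinrange is a variable or a literal label; this sketch assumes the literal string "withinrange", and the names `in_range` and `labelled` are hypothetical.

```r
# Sample values from the question
value <- c(1, 35, 36, 37, 350, 355, 3555, 35555)

# Logical mask: TRUE where 35 <= value <= 350 (both ends inclusive)
in_range <- value >= 35 & value <= 350

# Selecting the values in range
selected <- value[in_range]
print(selected)

# Replacing them with a label (assumed here to be the string "withinrange").
# A vector holds one type, so the remaining numbers become character strings.
labelled <- ifelse(in_range, "withinrange", as.character(value))
print(labelled)
```

Building a new vector with `ifelse()` keeps the original numeric `value` intact, whereas assigning a string into `value` directly would convert the whole vector to character.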

Find value within a range in R

老子叫甜甜 submitted on 2020-12-27 05:52:23
Question: My data look like below. I want to select values greater than or equal to 35 and less than or equal to 350. I also want to replace those values with withinrange.
value
1
35
36
37
350
355
3555
35555
Answer 1: To select the values:
value[value >= 35 & value <= 350]
To replace them with withinrange:
value[value >= 35 & value <= 350] <- withinrange
Source: https://stackoverflow.com/questions/44954507/find-value-within-a-range-in-r

R - Obtaining the highest/lowest value in a set of columns defined by the value in a different dataframe

与世无争的帅哥 submitted on 2020-12-15 06:02:48
Question: I have two dataframes: one (A) containing the start and end dates (Julian date, so a continuous count of days) of an event, and the other (B) containing values at dates from start to beyond the end dates in the first dataframe. The start date in A is stable, the end date varies. I want to be able to, for each row, identify the value with the greatest magnitude of change (highest and/or lowest values) between the start and end date in the series in B, then write to a new dataframe. Example

R - Obtaining the highest/lowest value in a set of columns defined by the value in a different dataframe

南笙酒味 submitted on 2020-12-15 06:02:34
Question: I have two dataframes: one (A) containing the start and end dates (Julian date, so a continuous count of days) of an event, and the other (B) containing values at dates from start to beyond the end dates in the first dataframe. The start date in A is stable, the end date varies. I want to be able to, for each row, identify the value with the greatest magnitude of change (highest and/or lowest values) between the start and end date in the series in B, then write to a new dataframe. Example