data-manipulation | 易学教程

Replacing patterns in a string

Question: I have several strings in this format. The separator is a dash ( - ) and each "thing" in between is a marker.

string <- "FA-I2-I2-I2-EX-I2-I3-FA-I1-I2-TR-I1-I2-FA-I3-I1-FAFANR-I3-I2-TR-I1-I2-I1-I2-FA-I2-I1-I3-FAQU-I1-I2-I2-I2-NR-I2-I2-NR-I1-I2-I1-NR-I3-QU-I2-I3-QUNR-I2-I1-NRQUQU-I2-I1-EX"

I want to identify cases where markers containing the letter "I" occur in a row (i.e. the markers I1, I2, and I3). Then I want to replace those with a description that has no separators. For example, the
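The excerpt is cut off before the desired output is shown, so the following is only a rough sketch of one possible reading: it lists every run of two or more consecutive I-markers and then drops the dashes inside each run. The shortened input `s` (a prefix of the string above), the variable names, and the choice of "remove the inner separators" as the replacement are all assumptions, not the asker's stated target.

```r
# Assumption: a shortened prefix of the question's string, for readability.
s <- "FA-I2-I2-I2-EX-I2-I3-FA"

# Find runs of two or more I-markers (I1, I2, I3) separated by dashes.
runs <- regmatches(s, gregexpr("I[1-3](-I[1-3])+", s))[[1]]

# Remove a dash only when it sits between two I-markers.
# The lookahead (?=...) requires perl = TRUE.
collapsed <- gsub("(I[1-3])-(?=I[1-3])", "\\1", s, perl = TRUE)

print(runs)
print(collapsed)
```

If the intended replacement is a different description (for example a count of markers per run), the same `runs` vector could serve as the starting point.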

how to subset rows in specific columns based on minimum values in individual columns in a dataframe using R

Question: We have a data frame with thousands of rows and multiple columns. A sample data frame is presented below:

df1 <- data.frame(X = c(7.48, 7.82, 8.15, 8.47, 8.80, 9.20, 9.51, 9.83, 10.13, 10.59, 7.59, 8.06, 8.39, 8.87, 9.26, 9.64, 10.09, 10.48, 10.88, 11.45), Y = c(49.16, 48.78, 48.40, 48.03, 47.65, 47.24, 46.87, 46.51, 46.15, 45.73, 48.70, 48.18, 47.72, 47.20, 46.71, 46.23, 45.72, 45.24, 44.77, 44.23), ID = c("B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1", "B_1_2", "B_1

How to remove duplicate comma separated character values from each cell of a column using R

Question: I have a data frame with 2 columns, ID and Product, as below:

ID  Product
A   Clothing, Clothing Food, Furniture, Furniture
B   Food,Food,Food, Clothing
C   Food, Clothing, Clothing

I need to have only unique products for each ID, for example:

ID  Product
A   Clothing, Food, Furniture
B   Food, Clothing
C   Food, Clothing

How do I do this using R?

Answer 1: If there are multiple delimiters in the dataset, one way would be to split the 'Product' column using all the delimiters, get the unique and then paste it
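The split, unique, paste idea described in the answer could be sketched in base R roughly as follows. The helper name `dedupe_products` is hypothetical, and the sketch assumes commas and whitespace are the only delimiters, which also means a product name containing a space would be split into separate items.

```r
# Sample data copied from the question.
df <- data.frame(
  ID = c("A", "B", "C"),
  Product = c("Clothing, Clothing Food, Furniture, Furniture",
              "Food,Food,Food, Clothing",
              "Food, Clothing, Clothing"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split on commas and/or whitespace, keep unique
# non-empty items in order of first appearance, and paste them back.
dedupe_products <- function(x) {
  parts <- strsplit(x, "[,[:space:]]+")
  vapply(parts,
         function(p) paste(unique(p[nzchar(p)]), collapse = ", "),
         character(1))
}

df$Product <- dedupe_products(df$Product)
print(df)
```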

Is there a way to loop through data based on factor in a column and add up the number of rows?

Question: I have some data in which I have multiple observations of the same event. Based on a time threshold, I want to condense the observations, but I also want to know how many I am condensing (i.e. how many observations become one). I'm not sure how to loop through my data frame to do that. I've tried writing for loops, if statements, and while statements, and have searched tirelessly on Google and on Stack Overflow. Nothing seems to relate to what I need to do. Here is a subset of my data
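The asker's data is cut off in this excerpt, so the sketch below uses made-up data to illustrate one loop-free way to condense observations by a time threshold while counting them. The column names `event` and `time`, the threshold of 10, and the rule "start a new group when the gap to the previous observation of the same event exceeds the threshold" are all assumptions.

```r
# Hypothetical data: repeated observations of events at given times.
obs <- data.frame(
  event = c("a", "a", "a", "b", "b"),
  time  = c(0, 5, 40, 3, 4),
  stringsAsFactors = FALSE
)
threshold <- 10  # assumed time threshold

obs <- obs[order(obs$event, obs$time), ]

# Gap to the previous observation within the same event (Inf for the first).
gap <- ave(obs$time, obs$event, FUN = function(t) c(Inf, diff(t)))

# A new group starts whenever the gap exceeds the threshold.
obs$group <- cumsum(gap > threshold)

# One row per group, with the number of observations that were condensed.
condensed <- data.frame(
  event      = as.vector(tapply(obs$event, obs$group, function(e) e[1])),
  first_time = as.vector(tapply(obs$time, obs$group, min)),
  n_obs      = as.vector(table(obs$group))
)
print(condensed)
```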

R - Add row index to a data frame but handle ties with minimum rank

Question: I successfully used the answer in the SO thread r-how-to-add-row-index-to-a-data-frame-based-on-combination-of-factors, but I need to handle situations where two (or more) rows can be tied.

df <- data.frame( season = c(2014,2014,2014,2014,2014,2014, 2014, 2014), week = c(1,1,1,1,2,2,2,2), player.name = c("Matt Ryan","Peyton Manning","Cam Newton","Matthew Stafford","Carson Palmer","Andrew Luck", "Aaron Rodgers", "Chad Henne"), fant.pts.passing = c(28,19,29,28,18,22,29,22) )

df <- df[order(-df
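One way to get an index where tied rows share the lowest rank is `rank()` with `ties.method = "min"`, applied within each season and week via `ave()`. This is a sketch rather than the answer from the original thread, and it assumes that higher `fant.pts.passing` should rank first; the column name `rank` is also an assumption.

```r
# Data copied from the question.
df <- data.frame(
  season = rep(2014, 8),
  week = c(1, 1, 1, 1, 2, 2, 2, 2),
  player.name = c("Matt Ryan", "Peyton Manning", "Cam Newton",
                  "Matthew Stafford", "Carson Palmer", "Andrew Luck",
                  "Aaron Rodgers", "Chad Henne"),
  fant.pts.passing = c(28, 19, 29, 28, 18, 22, 29, 22),
  stringsAsFactors = FALSE
)

# Rank within each season/week; negating the points ranks the highest
# score first, and ties.method = "min" gives tied rows the same lowest rank.
df$rank <- ave(-df$fant.pts.passing, df$season, df$week,
               FUN = function(x) rank(x, ties.method = "min"))

print(df[order(df$season, df$week, df$rank), ])
```

With this approach two players tied for second both get rank 2 and the next player gets rank 4.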

Find value within a range in R

Question: My data look like the sample below. I want to select values greater than or equal to 35 and less than or equal to 350. I also want to replace those values with withinrange.

value
1
35
36
37
350
355
3555
35555

Answer 1: To select the values:

value[value >= 35 & value <= 350]

To replace them with withinrange:

value[value >= 35 & value <= 350] <- withinrange

Source: https://stackoverflow.com/questions/44954507/find-value-within-a-range-in-r
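A self-contained version of the answer's indexing, using the sample values from the question, might look like the sketch below. It assumes `withinrange` is meant as a text label rather than an existing variable; in that case the result has to be a character vector, so the replacement is written to a new vector (`labelled`, a hypothetical name) instead of overwriting the numeric one.

```r
# Sample values copied from the question.
value <- c(1, 35, 36, 37, 350, 355, 3555, 35555)

# Logical mask for the closed interval [35, 350].
in_range <- value >= 35 & value <= 350

# Select the values that fall inside the range.
selected <- value[in_range]

# Replace in-range values with a label; mixing text and numbers
# forces the whole vector to character.
labelled <- ifelse(in_range, "withinrange", as.character(value))

print(selected)
print(labelled)
```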

R - Obtaining the highest/lowest value in a set of columns defined by the value in a different dataframe

Question: I have two data frames: one (A) containing the start and end dates (Julian dates, i.e. a continuous count of days) of an event, and the other (B) containing values at dates from the start to beyond the end dates in the first data frame. The start date in A is stable; the end date varies. For each row, I want to identify the value with the greatest magnitude of change (highest and/or lowest value) between the start and end dates in the series in B, then write it to a new data frame. Example

Extract One Set of Gene Coordinates From a File Containing Several Sets of Gene Coordinates

Source: https://stackoverflow.com/questions/63623568/extract-one-set-of-gene-coordinates-from-a-file-containing-several-sets-of-gene