strsplit | 易学教程

in R: find all unique values in column separated by comma

Question: I have multiple observations of one species with different observers or groups of observers, and I want to create a list of all unique observers. My data look like this:

    data <- read.table(text="species observer
    1 A,B
    1 A,B
    1 B,E
    1 B,E
    1 D,E,A,C,C
    1 F", header = TRUE, stringsAsFactors = FALSE)

My output should be a list of all unique observers, so: A,B,C,D,E,F. I tried to substring the data in column C using the following command, but that only returns the unique combinations of observers. all
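
One possible approach to the question above, offered as a minimal sketch rather than the accepted answer: split every entry on the comma with `strsplit`, flatten the resulting list with `unlist`, and deduplicate. The variable name `observers` is an assumption introduced here for illustration.

```r
# Sample data from the question
data <- read.table(text = "species observer
1 A,B
1 A,B
1 B,E
1 B,E
1 D,E,A,C,C
1 F", header = TRUE, stringsAsFactors = FALSE)

# strsplit() returns one character vector per row; unlist() flattens them,
# unique() drops repeats, and sort() puts the names in alphabetical order.
observers <- sort(unique(unlist(strsplit(data$observer, ",", fixed = TRUE))))
observers
```

Using `fixed = TRUE` treats the comma as a literal string rather than a regular expression, which is usually a little faster and avoids surprises with special characters.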

Splitting values in different columns in R

Source: https://stackoverflow.com/questions/64199683/splitting-values-in-different-columns-in-r

How to split strings into new rows while maintaining other columns in R [duplicate]

Question: I want to split a character vector column into multiple rows (of the same data frame) while maintaining the other columns (keep) in this reproducible example:

    dat <- structure(list(ID = c("E87", "E42", "E39", "E16,E17,E18", "E760,E761,E762"),
                          keep = 1:5),
                     row.names = c(NA, 5L), class = "data.frame")

    > dat
      ID keep
    1 E87 1
    2 E42 2
    3 E39 3
    4 E16,E17
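
A base-R sketch of one way to do this, assuming the delimiter is always a plain comma: split `ID` with `strsplit`, then repeat each `keep` value once per piece. The names `parts` and `long` are hypothetical and not taken from the original question.

```r
# Same data as in the question, built with data.frame() for clarity
dat <- data.frame(
  ID = c("E87", "E42", "E39", "E16,E17,E18", "E760,E761,E762"),
  keep = 1:5,
  stringsAsFactors = FALSE
)

# One character vector of IDs per original row
parts <- strsplit(dat$ID, ",", fixed = TRUE)

# lengths(parts) says how many pieces each row produced, so rep() can
# repeat the matching 'keep' value that many times.
long <- data.frame(
  ID = unlist(parts),
  keep = rep(dat$keep, lengths(parts)),
  stringsAsFactors = FALSE
)
long
```

Any further columns could be carried along the same way, by indexing the original data frame with `rep(seq_len(nrow(dat)), lengths(parts))`.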

R: split string into numeric and return the mean as a new column in a data frame

Question: I have a large data frame with columns that are character strings of numbers, such as "1, 2, 3, 4". I wish to add a new column that is the average of these numbers. I have set up the following example:

    set.seed(2015)
    library(dplyr)
    a <- c("1, 2, 3, 4", "2, 4, 6, 8", "3, 6, 9, 12")
    df <- data.frame(a)
    df$a <- as.character(df$a)

Now I can use strsplit to split the string and return the mean for a given row, where [[1]] selects the first row:

    mean(as.numeric(strsplit(df$a, split = ", ")[[1]]))
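
The single-row expression above can be extended to every row by applying the same function to each element of the `strsplit` result. The following is a sketch in base R; the column name `mean_a` is an assumption made for illustration.

```r
a <- c("1, 2, 3, 4", "2, 4, 6, 8", "3, 6, 9, 12")
df <- data.frame(a, stringsAsFactors = FALSE)

# strsplit() yields a list with one character vector per row;
# vapply() converts each to numeric and takes its mean, and the
# numeric(1) template guarantees exactly one number per row.
df$mean_a <- vapply(
  strsplit(df$a, ", ", fixed = TRUE),
  function(x) mean(as.numeric(x)),
  numeric(1)
)
df
```

`vapply` is used instead of `sapply` so that an unexpected input raises an error instead of silently changing the result type.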

best way to manipulate strings in big data.table

Question: I have a 67MM-row data.table with people's names and surnames separated by spaces. I just need to create a new column for each word. Here is a small subset of the data:

    n <- structure(list(Subscription_Id = c("13.855.231.846.091.000",
      "11.156.048.529.090.800", "24.940.584.090.830", "242.753.039.111.124",
      "27.843.782.090.830", "13.773.513.145.090.800", "25.691.374.090.830",
      "12.236.174.155.090.900", "252.027.904.121.210", "11.136.991.054.110.100"
      ), Account_Desc = c("AGUAYO CARLA", "LEIVA

R: how to avoid strsplit hiccuping on empty vectors when splitting text

Question: I have a list of text sections that need to be split into sentences:

    > textList <- list(sections=sections[(length(sections)-2):length(sections)])
    > textList$sentences <- sapply(textList$sections, function(x) strsplit(as.character(x), "(?<=und/KON)\\s(?!\\S+/V)|(?<=oder/KON)\\s|(?<=/\\$[[:punct:]])\\s(?!dass/KOUS)(?!dann/ADV)(?!weil/KOUS)", perl=TRUE))
    > sent <- textList$sentences

The final goal is to add IDs to all sentences and arrange them together into a list of data frames -
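
A hedged sketch of one way to keep empty sections from breaking the later steps: use `lapply` so the result is always a list, return an empty character vector for empty input, and drop empty results before building the data frames. The `sections` data and the single-space separator below are hypothetical stand-ins for the tagged text and the lookaround pattern in the question.

```r
# Hypothetical input: the second section is empty
sections <- list("a b c", character(0), "d e")

# lapply() always returns a list, unlike sapply(), whose return type
# depends on the shape of the results.
sentences <- lapply(sections, function(x) {
  if (length(x) == 0L) return(character(0))
  unlist(strsplit(as.character(x), " ", fixed = TRUE))
})

# Keep only the sections that produced at least one piece
non_empty <- sentences[lengths(sentences) > 0L]

# One data frame per remaining section, with a running ID per piece
frames <- lapply(non_empty, function(s) {
  data.frame(id = seq_along(s), sentence = s, stringsAsFactors = FALSE)
})
frames
```

With the real pattern, the `strsplit` call would take `perl = TRUE` in place of `fixed = TRUE`; the guard against empty input stays the same.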

Create new column with dplyr mutate and substring of existing column

Question: I have a data frame with a column of strings and want to extract substrings of those into a new column. Here is some sample code and data showing that I want to take the string after the final underscore character in the id column to create a new_id column. The id column entry always has 2 underscore characters, and it is always the final substring that I want.

    df = data.frame( id = I(c("abcd_123_ABC","abc_5234_NHYK")), x = c(1.0,2.0) )
    require(dplyr)
    df = df %>% dplyr::mutate(new_id =
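
A base-R sketch of one way to get the final piece, shown without dplyr so that it runs with no extra packages: split each id on the underscore and keep the last element. This is an illustration under the question's stated assumption that the wanted substring is always the last one, not necessarily the approach the original answers took.

```r
df <- data.frame(
  id = I(c("abcd_123_ABC", "abc_5234_NHYK")),
  x = c(1.0, 2.0)
)

# Split on the literal underscore and take the last piece of each id.
# as.character() drops the AsIs class added by I().
pieces <- strsplit(as.character(df$id), "_", fixed = TRUE)
df$new_id <- vapply(pieces, function(p) p[length(p)], character(1))

# An equivalent regular-expression route: delete everything up to and
# including the last underscore (".*" is greedy, so it reaches the last one).
new_id_sub <- sub(".*_", "", as.character(df$id))

df
```

Either expression could also be placed inside `dplyr::mutate()`, since both are vectorised over the whole column.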