strsplit | 易学教程

splitting comma separated mixed text and numeric string with strsplit in R

Question: I have many strings of the form "name1, name2 and name3, 0, 1, 2" or "name1, name2, name3 and name4, 0, 1, 2" and would like to split each string into 4 elements, where the first one is the whole text string of names. The problem is that strsplit doesn't differentiate between text and numbers, so it splits the string into 5 elements in the first case and into 6 elements in the second. How can I tell R to dynamically skip the text part of the string, which has a variable number of names?

Answer 1: You
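One possible approach, sketched under the assumption that names never begin with a digit while the numeric fields always do: split only on commas that are directly followed by a number, using a lookahead with perl = TRUE. The vector name x is illustrative and not from the question.

```r
x <- c("name1, name2 and name3, 0, 1, 2",
       "name1, name2, name3 and name4, 0, 1, 2")

# Split on a comma plus optional whitespace, but only where a digit follows.
# Commas between names are followed by a letter, so they are left alone.
# Assumption: no name starts with a digit.
parts <- strsplit(x, ",\\s*(?=\\d)", perl = TRUE)

lengths(parts)   # number of pieces per input string
parts[[1]]       # "name1, name2 and name3" "0" "1" "2"
```

If some names can start with a digit, this pattern would split too early, and a rule anchored on the last three fields would probably be safer.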

Time difference per person between consecutive rows

Question: I have some data which (broadly speaking) consists of the following fields:

    Person  TaskID  Start_time                       End_time
    Alpha   1       'Wed, 18 Oct 2017 10:10:03 GMT'  'Wed. 18 Oct 2017 10:10:36 GMT'
    Alpha   2       'Wed, 18 Oct 2017 10:11:16 GMT'  'Wed, 18 Oct 2017 10:11:28 GMT'
    Beta    1       'Wed, 18 Oct 2017 10:12:03 GMT'  'Wed, 18 Oct 2017 10:12:49 GMT'
    Alpha   3       'Wed, 18 Oct 2017 10:12:03 GMT'  'Wed, 18 Oct 2017 10:13:13 GMT'
    Gamma   1       'Fri, 27 Oct 2017 22:57:12 GMT'  'Sat, 28 Oct 2017 02:00:54 GMT'
    Beta    2       'Wed, 18 Oct 2017 10:13:40
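The excerpt is cut off before the actual request, so the following is only a sketch under one assumption: the wanted quantity is the gap, per person, between the end of one task and the start of that person's next task. The helper parse_http_date is a hypothetical name. It splits each timestamp with strsplit instead of relying on locale-dependent month parsing, and only the five complete rows from the excerpt are used.

```r
# Hypothetical helper: turn "Wed, 18 Oct 2017 10:10:03 GMT" into POSIXct.
# Splitting on spaces, commas, periods and colons gives
# weekday, day, month, year, hour, minute, second, zone.
parse_http_date <- function(x) {
  parts <- do.call(rbind, strsplit(x, "[ ,.:]+"))
  ISOdatetime(
    year  = as.integer(parts[, 4]),
    month = match(parts[, 3], month.abb),  # month.abb is always English
    day   = as.integer(parts[, 2]),
    hour  = as.integer(parts[, 5]),
    min   = as.integer(parts[, 6]),
    sec   = as.integer(parts[, 7]),
    tz    = "GMT"
  )
}

tasks <- data.frame(
  Person = c("Alpha", "Alpha", "Beta", "Alpha", "Gamma"),
  TaskID = c(1, 2, 1, 3, 1),
  Start_time = c("Wed, 18 Oct 2017 10:10:03 GMT", "Wed, 18 Oct 2017 10:11:16 GMT",
                 "Wed, 18 Oct 2017 10:12:03 GMT", "Wed, 18 Oct 2017 10:12:03 GMT",
                 "Fri, 27 Oct 2017 22:57:12 GMT"),
  End_time = c("Wed. 18 Oct 2017 10:10:36 GMT", "Wed, 18 Oct 2017 10:11:28 GMT",
               "Wed, 18 Oct 2017 10:12:49 GMT", "Wed, 18 Oct 2017 10:13:13 GMT",
               "Sat, 28 Oct 2017 02:00:54 GMT"),
  stringsAsFactors = FALSE
)

tasks$start <- parse_http_date(tasks$Start_time)
tasks$end   <- parse_http_date(tasks$End_time)
tasks <- tasks[order(tasks$Person, tasks$start), ]

# Within each person, shift the end times down one row, then subtract.
prev_end <- ave(as.numeric(tasks$end), tasks$Person,
                FUN = function(e) c(NA, head(e, -1)))
tasks$gap_secs <- as.numeric(tasks$start) - prev_end

tasks[, c("Person", "TaskID", "gap_secs")]
```

The first task of each person has no predecessor, so its gap is NA.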

removing duplicate words in a row

Question: I have a column in a table as below:

    Col1
    ========================
    "No","No","No","No","No"
    "No","No","No"
    Yes
    No
    "Yes","Yes","Yes","Yes"
    "Yes","No","Yes", "Yes

I am trying to remove the duplicate No and Yes values and create a column like this:

    Col1
    ========================
    No
    No
    Yes
    No
    Yes
    Yes, No

I started with:

    kickDuplicates <- c("No","Yes")
    # create a list of vectors of place names
    broken <- strsplit(Table1$Col1, ",")
    # paste each broken vector of place names back together
    # .......kicking out duplicated
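A minimal sketch of one way to finish this idea, assuming the quotes are not needed in the result: strip the quotes, split on commas, keep the unique values in order of first appearance, and paste them back together. The vector col1 stands in for Table1$Col1 and reproduces the six rows shown in the question, including the unbalanced quote in the last row.

```r
col1 <- c('"No","No","No","No","No"',
          '"No","No","No"',
          'Yes',
          'No',
          '"Yes","Yes","Yes","Yes"',
          '"Yes","No","Yes", "Yes')

# Remove every double quote, then split on commas with optional spaces.
cleaned <- gsub('"', "", col1, fixed = TRUE)
tokens  <- strsplit(cleaned, "\\s*,\\s*")

# unique() keeps the first occurrence of each value, in order.
deduped <- vapply(tokens,
                  function(tok) paste(unique(tok), collapse = ", "),
                  character(1))
deduped
```

This drops every repeated value in a row, not only adjacent repeats, which matches the "Yes, No" result shown for the last row.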

Split column of comma-separated numbers into multiple columns based on value

Question: I have a column f in my dataframe that I would like to spread into multiple columns based on the values in that column. For example:

    df <- structure(list(f = c(NA, "18,17,10", "12,8", "17,11,6", "18", "12", "12", NA, "17,11", "12")), .Names = "f", row.names = c(NA, 10L), class = "data.frame")
    df
    #           f
    # 1      <NA>
    # 2  18,17,10
    # 3      12,8
    # 4   17,11,6
    # 5        18
    # 6        12
    # 7        12
    # 8      <NA>
    # 9     17,11
    # 10       12

How would I split column f into multiple columns indicating the numbers in the row? I'm interested in
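Because the excerpt ends before the desired output is shown, the sketch below assumes one 0/1 indicator column per distinct number. The column prefix f_ and the choice to give NA rows all zeros are illustrative assumptions.

```r
df <- data.frame(
  f = c(NA, "18,17,10", "12,8", "17,11,6", "18", "12", "12", NA, "17,11", "12"),
  stringsAsFactors = FALSE
)

# strsplit keeps NA inputs as a single NA element.
values <- strsplit(df$f, ",", fixed = TRUE)

# Every distinct number that occurs, sorted; sort() drops the NA.
lvls <- sort(unique(as.integer(unlist(values))))

# One row per original row, one 0/1 column per distinct number.
indicators <- t(vapply(values,
                       function(v) as.integer(lvls %in% as.integer(v)),
                       integer(length(lvls))))
colnames(indicators) <- paste0("f_", lvls)

result <- cbind(df, indicators)
result
```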

How to split a character vector based on a numeric vector for positions

Question: I would like to split a character vector into substrings based on a second numeric vector that gives the splitting points:

    vec <- "LAYRVCMTNEGHPWVSLVVQKTRLQISQDPSLNYEYLPTMGLKSFIQASLALLFGKHSQAIVENRVGGVHTVGDSGAFQLGVQFLRAWHKDARIVYIISSQKELHGLVFQDMGFTVYEYSVWDPKKLCMDPDILLNVVEQIPHGCVLVMGNIIDCKLTPSGWAKLMSM"
    split.points <- c(25, 32, 55, 90, 124)

I would like to cut the above character vector at the positions given in the split.points vector into six different substrings. It sounds very simple, but the split
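Since strsplit works on patterns rather than positions, a position-based cut may be easier with substring, which is vectorised over start and end indices. This sketch assumes each split point is the last character of its piece; if the split point should instead begin the next piece, the offsets shift by one.

```r
vec <- "LAYRVCMTNEGHPWVSLVVQKTRLQISQDPSLNYEYLPTMGLKSFIQASLALLFGKHSQAIVENRVGGVHTVGDSGAFQLGVQFLRAWHKDARIVYIISSQKELHGLVFQDMGFTVYEYSVWDPKKLCMDPDILLNVVEQIPHGCVLVMGNIIDCKLTPSGWAKLMSM"
split.points <- c(25, 32, 55, 90, 124)

# Each piece starts one past the previous split point and ends on the next.
starts <- c(1, split.points + 1)
ends   <- c(split.points, nchar(vec))

pieces <- substring(vec, starts, ends)
nchar(pieces)   # lengths of the six pieces
```

Pasting the pieces back together with paste(pieces, collapse = "") should reproduce vec, which is a quick sanity check.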

expand.grid when one variable is really two columns

Question: I have a data set with districts, counties and years. If a given district/county combination occurs in any year, I want that combination to occur in every year. Below are two ways I have figured out to do this. The first approach uses a function to create combinations of district, county and year and only requires six lines of code. The bottom approach uses a combination of paste, expand.grid and strsplit and is much more complex/convoluted. There are probably much more efficient methods than
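Neither of the asker's two approaches survives in the excerpt, so the following is a separate sketch rather than a reconstruction. It treats each observed district/county pair as a single unit and cross-joins the pairs with the years via merge(..., by = NULL), which avoids pasting and re-splitting keys. The data frame obs and its values are made up for illustration.

```r
# Hypothetical example data: three observed district/county pairs.
obs <- data.frame(
  district = c(1, 1, 2),
  county   = c("A", "B", "A"),
  year     = c(2000, 2001, 2000),
  stringsAsFactors = FALSE
)

# Distinct pairs that occur in any year, and the distinct years.
pairs <- unique(obs[, c("district", "county")])
years <- data.frame(year = sort(unique(obs$year)))

# by = NULL asks merge() for the Cartesian product of the two tables.
full <- merge(pairs, years, by = NULL)
full[order(full$district, full$county, full$year), ]
```

Pairs that never occur (district 2 with county B here) are not created, which is the difference from calling expand.grid on the three columns separately.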

R: split only when special regex condition doesn't match

Question: How would you split at every and/ERT only when it is not succeeded by "/V" inside the one word that follows, in:

    text <- c("faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT not else/VHGB propositions one and/ERT two/CDF and/ERT three/ABC")

    # my try - doesn't work!
    > strsplit(text, "(?<=and/ERT)\\s(?!./V.)", perl=TRUE)
                                       ^^^^
    # Expected return
    [[1]]
    [1] "faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT"
    [2] "not else/VHGB propositions one and/ERT"
    [3] "two/CDF
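The attempt above probably fails because "." in the lookahead matches exactly one character, so "./V" only rejects a next word whose second character is the slash. A sketch of one possible fix, assuming "word" means a run of non-space characters, is to let the lookahead scan the whole next word with \S*:

```r
text <- c("faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT not else/VHGB propositions one and/ERT two/CDF and/ERT three/ABC")

# Split on the space after "and/ERT" unless the next word contains "/V".
# \S* stays inside the next word because it cannot cross a space.
pieces <- strsplit(text, "(?<=and/ERT)\\s(?!\\S*/V)", perl = TRUE)[[1]]
pieces
```

On this input the first two pieces agree with the expected return quoted in the question, and the remainder splits into "two/CDF and/ERT" and "three/ABC".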

strsplit in R with a metacharacter

Question: I have a large amount of data where the delimiter is a backslash. I'm processing it in R and I'm having a hard time finding out how to split the string, since the backslash is a metacharacter. For example, a string would look like this:

    1128\0019\XA5\E2R\366\00=15

and I want to split it along the \ character, but when I run the strsplit command:

    strsplit(tempStr, "\\")
    Error in strsplit(tempStr, "\\") : invalid regular expression '\', reason 'Trailing backslash'

When I try to use the "fixed"
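The error arises because the R literal "\\" is a single backslash, and a lone backslash is an incomplete escape once it is read as a regular expression. Two ways around it are sketched below. The sample string is typed with doubled backslashes so that R stores single ones; data read from a file would not need that. What went wrong with the asker's own fixed attempt is not visible in the excerpt.

```r
# In R source, "\\" denotes one literal backslash character.
tempStr <- "1128\\0019\\XA5\\E2R\\366\\00=15"

# Option 1: treat the split pattern as a literal string, not a regex.
fixed_split <- strsplit(tempStr, "\\", fixed = TRUE)[[1]]

# Option 2: keep regex mode and escape the backslash for the regex engine,
# which takes four backslashes in the R literal.
regex_split <- strsplit(tempStr, "\\\\")[[1]]

fixed_split   # "1128" "0019" "XA5" "E2R" "366" "00=15"
```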

R: gsub and str_split_fixed in data.tables

Question: I am "converting" from data.frame to data.table. I now have a data.table:

    library(data.table)
    DT = data.table(ID = c("ab_cd.de","ab_ci.de","fb_cd.de","xy_cd.de"))
    DT
             ID
    1: ab_cd.de
    2: ab_ci.de
    3: fb_cd.de
    4: xy_cd.de
    new_DT <- data.table(matrix(ncol = 2))
    colnames(new_DT) <- c("test1", "test2")

I would like to first delete ".de" after every entry and, in the next step, separate every entry by the underscore and save the output in two new columns. The final output should look like this:

    test1
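The expected output is cut off, so this sketch assumes the two new columns should hold the parts before and after the underscore. The first half uses only base R. The second half does the same thing by reference with data.table's tstrsplit and runs only if data.table is installed.

```r
ids <- c("ab_cd.de", "ab_ci.de", "fb_cd.de", "xy_cd.de")

# Step 1: drop a trailing ".de"; step 2: split on the underscore.
stems <- sub("\\.de$", "", ids)
parts <- do.call(rbind, strsplit(stems, "_", fixed = TRUE))

new_df <- data.frame(test1 = parts[, 1], test2 = parts[, 2],
                     stringsAsFactors = FALSE)
new_df

# Same idea inside a data.table, adding both columns in one step.
if (requireNamespace("data.table", quietly = TRUE)) {
  DT <- data.table::data.table(ID = ids)
  DT[, c("test1", "test2") :=
       data.table::tstrsplit(sub("\\.de$", "", ID), "_", fixed = TRUE)]
  print(DT)
}
```

With the := form there is no need to pre-allocate an empty new_DT, since the columns are created on assignment.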

Extracting a number of a string of varying lengths [duplicate]

Question: This question already has answers here: Extracting numbers from vectors of strings (11 answers). Closed 3 years ago.

Pretend I have a vector:

    testVector <- c("I have 10 cars", "6 cars", "You have 4 cars", "15 cars")

Is there a way to go about parsing this vector, so I can store just the numerical values: 10, 6, 4, 15? If the problem were just "15 cars" and "6 cars", I know how to parse that, but I'm having difficulty with the strings that have text in front too! Any help is greatly
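One way this could be done without splitting at all, assuming each string contains exactly one run of digits: locate the first digit run with regexpr and pull it out with regmatches.

```r
testVector <- c("I have 10 cars", "6 cars", "You have 4 cars", "15 cars")

# regexpr finds the first run of digits in each string;
# regmatches extracts the matched text, which is then converted to integer.
m <- regexpr("[0-9]+", testVector)
numbers <- as.integer(regmatches(testVector, m))
numbers   # 10 6 4 15
```

Note that regmatches silently drops strings with no match, so the result can be shorter than the input if some strings contain no digits.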