strsplit

splitting comma separated mixed text and numeric string with strsplit in R

谁都会走 submitted on 2019-12-13 06:24:03
Question: I have many strings of the form "name1, name2 and name3, 0, 1, 2" or "name1, name2, name3 and name4, 0, 1, 2" and would like to split each one into 4 elements, where the first element is the whole text string of names. The problem is that strsplit doesn't differentiate between text and numbers, so it splits the string into 5 elements in the first case and into 6 elements in the second. How can I tell R to dynamically skip the text part of the string, which has a variable number of names? Answer 1: You
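
One possible approach to the question above, offered as a minimal sketch rather than the answer given on the original page: split only on commas that are followed by a digit, using a Perl-style lookahead. This assumes that no name begins with a digit; the vector `x` and the variable `parts` are hypothetical names used for illustration.

```r
# Hypothetical inputs modelled on the two string shapes described in the question.
x <- c("name1, name2 and name3, 0, 1, 2",
       "name1, name2, name3 and name4, 0, 1, 2")

# Split on a comma plus optional whitespace, but only when a digit follows.
# Assumption: names never start with a digit, so commas between names are kept.
parts <- strsplit(x, ",\\s*(?=[0-9])", perl = TRUE)

print(parts)
```

Each list element should then hold the names as one string followed by the three numeric fields as character values; `as.numeric()` can convert those afterwards if needed.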

Time difference per person between consecutive rows

佐手、 submitted on 2019-12-13 03:43:27
Question: I have some data which (broadly speaking) consists of the following fields:

Person  TaskID  Start_time                       End_time
Alpha   1       'Wed, 18 Oct 2017 10:10:03 GMT'  'Wed, 18 Oct 2017 10:10:36 GMT'
Alpha   2       'Wed, 18 Oct 2017 10:11:16 GMT'  'Wed, 18 Oct 2017 10:11:28 GMT'
Beta    1       'Wed, 18 Oct 2017 10:12:03 GMT'  'Wed, 18 Oct 2017 10:12:49 GMT'
Alpha   3       'Wed, 18 Oct 2017 10:12:03 GMT'  'Wed, 18 Oct 2017 10:13:13 GMT'
Gamma   1       'Fri, 27 Oct 2017 22:57:12 GMT'  'Sat, 28 Oct 2017 02:00:54 GMT'
Beta    2       'Wed, 18 Oct 2017 10:13:40

removing duplicate words in a row

狂风中的少年 submitted on 2019-12-13 02:31:24
Question: I have a column in a table as below:

Col1
========================
"No","No","No","No","No"
"No","No","No"
Yes
No
"Yes","Yes","Yes","Yes"
"Yes","No","Yes", "Yes

I am trying to remove the duplicate No and Yes values and create a column like this:

Col1
========================
No
No
Yes
No
Yes
Yes, No

I started with:

kickDuplicates <- c("No","Yes")
# create a list of vectors of place names
broken <- strsplit(Table1$Col1, ",")
# paste each broken vector of place names back together
# .......kicking out duplicated

Split column of comma-separated numbers into multiple columns based on value

江枫思渺然 submitted on 2019-12-12 18:28:19
Question: I have a column f in my dataframe that I would like to spread into multiple columns based on the values in that column. For example:

df <- structure(list(f = c(NA, "18,17,10", "12,8", "17,11,6", "18", "12", "12", NA, "17,11", "12")),
                .Names = "f", row.names = c(NA, 10L), class = "data.frame")
df
#           f
# 1      <NA>
# 2  18,17,10
# 3      12,8
# 4   17,11,6
# 5        18
# 6        12
# 7        12
# 8      <NA>
# 9     17,11
# 10       12

How would I split column f into multiple columns indicating the numbers in the row? I'm interested in

How to split a character vector based on a numeric vector for positions

大兔子大兔子 submitted on 2019-12-12 16:41:52
Question: I would like to split a character vector into substrings based on a second, numeric vector that gives the splitting points:

vec <- "LAYRVCMTNEGHPWVSLVVQKTRLQISQDPSLNYEYLPTMGLKSFIQASLALLFGKHSQAIVENRVGGVHTVGDSGAFQLGVQFLRAWHKDARIVYIISSQKELHGLVFQDMGFTVYEYSVWDPKKLCMDPDILLNVVEQIPHGCVLVMGNIIDCKLTPSGWAKLMSM"
split.points <- c(25, 32, 55, 90, 124)

I would like to cut the above character vector at the positions given in the split.points vector into six different substrings. It sounds very simple, but the split
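
A minimal sketch of cutting a string at given positions with base R's vectorised `substring()`, shown on a short hypothetical input so the result is easy to check. It assumes each split point is the index of the last character of a piece; if the split points are meant to be the first character of the next piece, the offsets shift by one. The names `starts`, `ends` and `pieces` are illustrative.

```r
# Hypothetical short input; the question's own vec and split.points work the same way.
vec <- "ABCDEFGHIJ"
split.points <- c(3, 7)

# Assumption: each split point marks the LAST character of a piece.
starts <- c(1, split.points + 1)
ends   <- c(split.points, nchar(vec))

# substring() is vectorised over the start and end positions.
pieces <- substring(vec, starts, ends)

print(pieces)
```

With n split points this yields n + 1 pieces, which matches the six substrings expected from five split points in the question.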

expand.grid when one variable is really two columns

我与影子孤独终老i submitted on 2019-12-12 15:55:53
Question: I have a data set with districts, counties and years. If a given district/county combination occurs in any year, I want that combination to occur in every year. Below are two ways I have figured out to do this. The first approach uses a function to create combinations of district, county and year and only requires six lines of code. The bottom approach uses a combination of paste, expand.grid and strsplit and is much more complex/convoluted. There are probably much more efficient methods than

R: split only when special regex condition doesn't match

泄露秘密 submitted on 2019-12-12 11:58:30
Question: How would you split at every and/ERT only when it is not followed by "/V" within the next word, in:

text <- c("faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT not else/VHGB propositions one and/ERT two/CDF and/ERT three/ABC")

# my try - !doesn't work
> strsplit(text, "(?<=and/ERT)\\s(?!./V.)", perl=TRUE)
^^^^

# Expected return
[[1]]
[1] "faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT"
[2] "not else/VHGB propositions one and/ERT"
[3] "two/CDF

strsplit in R with a metacharacter

非 Y 不嫁゛ submitted on 2019-12-12 10:53:57
Question: I have a large amount of data where the delimiter is a backslash. I'm processing it in R and I'm having a hard time finding how to split the string, since the backslash is a metacharacter. For example, a string would look like this:

1128\0019\XA5\E2R\366\00=15

and I want to split it along the \ character, but when I run the strsplit command:

strsplit(tempStr, "\\")
Error in strsplit(tempStr, "\\") : invalid regular expression '\', reason 'Trailing backslash'

When I try to use the "fixed"
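
A minimal sketch of splitting on a literal backslash, assuming the value really contains single backslash characters. Two layers of escaping are involved: in R source code a backslash inside a string literal is typed as `"\\"`, and in a regular expression a backslash must itself be escaped. With `fixed = TRUE` only the first layer applies. The sample value is typed by hand for illustration, and `parts_fixed` and `parts_regex` are hypothetical names.

```r
# Sample string from the question. Each "\\" in the source is ONE backslash in the value.
tempStr <- "1128\\0019\\XA5\\E2R\\366\\00=15"

# fixed = TRUE: the split string is taken literally, so "\\" is a single backslash.
parts_fixed <- strsplit(tempStr, "\\", fixed = TRUE)[[1]]

# Regex form: the pattern needs an escaped backslash, written "\\\\" in R source.
parts_regex <- strsplit(tempStr, "\\\\")[[1]]

print(parts_fixed)
```

Both calls should return the same six fields. Note that data read from a file with `readLines()` already holds real backslashes, whereas a literal typed with single backslashes in R source would be parsed as escape sequences before `strsplit` ever sees it.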

R: gsub and str_split_fixed in data.tables

落爺英雄遲暮 submitted on 2019-12-12 04:36:31
Question: I am "converting" from data.frame to data.table. I now have a data.table:

library(data.table)
DT = data.table(ID = c("ab_cd.de","ab_ci.de","fb_cd.de","xy_cd.de"))
DT
         ID
1: ab_cd.de
2: ab_ci.de
3: fb_cd.de
4: xy_cd.de
new_DT <- data.table(matrix(ncol = 2))
colnames(new_DT) <- c("test1", "test2")

I would like to first delete ".de" after every entry and then, in the next step, separate every entry by the underscore and save the output in two new columns. The final output should look like this:

test1

Extracting a number of a string of varying lengths [duplicate]

半腔热情 submitted on 2019-12-12 01:46:50
Question: This question already has answers here: Extracting numbers from vectors of strings (11 answers). Closed 3 years ago.

Pretend I have a vector:

testVector <- c("I have 10 cars", "6 cars", "You have 4 cars", "15 cars")

Is there a way to go about parsing this vector so I can store just the numerical values: 10, 6, 4, 15? If the problem were just "15 cars" and "6 cars", I know how to parse that, but I'm having difficulty with the strings that have text in front too! Any help is greatly
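
A minimal sketch of one common way to do this in base R: delete every non-digit character with `gsub()` and convert what remains. It assumes each string contains exactly one number; a string with several numbers would have their digits run together, and a string with none would give `NA`. The name `numbers` is illustrative.

```r
testVector <- c("I have 10 cars", "6 cars", "You have 4 cars", "15 cars")

# Remove everything that is not a digit, then convert the remainder to numeric.
# Assumption: exactly one number per string.
numbers <- as.numeric(gsub("[^0-9]", "", testVector))

print(numbers)
```

Because the pattern does not depend on where the digits sit in the string, leading text such as "I have" needs no special handling.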