Question
I have a string in some text of the form "12,34,77", including the quotation marks.
I need to get the values of each of those numbers into a list. I tried using lapply and strsplit:
control2=lapply(strsplit(data$values,","),as.numeric)
but I get the error:
non-character argument
What am I doing wrong?
Answer 1:
1) strapply
1a) scalar Here is a one-liner using strapply from the gsubfn package:
library(gsubfn)
x <- '"12,34,567"'
strapply(x, "\\d+", as.numeric, simplify = c)
## [1] 12 34 567
1b) vectorized A vectorized version is even simpler -- just remove the simplify=c like this:
v <- c('"1,2,3"', '"8,9"') # test data
strapply(v, "\\d+", as.numeric)
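If installing gsubfn is not an option, the same digit-extraction idea can be sketched in base R with gregexpr and regmatches. This is only an illustration and not part of the gsubfn approach above; it assumes the values are non-negative integers, since the pattern matches runs of digits only.

```r
v <- c('"1,2,3"', '"8,9"')  # same test data as in 1b

# gregexpr() locates every run of digits in each string;
# regmatches() extracts them as a list of character vectors
digits <- regmatches(v, gregexpr("[0-9]+", v))

# convert each character vector to numeric
res <- lapply(digits, as.numeric)
res
## [[1]]
## [1] 1 2 3
##
## [[2]]
## [1] 8 9
```

Because only the digits are matched, the surrounding quotation marks never need to be removed explicitly.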
2) gsub and scan
2a) scalar And here is a one-liner using gsub and scan:
scan(text = gsub('"', '', x), what = 0, sep = ",")
## Read 3 items
## [1] 12 34 567
2b) vectorized A vectorized version would involve lapply-ing over the components:
lapply(v, function(x) scan(text = gsub('"', '', x), what = 0, sep = ","))
3) strsplit
3a) scalar And here is a strsplit solution. Note that we split on both " and , and that the leading quotation mark produces an empty first element, which [-1] drops:
as.numeric(strsplit(x, '[",]')[[1]][-1])
## [1] 12 34 567
3b) vectorized A vectorized solution would, again, involve lapply-ing over the components:
lapply(v, function(x) as.numeric(strsplit(x, '[",]')[[1]][-1]))
3c) vectorized - simpler A slightly simpler alternative is to remove the quotation marks first and then split:
lapply(strsplit(gsub('"', '', v), split = ","), as.numeric)
Answer 2:
I think your problem may stem from your source data. In any case, if you want to work with numbers, you will have to get rid of the quotes. I recommend gsub.
> x <- '"1,3,5"'
> x
[1] "\"1,3,5\""
> x <- gsub("\"", "", x)
> x
[1] "1,3,5"
> as.numeric(unlist(strsplit(x, ",")))
[1] 1 3 5
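One way the source data could trigger the "non-character argument" error is if data$values was read in as a factor rather than as character; strsplit only accepts character input. The sketch below assumes that is the case. The data frame here is hypothetical, built only to reproduce the situation.

```r
# Hypothetical data frame; assumes the column was read in as a factor
data <- data.frame(values = c('"12,34,77"', '"1,2"'),
                   stringsAsFactors = TRUE)

# strsplit(data$values, ",") would fail here, because a factor
# is not a character vector

# convert to character, strip the embedded quotes, then split and convert
vals <- gsub('"', '', as.character(data$values))
control2 <- lapply(strsplit(vals, ","), as.numeric)
control2
## [[1]]
## [1] 12 34 77
##
## [[2]]
## [1] 1 2
```

Checking class(data$values) on the real data would confirm whether this assumption applies.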
Answer 3:
Try this:
x <- "12,34,77"
sapply(strsplit(x, ",")[[1]], as.numeric, USE.NAMES=FALSE)
[1] 12 34 77
Since strsplit() returns a list of character vectors (one per input string), you need to extract the first element and pass it to sapply().
If, however, your string really contains embedded quotes, you need to remove them first. You can use gsub() for this:
x <- '"12,34,77"'
sapply(strsplit(gsub('"', '', x), ",")[[1]], as.numeric, USE.NAMES=FALSE)
[1] 12 34 77
Answer 4:
As has already been pointed out, you need to regex out the quotation marks first.
The destring function in the taRifx package will do that (it removes any non-numeric characters) and then coerce to numeric:
test <- '"12,34,77"'
library(taRifx)
lapply(strsplit(test,","),destring)
[[1]]
[1] 12 34 77
Source: https://stackoverflow.com/questions/11473960/strsplit-and-lapply