R: splitting a numeric string

▼魔方 西西 提交于 2019-12-10 21:09:29

问题


I'm trying to split a numeric string of 40 digits (ie. splitting 123456789123456789123456789 into 1 2 3 4 etc.)

Unfortunately strsplit doesn't work as it requires characters, and converting the string using as.character doesn't work as it is very long and R automatically cuts off decimals for long digits (maximum is 22 decimals). I thus end up with "1.2345e+35" as a character string, instead of the full digit.

Is there a numeric variant of strsplit, or a work around to the decimal-cutting-off issue? I can't seem to find the answer on stackoverflow, but apologies if this has already been answered before. Thanks in advance!


回答1:


If R is calculating the number I do not know the solution. If the number is in a data file I think the code below might work. Although, if the number is in a data file there are probably much easier solutions.

a1 <- read.table("c:/users/Mark W Miller/simple R programs/long_number.txt", colClasses = 'character')

# a1 <- c('1234567891234567891234567891234567891234') ;

a1 <- as.character(a1) ;
a2 <- strsplit(a1, "") ;
a3 <- unlist(a2) ;
a4 <- as.vector(as.numeric(a3)) ;
a4
# [1] 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 2 3 4

EDIT

I realize I might not understand the question, and my answer is probably pretty silly. Nevertheless, if you have an entire data set of really long numbers you could split all of them with the code below. Note that there are no quotes in the file 'three_long_numbers.txt', and the data start out as numeric:

a1 <- read.table("c:/users/Mark W Miller/simple R programs/three_long_numbers.txt", colClasses = 'character')
a1

#      V1                                        
# [1,] "1234567891234567891234567891234567891234"
# [2,] "1888678912345678912345678912345678912388"
# [3,] "1234999891234567891234567891234567891239"

# a1 <- matrix(c(
# "1234567891234567891234567891234567891234",
# "1888678912345678912345678912345678912388",
# "1234999891234567891234567891234567891239"), nrow=3, byrow=T)

a1 <- as.matrix(a1) ;
a2 <- strsplit(a1, "") ;
a3 <- unlist(a2) ;
a3 <- as.numeric(a3) ;
a4 <- matrix(a3, nrow=dim(a1)[1], byrow=T)
a4



回答2:


You could simply do this to split as numeric vector:

s <- "123456789123456789123456789"
as.numeric(strsplit(s,"")[[1]])

# [1] 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9

or if you want them splitted as character vector:

strsplit(s,"")[[1]]

# [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "1" "2" "3" "4" "5" "6" "7" "8" 
# "9" "1" "2" "3" "4" "5" "6"
# [25] "7" "8" "9"



回答3:


Here is another approach that seems more straight-forward than my answer from a year ago:

Split a single vector:

a1 <- c('1234567891234567891234567891234567891234')
a2 <- read.fwf(textConnection(a1), widths=rep(1, nchar(a1)), colClasses = 'numeric', header=FALSE)
a2
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40
1  1  2  3  4  5  6  7  8  9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4

Read a file containing the following three long numbers of equal length:

# 1234567891234567891234567891234567891234
# 1888678912345678912345678912345678912388
# 1234999891234567891234567891234567891239

a1 <- read.table("c:/users/mmiller21/simple R programs/three_long_numbers.txt", colClasses = 'character', header = FALSE)
a2 <- read.fwf("c:/users/mmiller21/simple R programs/three_long_numbers.txt", widths=rep(1, max(nchar(a1$V1))), colClasses = 'numeric', header=FALSE)
a2

  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40
1  1  2  3  4  5  6  7  8  9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4
2  1  8  8  8  6  7  8  9  1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   8   8
3  1  2  3  4  9  9  9  8  9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   9

Read a file containing the following three long numbers of unequal length:

# 1234567891234567891234567891234567891234
# 188867891234567891234567891234567891238
# 12349998912345678912345678912345678912

a1 <- read.table("c:/users/mmiller21/simple R programs/three_long_numbersb.txt", colClasses = 'character', header = FALSE)
a2 <- read.fwf("c:/users/mmiller21/simple R programs/three_long_numbersb.txt", widths=rep(1, max(nchar(a1$V1))), colClasses = 'numeric', header=FALSE)
a2

  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40
1  1  2  3  4  5  6  7  8  9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4
2  1  8  8  8  6  7  8  9  1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   8  NA
3  1  2  3  4  9  9  9  8  9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2  NA  NA

Here is code to split one column of long numbers in a data file that contains multiple columns. In this example each number in column 2 has the same length:

# -10 1234567891234567891234567891234567891234 -100
# -20 1888678912345678912345678912345678912388 -200
# -30 1234999891234567891234567891234567891239 -300

a1 <- read.table("c:/users/mark w miller/simple R programs/three_long_numbers_Oct25_2013.txt", colClasses = c('numeric', 'character', 'numeric'), header = FALSE)
a2 <- read.fwf(textConnection(a1$V2), widths=rep(1, nchar(a1$V2)[1]), colClasses = 'numeric', header=FALSE)
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40
1  1  2  3  4  5  6  7  8  9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4
2  1  8  8  8  6  7  8  9  1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   8   8
3  1  2  3  4  9  9  9  8  9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   4   5   6   7   8   9   1   2   3   9


来源:https://stackoverflow.com/questions/10871525/r-splitting-a-numeric-string

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!