How to delete everything after nth delimiter in R?

情深已故 2020-11-29 10:49

I have this vector myvec. I want to remove everything after the second ':' and keep what comes before it. More generally, how do I remove the part of each string after the nth ':'?

myvec&l         


        
2 Answers
  •  北荒 (original poster)
    2020-11-29 11:27

    We can use sub. The pattern matches one or more characters that are not : from the start of the string (^[^:]+), followed by a :, followed by one or more characters that are not : ([^:]+), and places all of that in a capture group, i.e. within the parentheses. The trailing .* matches the rest of the string, and the replacement keeps only the capture group (\\1).

    sub('^([^:]+:[^:]+).*', '\\1', myvec)
    #[1] "chr2:213403244" "chr7:55240586"  "chr7:55241607" 
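
To make the behaviour of this pattern concrete, here is a minimal sketch on made-up data. The vector x is hypothetical (the original myvec is not shown in full); it also includes strings with fewer than two delimiters, which sub leaves unchanged or returns whole because there is nothing after the second field to drop.

```r
# Hypothetical input; the original myvec is not shown in full.
x <- c("chr2:213403244:213403244:G", "chr7:55240586", "chr1")

# Keep the first two ':'-separated fields and drop everything after them.
# "chr1" has no ':' at all, so the pattern does not match and sub()
# returns it unchanged.
res <- sub('^([^:]+:[^:]+).*', '\\1', x)
print(res)
#[1] "chr2:213403244" "chr7:55240586"  "chr1"
```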
    

    The above works for the example posted. For the general case of removing everything after the nth delimiter, build the pattern from n:

    n <- 2
    pat <- paste0('^([^:]+(?::[^:]+){',n-1,'}).*')
    sub(pat, '\\1', myvec)
    #[1] "chr2:213403244" "chr7:55240586"  "chr7:55241607" 
    

    Checking with a different 'n'

    n <- 3
    

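    and repeating the same steps. Note that pat has to be rebuilt so that it picks up the new n:

    pat <- paste0('^([^:]+(?::[^:]+){',n-1,'}).*')
    sub(pat, '\\1', myvec)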
    #[1] "chr2:213403244:213403244" "chr7:55240586:55240586"  
    #[3] "chr7:55241607:55241607"  
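
Since the pattern depends on n, one way to avoid a stale pat is to wrap both steps in a small function. The sketch below is only an illustration: the name keep_n_fields and the sample vector x are hypothetical, and it assumes n is at least 2 and that the delimiter is always ':'.

```r
# Hypothetical helper: keep the first n ':'-separated fields of each string.
keep_n_fields <- function(x, n) {
  stopifnot(n >= 2)
  # Same pattern as above, rebuilt on every call so it always reflects n.
  pat <- paste0('^([^:]+(?::[^:]+){', n - 1, '}).*')
  sub(pat, '\\1', x)
}

x <- c("a:b:c:d", "e:f:g:h")   # hypothetical sample data
r2 <- keep_n_fields(x, 2)
r3 <- keep_n_fields(x, 3)
print(r2)
#[1] "a:b" "e:f"
print(r3)
#[1] "a:b:c" "e:f:g"
```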
    

    Another option is to split each string on : and then paste the first n components back together.

    n <- 2
    vapply(strsplit(myvec, ':'), function(x)
                paste(x[seq.int(n)], collapse=':'), character(1L))
    #[1] "chr2:213403244" "chr7:55240586"  "chr7:55241607" 
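
One caveat with the split approach: if a string has fewer than n fields, x[seq.int(n)] pads the result with NA, which paste then turns into the text "NA". A possible variant, sketched here on hypothetical data, uses head() so that only the fields that actually exist are kept. The fixed = TRUE argument is an optional addition that treats ':' as a literal string rather than a regular expression.

```r
n <- 2
x <- c("a:b:c:d", "e")   # hypothetical; the second element has fewer than n fields

# head(parts, n) returns at most n elements and never pads with NA.
res <- vapply(strsplit(x, ':', fixed = TRUE), function(parts)
  paste(head(parts, n), collapse = ':'), character(1L))
print(res)
#[1] "a:b" "e"
```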
    
