Remove all text before colon

天命终不由人 2020-11-27 12:45

I have a file containing a certain number of lines. Each line looks like this:

TF_list_to_test10004/Nus_k0.345_t0.1_         


        
9 answers
  • 2020-11-27 13:29

    A simple regular expression used with gsub():

    x <- "TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1"
    gsub(".*:", "", x)
    # [1] "PKMYT1"
    

    See ?regex or ?gsub for more help.

  • 2020-11-27 13:31

    You can use awk like this:

    awk -F: '{print $2}' /your/file
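
    One caveat: print $2 returns only the text between the first and second colon, so anything after a later colon is lost. Below is a minimal sketch of two alternatives that keep everything after the first colon. The second sample line is made up for illustration; it is not from the question.

```shell
# Hypothetical sample input: the second line has extra colons after the first.
sample_lines() {
  printf '%s\n' \
    'TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1' \
    'FG:Content:with:colons'
}

# cut: keep field 2 and every field after it, so later colons survive.
cut_out=$(sample_lines | cut -d: -f2-)

# awk: delete the prefix up to and including the first colon, print the rest.
awk_out=$(sample_lines | awk '{ sub(/^[^:]*:/, ""); print }')

printf '%s\n' "$cut_out"
printf '%s\n' "$awk_out"
```

    Both commands print PKMYT1 and Content:with:colons for these two lines, whereas print $2 would give Content for the second one.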
    
  • 2020-11-27 13:31

    One simple thing I missed in the best answer, by @Sacha Epskamp, is that the same gsub() call can also keep everything before the ":" instead of removing it:

    foo <- "TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1"
    
    # 1st, as in that answer, remove everything up to and including ":":
    gsub(".*:","",foo)
    
    # 2nd, keep only the text before ":" (the colon itself is also dropped):
    gsub(":.*","",foo)
    

    It is basically the same call; only the position of the ":" relative to ".*" in the pattern changes. Hope it helps.

  • 2020-11-27 13:38

    Below are 2 solutions that give the same result when each line contains a single colon:

    The first uses Perl's -a autosplit feature to split each line into fields on :, populate the @F array, and print the 2nd field, $F[1] (indices start at 0):

    perl -F: -lane 'print $F[1]' file
    

    The second uses a substitution, s///, to replace everything from ^ (the beginning of the line) through .*: (any characters followed by a colon) with nothing:

    perl -pe 's/^.*://' file
    
  • 2020-11-27 13:39

    There are certainly more than 2 ways in R. Here's another.

    unlist(lapply(strsplit(foo, ':', fixed = TRUE), '[', 2))
    

    If the string has a constant length I imagine substr would be faster than this or regex methods.

  • 2020-11-27 13:40

    I was working on a similar issue. John's and Josh O'Brien's advice did the trick. I started with this tibble:

    library(dplyr)
    my_tibble <- tibble(Col1=c("ABC:Content","BCDE:MoreContent","FG:Content:with:colons"))
    

    It looks like:

      | Col1 
    1 | ABC:Content 
    2 | BCDE:MoreContent 
    3 | FG:Content:with:colons
    

    I needed to create this tibble:

      | Col1                  | Col2 | Col3 
    1 | ABC:Content           | ABC  | Content 
    2 | BCDE:MoreContent      | BCDE | MoreContent 
    3 | FG:Content:with:colons| FG   | Content:with:colons
    

    I did so with this code (R version 3.4.2):

    my_tibble2 <- mutate(
      my_tibble,
      # Col2: text before the first ":"
      Col2 = unlist(lapply(strsplit(Col1, ":", fixed = TRUE), "[", 1)),
      # Col3: everything after the first ":"
      Col3 = gsub("^[^:]*:", "", Col1))
    