regex - return all before the second occurrence

天命终不由人 2020-12-08 11:17

Given this string:

DNS000001320_309.0/121.0_t0

How would I return everything before the second occurrence of "_"?

DNS0000         


        
4 answers
  • 2020-12-08 11:57

    Personally, I hate regex, so luckily there's a way to do this without them, just by splitting the string:

    > s <- "DNS000001320_309.0/121.0_t0"      
    > paste(strsplit(s,"_")[[1]][1:2],collapse = "_")
    [1] "DNS000001320_309.0/121.0"
    

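    Although of course this assumes that your string contains underscores at all: with none, `[1:2]` pads the pieces with `NA` and `paste()` turns that into a literal `_NA` suffix. Also note that `[[1]]` only looks at the first element, so be careful if you vectorize this.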
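
    A minimal sketch of how the split approach could be vectorized. The helper name `before_second_underscore` is made up for illustration, and `head()` is swapped in for `[1:2]` so that inputs with fewer pieces are not padded with `NA`. One caveat: `strsplit()` drops a trailing empty piece, so a string ending in "_" loses that underscore.

    ```r
    # Hypothetical helper: keep everything before the second "_" in each element.
    before_second_underscore <- function(x) {
      # fixed = TRUE treats "_" as a literal string, not a regex
      parts <- strsplit(x, "_", fixed = TRUE)
      vapply(parts, function(p) {
        # head() returns at most two pieces and never pads with NA,
        # so an element with one or two pieces is simply re-joined
        paste(head(p, 2), collapse = "_")
      }, character(1))
    }

    before_second_underscore(c("DNS000001320_309.0/121.0_t0", "a_b", "abc"))
    # [1] "DNS000001320_309.0/121.0" "a_b" "abc"
    ```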

  • 2020-12-08 11:57

    Not pretty, but this will do the trick:

    mystr <- "DNS000001320_309.0/121.0_t0"
    
    mytok <- paste(strsplit(mystr,"_")[[1]][1:2],collapse="_")
    
  • 2020-12-08 12:00

    The following script:

    s <- "DNS000001320_309.0/121.0_t0"
    t <- gsub("^([^_]*_[^_]*)_.*$", "\\1", s)
    t
    

    will print:

    DNS000001320_309.0/121.0
    

    A quick explanation of the regex:

    ^         # the start of the input
    (         # start group 1
      [^_]*   #   zero or more chars other than `_`
      _       #   a literal `_`
      [^_]*   #   zero or more chars other than `_`
    )         # end group 1
    _         # a literal `_`
    .*        # consume the rest of the string
    $         # the end of the input
    

    which is replaced with:

    \\1       # whatever is matched in group 1
    

    And if there are fewer than two underscores, the string is left unchanged.
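
    To see that behaviour on several inputs at once, here is a small sketch. The vector `ids` and its extra test strings are made up for illustration, and `sub()` is used instead of `gsub()` since the pattern is anchored at both ends and can match at most once per string.

    ```r
    # Made-up inputs: two underscores, three underscores, one, and none
    ids <- c("DNS000001320_309.0/121.0_t0", "a_b_c_d", "a_b", "abc")

    # Keep group 1 (everything before the second "_") where the pattern matches;
    # strings with fewer than two underscores do not match and pass through as is
    trimmed <- sub("^([^_]*_[^_]*)_.*$", "\\1", ids)
    trimmed
    # [1] "DNS000001320_309.0/121.0" "a_b" "a_b" "abc"
    ```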

  • 2020-12-08 12:06

    I think this might do the task (a regex that matches the last `_` and everything after it, which is then replaced with nothing):

    _([^_]*)$
    

    E.g.:

    > sub('_([^_]*)$', '', "DNS000001320_309.0/121.0_t0")
    [1] "DNS000001320_309.0/121.0"
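
    Since this pattern targets the last underscore, it gives the same result as cutting at the second one only when the string has exactly two underscores. A quick sketch of the difference, using a made-up input with three underscores and, for comparison, a pattern anchored on the second underscore:

    ```r
    s <- "a_b_c_d"  # made-up input with three underscores

    # Removes the last "_" and what follows it
    from_last <- sub("_([^_]*)$", "", s)

    # Removes the second "_" and what follows it
    from_second <- sub("^([^_]*_[^_]*)_.*$", "\\1", s)

    from_last    # [1] "a_b_c"
    from_second  # [1] "a_b"
    ```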
    