Extract part of string (till the first semicolon) in R

主宰稳场 提交于 2019-11-28 10:17:06

You could try sub

sub(';.*$','', Type)
#[1] "SNSR_RMIN_PSX150Y_CSH"

It will match the pattern i.e. first occurence of ; to the end of the string and replace with ''

Or use

library(stringi)
stri_extract(Type, regex='[^;]*')
#[1] "SNSR_RMIN_PSX150Y_CSH"

The stringi package works very fast here:

stri_extract_first_regex(Type, "^[^;]+")
## [1] "SNSR_RMIN_PSX150Y_CSH"

I benchmarked on the 3 main approaches here:

Unit: milliseconds
      expr       min        lq      mean   median        uq      max neval
  SAPPLY() 254.88442 267.79469 294.12715 277.4518 325.91576 419.6435   100
     SUB() 182.64996 186.26583 192.99277 188.6128 197.17154 237.9886   100
 STRINGI()  89.45826  91.05954  94.11195  91.9424  94.58421 124.4689   100

Here's the code for the Benchmarks:
library(stringi)
SAPPLY <- function() sapply(strsplit(Type, ";"), "[[", 1)
SUB <- function() sub(';.*$','', Type)
STRINGI <- function() stri_extract_first_regex(Type, "^[^;]+")

Type <- c("SNSR_RMIN_PSX150Y_CSH;SP_12;I0.00V50HX0HY3000")
Type <- rep(Type, 100000)

library(microbenchmark)
microbenchmark( 
    SAPPLY(),
    SUB(),
    STRINGI(),
times=100L)

you can also use strsplit

strsplit(Type, ";")[[1]][1]
[1] "SNSR_RMIN_PSX150Y_CSH"
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!