How to remove specific special characters in R

别那么骄傲 2020-12-15 08:11

I have some sentences like this one.

c = "In Acid-base reaction (page[4]), why does it create water and not H+?"

I want to remove all sp

3 Answers
  • 2020-12-15 08:53

    To get your method to work, put the literal "]" immediately after the leading "[" of the character class:

     gsub("[][!#$%()*,.:;<=>@^_`|~.{}]", "", c)
    [1] "In Acid-base reaction page4 why does it create water and not H+?"
    

    You can then put the inner "[" anywhere. If you also needed to remove the minus sign, it would have to go last in the class. See the ?regex page, just after the list of predefined character classes.
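
    As a minimal sketch of those two placement rules (the names `s` and `no_minus` are only for illustration), the following removes the hyphen along with the brackets, parentheses, and comma by listing "]" first and "-" last:

    ```r
    s <- "In Acid-base reaction (page[4]), why does it create water and not H+?"

    # "]" comes first and "-" comes last, so both are read as literal characters
    no_minus <- gsub("[][(),-]", "", s)
    print(no_minus)
    # [1] "In Acidbase reaction page4 why does it create water and not H+?"
    ```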

  • 2020-12-15 08:57
    gsub("[^[:alnum:][:blank:]+?&/\\-]", "", c)
    # [1] "In Acid-base reaction page4 why does it create water and not H+?"
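
    The negated class above works as a whitelist: anything that is not alphanumeric, blank, or one of the listed characters is dropped. A rough sketch of the same idea as a reusable helper follows; `keep_only` and its `keep` argument are hypothetical names, and it assumes `keep` holds single punctuation characters only.

    ```r
    keep_only <- function(text, keep = c("+", "?", "-")) {
      # A literal "]" must come first in the class and a literal "-" last
      first  <- if ("]" %in% keep) "]" else ""
      last   <- if ("-" %in% keep) "-" else ""
      middle <- paste(setdiff(keep, c("]", "-")), collapse = "")
      pattern <- paste0("[^", first, "[:alnum:][:blank:]", middle, last, "]")
      gsub(pattern, "", text)
    }

    s <- "In Acid-base reaction (page[4]), why does it create water and not H+?"
    kept_default <- keep_only(s)
    kept_question <- keep_only(s, keep = "?")
    print(kept_default)
    # [1] "In Acid-base reaction page4 why does it create water and not H+?"
    print(kept_question)
    # [1] "In Acidbase reaction page4 why does it create water and not H?"
    ```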
    
  • 2020-12-15 08:59

    I think you're after a regex solution. I'll give you a messy regex solution and an add-on package solution (shameless self-promotion).

    There's likely a better regex:

    x <- "In Acid-base reaction (page[4]), why does it create water and not H+?" 
    keeps <- c("+", "-", "?")
    
    ## Regex solution
    gsub(paste0(".*?($|'|", paste(paste0("\\", 
        keeps), collapse = "|"), "|[^[:punct:]]).*?"), "\\1", x)
    
    ## qdap: add-on package solution
    library(qdap)
    strip(x, keeps, lower = FALSE)
    
    ## Output of strip(), which also drops digits by default:
    ## [1] "In Acid-base reaction page why does it create water and not H+?"
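
    To make the regex solution easier to read, here is a small sketch that only assembles and prints the pattern the nested paste0() calls produce (the intermediate names `alternatives` and `pattern` are illustrative). Each match lazily skips characters and captures the next one that is a kept character, an apostrophe, or non-punctuation, and the replacement "\\1" keeps only that captured character.

    ```r
    keeps <- c("+", "-", "?")

    # Escape each kept character and join the alternatives with "|"
    alternatives <- paste(paste0("\\", keeps), collapse = "|")
    pattern <- paste0(".*?($|'|", alternatives, "|[^[:punct:]]).*?")

    writeLines(pattern)
    # .*?($|'|\+|\-|\?|[^[:punct:]]).*?
    ```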
    