Text Mining R Package & Regex to Replace Smart Curly Quotes

夕颜 2020-12-04 00:37

I've got a bunch of texts like the one below, with different smart quotes for both single and double quotes. With the packages I'm aware of, all I have managed to do is remove those

3 answers
  •  不思量自难忘°
    2020-12-04 00:56

    There's a function in {proustr} to normalize punctuation, called pr_normalize_punc():

    https://github.com/ColinFay/proustr#pr_normalize_punc

    It turns:

     => ″‶«  »“”`´„“ into "
     => ՚ ’ into ' 
     => … into ...
    

    For example:

    library(proustr)
    a <- data.frame(text = "Il l՚a dit : « La ponctuation est chelou » !")
    pr_normalize_punc(a, text)
    # A tibble: 1 x 1
                                                text
    *                                          
    1 "Il l'a dit : \"La ponctuation est chelou\" !"
    

    For your text:

    pr_normalize_punc(data.frame( text = "You don‘t get “your” money’s worth"), text)
    # A tibble: 1 x 1
                                        text
    *                                  
    1 "You don‘t get \"your\" money's worth"
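
Note that the left single quote ‘ is still there in the output above. If you would rather do this with a plain regex, here is a minimal base R sketch using gsub(). The helper name normalize_quotes and the exact set of code points are my own assumptions, not part of {proustr}, so extend the character classes to match whatever shows up in your corpus.

```r
# Hypothetical helper (not from any package): map common "smart"
# punctuation to plain ASCII with base R regular expressions.
normalize_quotes <- function(x) {
  # Single quotes: U+2018, U+2019, U+201A, U+201B, and U+055A (Armenian apostrophe)
  x <- gsub("[\u2018\u2019\u201A\u201B\u055A]", "'", x)
  # Double quotes: U+201C, U+201D, U+201E, U+201F, and the guillemets U+00AB / U+00BB
  x <- gsub("[\u201C\u201D\u201E\u201F\u00AB\u00BB]", "\"", x)
  # Horizontal ellipsis U+2026 becomes three ASCII dots
  x <- gsub("\u2026", "...", x, fixed = TRUE)
  x
}

txt <- "You don\u2018t get \u201Cyour\u201D money\u2019s worth"
print(normalize_quotes(txt))
# [1] "You don't get \"your\" money's worth"
```

Writing the characters as \u escapes keeps the script itself ASCII, which avoids surprises when the source file encoding differs between machines. Unlike the example earlier in this answer, this sketch does not strip the spaces inside « », so handle those separately if you need to.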
    
