“last name, first name” -> “first name last name” in serialized strings

慢半拍i 2021-01-13 00:14

I have a bunch of strings that contain lists of names in last name, first name format, separated by commas, like so:

names <- c('Beaufoy

3 Answers
  •  感动是毒
    2021-01-13 01:09

    (1) Maintain same names in each element. This can be done with a single gsub call (assuming there are no commas within names):

    > gsub("([^, ][^,]*), ([^,]+)", "\\2 \\1", names)
    [1] "Simon Beaufoy, Danny Boyle"       "Christopher Nolan"               
    [3] "Stuart Blumberg, Lisa Cholodenko" "David Seidler"                   
    [5] "Aaron Sorkin"    
    
    > gsub("([^, ][^,]*), ([^,]+)", "\\2 \\1", "Hoover, J. Edgar")
    [1] "J. Edgar Hoover"
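
To see why the pattern pairs the names up correctly, it may help to look at what each piece matches. The following is a minimal sketch in base R; the input string `x` is a hypothetical sample reconstructed from the output shown above, not the asker's actual data.

```r
# Hypothetical sample input, reconstructed from the output shown above
x <- "Beaufoy, Simon, Boyle, Danny"

# ([^, ][^,]*)  last name: starts with a character that is neither comma nor space
# ", "          the separator inside one "last, first" pair
# ([^,]+)       first name: everything up to the next comma
pattern <- "([^, ][^,]*), ([^,]+)"

# Each match is one complete "last, first" pair
pairs <- regmatches(x, gregexpr(pattern, x))[[1]]
print(pairs)    # "Beaufoy, Simon" "Boyle, Danny"

# Swapping the two capture groups reorders each pair in place
swapped <- gsub(pattern, "\\2 \\1", x)
print(swapped)  # "Simon Beaufoy, Danny Boyle"
```

Because the last-name group cannot start with a comma or a space, the comma between two people is never consumed by a match, which is why it survives in the result.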
    

    (2) Separate into one name per element. If you want each first name last name pair in a separate element, use (a) scan:

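    out <- gsub("([^, ][^,]*), ([^,]+)", "\\2 \\1", names)
    scan(text = out, sep = ",", what = "", strip.white = TRUE)
    

    where out is the result of the gsub above and strip.white = TRUE drops the space that follows each separating comma. To get the same result directly, try (b) strapply from the gsubfn package: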

    > library(gsubfn)
    > strapply(names, "([^, ][^,]*), ([^,]+)", x + y ~ paste(y, x), simplify = c)
    [1] "Simon Beaufoy"     "Danny Boyle"       "Christopher Nolan"
    [4] "Stuart Blumberg"   "Lisa Cholodenko"   "David Seidler"    
    [7] "Aaron Sorkin"     
    
    > strapply("Hoover, Edgar J.", "([^, ][^,]*), ([^,]+)", x + y ~ paste(y, x), 
    +   simplify = c)
    [1] "Edgar J. Hoover"
    

    Note that all examples above used the same regular expression for matching.
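
Since the same regular expression is used throughout, it can be stored once and reused. As a rough sketch, the one-name-per-element result can also be obtained with base R alone. This is an alternative to strapply rather than the method shown above, and the input vector `x` is a hypothetical sample.

```r
# Hypothetical sample input; the asker's full vector is not shown
x <- c("Beaufoy, Simon, Boyle, Danny", "Nolan, Christopher")

# Same pattern as in the examples above, defined once
pattern <- "([^, ][^,]*), ([^,]+)"

# Step 1: pull out every "last, first" pair across all elements
pairs <- unlist(regmatches(x, gregexpr(pattern, x)))

# Step 2: swap the two capture groups within each pair
result <- sub(pattern, "\\2 \\1", pairs)
print(result)  # "Simon Beaufoy" "Danny Boyle" "Christopher Nolan"
```

Extracting the pairs first avoids the whitespace handling that splitting on commas would otherwise need.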

    UPDATE: removed comma separating first and last name.

    UPDATE: added code to separate out each first name last name into a separate element in case that is the preferred output format.
