Separate values in a column of a dataframe and melt

房东的猫 提交于 2019-12-02 07:25:24

I would suggest my cSplit function for a problem like this. The solution becomes:

cSplit(mydf, "Client.ID", ",", "long")
#     Account.Owner       Account.Name Client.ID
#  1:    Deb Berman     Albertsons LLC      3184
#  2:    Deb Berman     Albertsons LLC      3186
#  3:    Deb Berman     Albertsons LLC      3185
#  4:    Deb Berman     Albertsons LLC      2578
#  5:    Deb Berman        All Recipes       909
#  6:    Deb Berman        All Recipes      4937
#  7:    Liz Madsen   American Express      1230
#  8:    Liz Madsen   American Express      1236
#  9:    Deb Berman  Bed Bath & Beyond      1180
# 10:    Deb Berman  Bed Bath & Beyond      1556
# 11:    Deb Berman           Birchbox       101
# 12:    Deb Berman           Birchbox      1704
# 13:    Deb Berman           Birchbox      5149
# 14:    Deb Berman           Birchbox      5150
# 15:    Deb Berman           Birchbox      5148
# 16:   Jeff Murphy Brown Shoe Company      5402
# 17:   Jeff Murphy Brown Shoe Company      6159
# 18:   Jeff Murphy Brown Shoe Company      6160

The arguments used here are: (1) the data.frame or data.table to work with, (2) the column or columns that need to be split up, (3) the separator, and (4) whether the result should be "wide" or "long".

You may want to modify it as follows: cSplit(mydf, "Client.ID", ", ", "long") or cSplit(mydf, "Client.ID", ",|,\\s+", "long", fixed = FALSE) depending on how nice and clean your "Client.ID" column really is.

This assumes we're starting with the following sample dataset:

mydf <- structure(list(Account.Owner = c("Deb Berman", "Deb Berman", 
   "Liz Madsen", "Deb Berman", "Deb Berman", "Jeff Murphy"), 
   Account.Name = c("Albertsons LLC", "All Recipes", "American Express", 
   "Bed Bath & Beyond", "Birchbox", "Brown Shoe Company"), 
   Client.ID = c("3184, 3186, 3185, 2578", "909, 4937", "1230,1236", 
   "1180, 1556", "101, 1704, 5149, 5150, 5148", "5402, 6159, 6160")), 
   .Names = c("Account.Owner", "Account.Name", "Client.ID"), 
   class = c("data.table", "data.frame"), row.names = c(NA, -6L))
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!