First set up the test data. Note that the columns are read as "character" class rather than "factor" by using as.is = TRUE:
Lines <- "SNP hu_mRNA gene
chr1.111642529 NM_002107 H3F3A
chr1.111642529 NM_005324 H3F3B
chr1.111801684 BC098118
chr1.111925084 NM_020435 GJC2
chr1.111925084 NM_020435 GJC2
chr1.11801605 AK027740
chr1.11801605 NM_032849 C13orf33
chr1.151220354 NM_018913 PCDHGA10
chr1.151220354 NM_018918 PCDHGA5"
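# Write the test data to a file, then read it back.
# fill = TRUE pads rows that lack a gene; na.strings = "" turns those blanks into NA.
cat(Lines, "\n", file = "data.txt")
DF <- read.table("data.txt", header = TRUE, na.strings = "", as.is = TRUE, fill = TRUE)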
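Now try this aggregate statement. The formula method of aggregate omits rows containing NA by default (na.action = na.omit), so the rows without a gene do not appear in the result: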
> aggregate(. ~ SNP, DF, toString)
             SNP              hu_mRNA              gene
1 chr1.111642529 NM_002107, NM_005324      H3F3A, H3F3B
2 chr1.111925084            NM_020435              GJC2
3  chr1.11801605            NM_032849          C13orf33
4 chr1.151220354 NM_018913, NM_018918 PCDHGA10, PCDHGA5
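If the rows with a missing gene should be kept rather than dropped, one possible variation is to pass na.action = na.pass and collapse only the non-missing values. This is a minimal sketch and not part of the original answer: the helper name collapse, the object names Lines2, DF2 and res, and the shortened data with explicit NA entries are all assumptions made for illustration.

```r
# Shortened copy of the test data, with missing genes written explicitly as NA
# (an assumption for this sketch; read.table treats "NA" as missing by default).
Lines2 <- "SNP hu_mRNA gene
chr1.111642529 NM_002107 H3F3A
chr1.111642529 NM_005324 H3F3B
chr1.111801684 BC098118 NA
chr1.11801605 AK027740 NA
chr1.11801605 NM_032849 C13orf33"
DF2 <- read.table(text = Lines2, header = TRUE, as.is = TRUE)

# Hypothetical helper: join the non-missing values of a group with ", "
collapse <- function(x) toString(x[!is.na(x)])

# na.action = na.pass keeps rows that contain NA, so no SNP or mRNA is lost
res <- aggregate(. ~ SNP, DF2, collapse, na.action = na.pass)
res
```

With this variation a SNP that has no gene at all gets an empty string in the gene column instead of being removed from the result.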