I have a large data.frame of character data that I want to convert based on what is commonly called a dictionary in other languages.
Currently I am going about it li
map = setNames(c("0101", "0102", "0103"), c("AA", "AC", "AG"))
foo[] <- map[unlist(foo)]
assuming that map covers all the cases in foo. This would feel less like a 'hack' and be more efficient in both space and time if foo were a matrix (of character()), then
matrix(map[foo], nrow=nrow(foo), dimnames=dimnames(foo))
Both matrix and data frame variants run afoul of R's 2^31-1 limit on vector size when there are millions of SNPs and thousands of samples.