Can I use a list as a hash in R? If so, why is it so slow?

遇见更好的自我 2020-11-29 23:43

Before using R, I used quite a bit of Perl. In Perl, I would often use hashes, and lookups of hashes are generally regarded as fast in Perl.

For example, the followi

7 answers
  •  醉话见心
    2020-11-30 00:14

    Your code is very un-R-like, which is one of the reasons it's so slow. I haven't optimized the code below for maximum speed, only for R-ness.

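    n <- 10000
    
    # Build n random three-letter keys, one per matrix column
    keys <- matrix( sample(letters, 3*n, replace = TRUE), nrow = 3 )
    keys <- apply(keys, 2, paste0, collapse = '')
    # Random integer-valued numbers in [0, 999], stored as a named list
    value <- floor(1000*runif(n))
    testHash <- as.list(value)
    names(testHash) <- keys
    
    # Look up all sampled keys in a single vectorized subsetting call
    keys <- sample(names(testHash), n, replace = TRUE)
    lookupValue <- testHash[keys]
    # The literal 'key' and 'lookup' strings become constant label columns
    print(data.frame('key', keys, 'lookup', unlist(lookupValue)))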
    

    On my machine this runs almost instantaneously, excluding the printing. Your code ran at about the speed you reported. Is it doing what you want? You could set n to 10 and inspect the output and testHash to check.
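
    To check the speed claim on your own machine, here is a minimal sketch that times the single vectorized lookup against a per-key loop. The loop is only a hypothetical baseline (the question's original code is not shown here), and the names lookupKeys, vecResult, loopResult, tVec and tLoop are illustrative. Timings will vary by machine.

    ```r
    set.seed(1)   # assumption: fixed seed so repeated runs use the same data
    n <- 10000

    # Build the same kind of named list as in the code above.
    keys <- matrix(sample(letters, 3 * n, replace = TRUE), nrow = 3)
    keys <- apply(keys, 2, paste0, collapse = "")
    testHash <- as.list(floor(1000 * runif(n)))
    names(testHash) <- keys

    lookupKeys <- sample(names(testHash), n, replace = TRUE)

    # Vectorized: one subsetting call handles every key.
    tVec <- system.time(
      vecResult <- unlist(testHash[lookupKeys], use.names = FALSE)
    )

    # Loop: one `[[` call per key (hypothetical baseline for comparison).
    tLoop <- system.time({
      loopResult <- numeric(n)
      for (i in seq_len(n)) loopResult[i] <- testHash[[lookupKeys[i]]]
    })

    # Both approaches should return exactly the same values.
    stopifnot(identical(vecResult, loopResult))

    # Elapsed seconds for each approach.
    print(c(vectorized = tVec[["elapsed"]], loop = tLoop[["elapsed"]]))
    ```

    Both forms match a duplicated name to its first occurrence in the list, which is why the results agree even when random keys collide.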

    NOTE on syntax: The apply call above is simply a loop, and loops are slow in R. The point of the apply family of functions is expressiveness. Many of the commands that follow it could have been put inside the apply call, and with a for loop that would be the temptation. In R, take as much out of your loops as possible. Using the apply family makes this more natural, because each call is designed to represent the application of one function to each element of a list or similar structure, rather than a generic loop (yes, I know the function passed to apply can contain more than one command).
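
    As an illustration of taking work out of the loop, the key-building apply call can itself be replaced by a single vectorized paste0 over the matrix rows. This is a sketch, not part of the code above; the names letterMat, keysApply and keysVec are made up for the example.

    ```r
    set.seed(1)   # assumption: fixed seed, only so the run is repeatable
    n <- 10000
    letterMat <- matrix(sample(letters, 3 * n, replace = TRUE), nrow = 3)

    # Loop-like: apply calls paste0 once per column (n calls in total).
    keysApply <- apply(letterMat, 2, paste0, collapse = "")

    # Vectorized: one paste0 call joins the three rows element-wise.
    keysVec <- paste0(letterMat[1, ], letterMat[2, ], letterMat[3, ])

    # Both approaches produce the same three-letter keys.
    stopifnot(identical(keysApply, keysVec))
    ```

    The vectorized form relies on the key length being fixed at three; for a variable number of rows, apply (or do.call with paste0) stays the more general choice.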
