How to calculate a table of pairwise counts from a long-form data frame

闹比i 2020-12-06 20:20

I have a 'long-form' data frame with columns id (the primary key) and featureCode (a categorical variable). Each record has between 1 and 9 values

4 Answers
  •  执念已碎
    2020-12-06 21:18

    Here is a data.table approach similar to @mrdwab's.

    It works best if featureCode is a character column.

    library(data.table)

    # `dat` is the long-form data frame from the question
    DT <- as.data.table(dat)
    # convert featureCode to character
    DT[, featureCode := as.character(featureCode)]
    # count rows per id, then keep only ids with more than one featureCode
    DT2 <- DT[, N := .N, by = id][N > 1]
    # for each id, build every combination of 2 feature codes;
    # each combination becomes a one-row data.table with columns `V1` and `V2`,
    # then count how often each (V1, V2) pair occurs
    DT2[, rbindlist(combn(featureCode, 2,
          FUN = function(x) as.data.table(as.list(x)), simplify = FALSE)),
        by = id][, .N, by = list(V1, V2)]
    
    
         V1   V2 N
    1: PPLC PCLI 3
    2:  PPL PPLC 1
    3:  PPL PCLI 1
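
    Note that combn() keeps the codes in the order they appear within each id, so the same two codes can end up as two separate rows (A, B) and (B, A) if their order differs between ids. Here is a minimal sketch of an order-independent variant. The sample `dat` below is made up for illustration, and the names `pairs` and `pair_counts` are hypothetical, not from the question:

    ```r
    library(data.table)

    # Hypothetical sample data (NOT the data from the question)
    dat <- data.frame(
      id = c(1, 1, 2, 2, 3, 3, 3, 4),
      featureCode = c("PPLC", "PCLI", "PCLI", "PPLC", "PPL", "PPLC", "PCLI", "PPL"),
      stringsAsFactors = FALSE
    )

    DT <- as.data.table(dat)
    DT[, featureCode := as.character(featureCode)]

    # For each id, sort the distinct codes before pairing them, so that
    # every unordered pair has exactly one representation (V1 < V2).
    # Ids with fewer than two distinct codes return NULL and are dropped.
    pairs <- DT[, {
      codes <- sort(unique(featureCode))
      if (length(codes) >= 2) {
        m <- combn(codes, 2)          # 2 x choose(n, 2) character matrix
        list(V1 = m[1, ], V2 = m[2, ])
      }
    }, by = id]

    # Count how many ids contain each pair
    pair_counts <- pairs[, .N, by = .(V1, V2)]
    print(pair_counts)
    ```

    With this toy data, ids 1 and 2 list PPLC and PCLI in opposite orders but are still counted as the same pair. Building one matrix per id also avoids creating a one-row data.table for every combination, which may help on larger inputs.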
    
