Filter rows based on variables “beginning with” strings specified by vector

梦如初夏 2021-01-22 08:00

I'm trying to filter a patient database based on specific ICD9 (diagnosis) codes. I would like to use a vector indicating the first 3 characters of the ICD9 codes.

The exa

3 Answers
  •  予麋鹿 (original poster)
    2021-01-22 08:41

    You can use apply together with ldply from the plyr package:

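    library(plyr)
    # apply() passes each row as a character vector; return the row if any
    # field's first 3 characters match an entry of dx (NULL otherwise)
    filtered_obs <- apply(observations, 1, function(x) if(any(substr(x,1,3) %in% dx)){x})
    # Bind the matching rows back into one data frame
    filtered_obs <- plyr::ldply(filtered_obs,rbind)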
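
    A possible base R alternative, offered as a sketch rather than the answer's own method: compute a logical row index and subset with it. The toy data, the column names (id, dx1, dx2), and the code_cols vector are assumptions made up for illustration, since the question's example data is not shown. Note that apply() returns a list only when the per-row results differ in length, so the apply/ldply version may behave differently when every row matches; a logical index avoids that case and keeps the original column types.

    ```r
    # Hypothetical toy data: column names and codes are invented for illustration.
    observations <- data.frame(
      id  = 1:4,
      dx1 = c("25000", "4019", "4280", "V5861"),
      dx2 = c("4019", "2724", "25001", "7862"),
      stringsAsFactors = FALSE
    )
    dx <- c("250", "428")  # 3-character ICD9 prefixes to keep

    # Columns that hold diagnosis codes (an assumption; adjust to your data).
    code_cols <- c("dx1", "dx2")

    # TRUE for rows where any diagnosis column starts with one of the prefixes.
    keep <- apply(observations[code_cols], 1,
                  function(x) any(substr(x, 1, 3) %in% dx))

    # Logical indexing preserves column types and row order.
    filtered_obs <- observations[keep, , drop = FALSE]
    print(filtered_obs)
    ```

    Restricting the check to code_cols also avoids accidental matches in unrelated columns such as an ID.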
    

    If the prefixes in dx vary in length, this should work:

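    # For each prefix y in dx, keep the rows where any field starts with y
    filtered_obs <- lapply(dx, function(y)
                     {
                      plyr::ldply(apply(observations, 1, function(x) 
                       {
                        if(any(substr(x,1,nchar(y)) %in% y)){x}
                       }), rbind)
                     })
    
    # Stack the per-prefix results and drop duplicate rows
    # (e.g. rows matched by more than one prefix)
    filtered_obs <- unique(plyr::ldply(filtered_obs,rbind))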
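
    For prefixes of varying length, one option is base R's startsWith(), which tests each prefix directly and does not need a separate unique() step. This is a minimal sketch under assumptions: the toy data, the column names (id, dx1, dx2), code_cols, and the helper starts_with_any are all hypothetical names introduced here, not part of the original question or answer.

    ```r
    # Hypothetical toy data: column names and codes are invented for illustration.
    observations <- data.frame(
      id  = 1:4,
      dx1 = c("25000", "4019", "4281", "V5861"),
      dx2 = c("4019", "2724", "7862", "4280"),
      stringsAsFactors = FALSE
    )
    dx <- c("250", "4280", "V58")  # prefixes of different lengths

    # Columns that hold diagnosis codes (an assumption; adjust to your data).
    code_cols <- c("dx1", "dx2")

    # Hypothetical helper: TRUE where x begins with at least one prefix.
    starts_with_any <- function(x, prefixes) {
      x <- as.character(x)  # startsWith() requires character input
      Reduce(`|`, lapply(prefixes, function(p) startsWith(x, p)))
    }

    # One logical vector per diagnosis column, combined with OR across columns.
    hit_by_col <- lapply(observations[code_cols], starts_with_any, prefixes = dx)
    keep <- Reduce(`|`, hit_by_col)

    # which() skips NA entries, so rows with missing codes are not selected.
    filtered_obs <- observations[which(keep), , drop = FALSE]
    print(filtered_obs)
    ```

    Row 4 matches two prefixes ("V58" and "4280") but appears only once, because each row is tested once rather than once per prefix.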
    
