Reshaping from long to wide with some missing data (NA's) on time invariant variables

死守一世寂寞 2021-01-16 18:13

When using stats::reshape() from base R to convert data from long to wide format, for any variables designated as time invariant, reshape just takes

2 Answers
  •  刺人心 (original poster)
    2021-01-16 18:33

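    I don't know how to fix the underlying problem, but one way to treat the symptom is to sort the data so that rows with NA in the time-invariant column come last. reshape() keeps the value from the first row it sees for each id, so after sorting it picks up a non-missing value whenever one exists.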

    testdata <- testdata[order(testdata$timeinvariant),]
    testdata
    #  id process1 timeinvariant time
    #3  2      3.5             4    1
    #2  1      4.0             6    2
    #1  1      3.0            NA    1
    reshaped <- reshape(testdata, v.names = "process1", idvar = "id", timevar = "time", direction = "wide")
    reshaped
    #  id timeinvariant process1.1 process1.2
    #3  2             4        3.5         NA
    #2  1             6        3.0          4
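
    The answer never defines testdata, so here is a minimal sketch that rebuilds it from the printed output above (the construction is my assumption, not the asker's original code) and compares the unsorted and sorted results. idvar and timevar are spelled out, although "id" and "time" are also the defaults.

    ```r
    # Data reconstructed from the printed output in this answer (an assumption).
    testdata <- data.frame(
      id            = c(1, 1, 2),
      process1      = c(3.0, 4.0, 3.5),
      timeinvariant = c(NA, 6, 4),
      time          = c(1, 2, 1)
    )

    # Original row order: the first row for id 1 has NA, so NA is kept.
    # reshape() may warn that timeinvariant varies within an id; the
    # warning is suppressed here only to keep the sketch quiet.
    wide_raw <- suppressWarnings(
      reshape(testdata, v.names = "process1", idvar = "id",
              timevar = "time", direction = "wide")
    )

    # order() places NA last by default, so a non-missing value now
    # comes first for each id that has one.
    sorted <- testdata[order(testdata$timeinvariant), ]
    wide_sorted <- suppressWarnings(
      reshape(sorted, v.names = "process1", idvar = "id",
              timevar = "time", direction = "wide")
    )

    wide_raw$timeinvariant[wide_raw$id == 1]        # NA
    wide_sorted$timeinvariant[wide_sorted$id == 1]  # 6
    ```

    Sorting only changes which row reshape() sees first for each id; the warning about a varying "constant" variable can still appear, because the column itself is unchanged.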
    

    A more general solution is to make sure each id has only one value in the timeinvariant column:

    # For each row, take the largest non-NA timeinvariant value among rows with the same id
    testdata$timeinvariant <- apply(testdata,1,function(x) max(testdata[testdata$id == x[1],"timeinvariant"],na.rm=TRUE))
    testdata
    #  id process1 timeinvariant time
    #3  2      3.5             4    1
    #2  1      4.0             6    2
    #1  1      3.0             6    1
    

    This can be repeated for any number of columns before calling the reshape function. Hope this helps
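
    As a sketch of how that repetition might look, the snippet below fills several columns in a loop using ave() instead of apply(). Note that this swaps in a different rule: it takes the first non-missing value per id rather than the maximum, and leaves NA in place when an id has no observed value. The helper fill_constant and the vector invariant_cols are hypothetical names, and testdata is rebuilt from the printed output above.

    ```r
    # Data reconstructed from the printed output in this answer (an assumption).
    testdata <- data.frame(
      id            = c(1, 1, 2),
      process1      = c(3.0, 4.0, 3.5),
      timeinvariant = c(NA, 6, 4),
      time          = c(1, 2, 1)
    )

    # Hypothetical helper: repeat the first non-missing value across the group,
    # or return the group unchanged when every value is missing.
    fill_constant <- function(x) {
      observed <- x[!is.na(x)]
      if (length(observed) == 0L) return(x)
      rep(observed[1L], length(x))
    }

    # List every column that should be constant within an id.
    invariant_cols <- c("timeinvariant")
    for (col in invariant_cols) {
      # ave() applies fill_constant within each id and keeps the row order.
      testdata[[col]] <- ave(testdata[[col]], testdata$id, FUN = fill_constant)
    }

    reshaped <- reshape(testdata, v.names = "process1", idvar = "id",
                        timevar = "time", direction = "wide")
    reshaped
    ```

    Unlike the row-wise apply() call, ave() works on one column at a time, so it does not depend on id being the first column or on all columns sharing one type.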
