How to programmatically create binary columns based on a categorical variable in data.table?

無奈伤痛 2020-12-18 11:00

I have a big (12 million rows) data.table which looks like this:

library(data.table)
set.seed(123)
dt <         


        
3 Answers
  •  暖寄归人
    2020-12-18 11:44

    If you already know the set of rows (in your example, that there are no more than 3) and you know the columns, you can start with an array of zeros and use the apply function to update the values in that secondary table.

    My R is a little rusty, so I'm hesitant to write it up right now, but I think that's the way to do it. Additionally, the function you pass to apply could contain conditions that add rows and columns as needed.
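
    A rough sketch of the zeros-first idea, for what it's worth. The question's table is cut off above, so the toy data and the column names `id` and `category` are assumptions, and this version fills the matrix with plain matrix indexing rather than apply():

    ```r
    library(data.table)

    # Hypothetical toy data: `id` and `category` are assumed column names,
    # not taken from the question's (truncated) table.
    dt <- data.table(
      id       = c(1L, 1L, 2L, 3L, 3L, 3L),
      category = c("a", "b", "a", "c", "a", "c")
    )

    ids  <- sort(unique(dt$id))
    cats <- sort(unique(dt$category))

    # Start from an all-zero matrix: one row per id, one column per category.
    m <- matrix(0L, nrow = length(ids), ncol = length(cats),
                dimnames = list(NULL, cats))

    # A two-column (row, column) index matrix sets every observed pair to 1.
    # Repeated pairs simply assign 1 again, so duplicates are harmless.
    m[cbind(match(dt$id, ids), match(dt$category, cats))] <- 1L

    # Turn the matrix back into a data.table and put the id column first.
    wide <- as.data.table(m)
    wide[, id := ids]
    setcolorder(wide, "id")
    print(wide)
    ```

    Note that this allocates a dense ids-by-categories matrix, so with 12 million rows it is only practical if the number of distinct categories is small.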

    If you are looking for something a little more plug-and-play, I found this little blurb:

    There are two sets of methods that are explained below:
    
    gather() and spread() from the tidyr package. This is a newer interface to the reshape2 package.
    
    melt() and dcast() from the reshape2 package.
    
    There are a number of other methods which aren’t covered here, since they are not as easy to use:
    
    The reshape() function, which is confusingly not part of the reshape2 package; it is part of the base install of R.
    
    stack() and unstack()
    

    Source: http://www.cookbook-r.com/Manipulating_data/Converting_data_between_wide_and_long_format/

    If I were better versed in R, I would tell you how those various methods handle collisions when going from long lists to wide ones. I was googling "Make a table from flat data in R" to come up with this...
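
    On the collision point, here is a minimal sketch using data.table's own dcast() (not the reshape2 one from the blurb), again on made-up data with assumed column names `id` and `category`. As far as I can tell, duplicate id/category pairs are handled by `fun.aggregate`, and pairs that never occur come out as 0 when the aggregate is `length`:

    ```r
    library(data.table)

    # Hypothetical toy data: `id` and `category` are assumed column names.
    dt <- data.table(
      id       = c(1L, 1L, 2L, 3L, 3L, 3L),
      category = c("a", "b", "a", "c", "a", "c")
    )

    # Long to wide: one row per id, one column per category level.
    # fun.aggregate = length counts how often each pair occurs, so the
    # duplicated (3, "c") pair yields 2 instead of an error or a dropped row.
    counts <- dcast(dt, id ~ category, value.var = "category",
                    fun.aggregate = length)

    # Collapse the counts to 0/1 flags to get binary columns.
    cat_cols <- setdiff(names(counts), "id")
    counts[, (cat_cols) := lapply(.SD, function(x) as.integer(x > 0L)),
           .SDcols = cat_cols]
    print(counts)
    ```

    Skipping the last step keeps the raw counts, which may be more useful if the duplicates carry meaning.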

    Also check out this. It's the same website as above, with my personal comment wrapper :p
