How to reshape a dataframe with “reoccurring” columns?

甜味超标 2020-12-09 22:28

I am new to data analysis with R. I recently got a pre-formatted environmental observation-model dataset, an example subset of which is shown below:

date             


        
2 Answers
  •  难免孤独
    2020-12-09 23:04

    Recurring column names are a bit of an oddity and not normal R behaviour. By default, data.frame() forces valid, unique names via the make.names() function. Regardless, I'm able to reproduce your problem. Note that I made my own example since yours isn't reproducible, but the logic is the same.

    #Do not force unique names
    s <- data.frame(id = 1:3, x = runif(3), x = runif(3), check.names = FALSE)
    #-----
      id         x         x
    1  1 0.6845270 0.5218344
    2  2 0.7662200 0.6179444
    3  3 0.4110043 0.1104774
    
    #Now try to melt; note that half of your x-values are missing!
    library(reshape2)  #assumption: the answer does not say which package provides melt()
    melt(s, id.vars = 1)
    #-----
      id variable     value
    1  1        x 0.6845270
    2  2        x 0.7662200
    3  3        x 0.4110043
    

    The solution is to make your column names unique. As I said before, R does this by default in most cases. However, you can do it after the fact via make.unique()

    names(s) <- make.unique(names(s))
    #-----
    [1] "id"  "x"   "x.1"
    

    Note that the second x column now has .1 appended to its name. Now melt() works as you'd expect:

    melt(s, id.vars = 1)
    #-----
      id variable     value
    1  1        x 0.6845270
    2  2        x 0.7662200
    3  3        x 0.4110043
    4  1      x.1 0.5218344
    5  2      x.1 0.6179444
    6  3      x.1 0.1104774
    

    At this point, if you want to treat x and x.1 as the same variable, a little gsub() or another regex function will get rid of the offending characters. This is a workflow I use quite often.
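
    A minimal sketch of that last step, assuming the suffix always has the form .<digits> as produced by make.unique(). The data frame m is a hand-built stand-in for the melted output above (a hypothetical name, not part of the original example), so the snippet runs in base R without melt():

    ```r
    # Stand-in for the melted result above (values copied from the answer's output)
    m <- data.frame(
      id       = rep(1:3, times = 2),
      variable = factor(rep(c("x", "x.1"), each = 3)),
      value    = c(0.6845270, 0.7662200, 0.4110043,
                   0.5218344, 0.6179444, 0.1104774)
    )

    # Strip the ".<digits>" suffix so that x and x.1 collapse into one variable
    m$variable <- gsub("\\.[0-9]+$", "", as.character(m$variable))

    unique(m$variable)
    #-----
    # [1] "x"
    ```

    Be aware that this pattern would also strip a legitimate trailing .1 from a name such as temp.1, so check your original column names before applying it.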
