Selecting column sequences and creating variables

Submitted by 微笑、不失礼 on 2019-12-13 09:15:31

Question


I was wondering if there was a way to select specific columns via a sequence and create new variables from this.

For example, if I had 8 columns with n observations, how could I create 4 variables that each select 2 columns sequentially? My actual dataset is much larger: 1416 variables with 62 observations each (I have pasted a link to the spreadsheet below; its first column and first row hold names). I would like to create new data frames from this, named site 1 to site 12. So site 1 = df[,1:117]; site 2 = df[,119:237], etc.

I am planning to use this code for future datasets with even more variables, so some form of loop or sequence function would be very effective. Could anyone shed some light on how to achieve this?

https://www.dropbox.com/s/p1a5cu567lxntmw/MyData.csv?dl=0

Thank you in advance.

James

P.S. @nrussell, I have copied and pasted the output of the code you mentioned below; it continues as a series of numbers like those displayed.

 dput(z[ , 1:10])
 structure(list(`1` = c(0, 0, 0, 0, 0, 0, 0, 0, 0.0311410340342049, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0207444023791158, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0312971643732546, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0376287494579976, 0, 0, 0, 0, 0, 0, 0),
 .........
 `10` = c(0, 0, 0, 0, 0.119280313679916, 0, 0, 0.301029995663981, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.715681882079494, 0.136831816210901, 0, 0, 0, 0.0273663632421801, 0, 0, 0, 0.0547327264843602, 0, 0, 0, 0, 0.0231561535126139, 0, 0, 0.0903089986991944, 0, 0, 0.0752574989159953, 0.159368821233872, 0.0272640716982664, 0.0177076468037636, 0, 0, 0.120411998265592, 0, 0, 0, 0, 0.0322532138211408, 0.0250858329719984, 0, 0, 0, 0.119280313679916, 0, 0.172922500085254, 0.225772496747986, 0, 0, 0, 0.0954242509439325, 0)),
 .Names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"),
 class = "data.frame", row.names = c(NA, -62L))


Answer 1:


We could split the dataset ('df'), which has 1416 columns, into 12 equal-sized pieces of 118 columns each by creating a grouping index with gl:

 lst <- setNames(lapply(split(1:ncol(df), as.numeric(gl(ncol(df), 118,
            ncol(df)))), function(i) df[,i]), paste0('site', 1:12))
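To see what the grouping index looks like, here is a minimal sketch on a toy case of 8 column positions in blocks of 2. The names n_cols, width, grp and idx are made up for illustration; the call uses the shorter form gl(n, k) rather than the three-argument form above.

```r
# Toy case (assumed sizes): 8 column positions, blocks of 2 columns.
n_cols <- 8
width  <- 2

# gl(n, k) builds a factor with n levels, each repeated k times in order.
grp <- gl(n_cols / width, width)
print(as.integer(grp))
# [1] 1 1 2 2 3 3 4 4

# split() collects the column positions that share a group label.
idx <- split(seq_len(n_cols), grp)
print(idx[["1"]])
# [1] 1 2
print(idx[["4"]])
# [1] 7 8
```

Each element of idx can then be used as a column index, as in df[, i] above.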

Or you can create 'lst' without using split:

 lst <- setNames(lapply(seq(1, ncol(df), by = 118), 
            function(i) df[i:(i+117)]), paste0('site', 1:12))
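Since the question asks for something reusable on future datasets, the seq-based approach could be wrapped in a small function. This is only a sketch: split_columns, its arguments, and the toy data frame are hypothetical names, not part of the original answer, and it assumes the data frame has at least one column.

```r
# Hypothetical helper: split the columns of `df` into consecutive blocks of
# `width` columns and name the pieces prefix1, prefix2, ...
split_columns <- function(df, width, prefix = "site") {
  starts <- seq(1, ncol(df), by = width)
  pieces <- lapply(starts, function(i) {
    # The last block is narrower when ncol(df) is not a multiple of width.
    df[, i:min(i + width - 1, ncol(df)), drop = FALSE]
  })
  setNames(pieces, paste0(prefix, seq_along(pieces)))
}

# Made-up data frame: 10 rows, 8 columns named V1..V8.
toy <- as.data.frame(matrix(seq_len(80), ncol = 8))
sites <- split_columns(toy, width = 2)

print(names(sites))
# [1] "site1" "site2" "site3" "site4"
print(names(sites$site4))
# [1] "V7" "V8"
```

For the 1416-column dataset the equivalent call would be split_columns(df, width = 118), which should give 12 pieces.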

If we need to create 12 dataset objects in the global environment, list2env is an option (though I would prefer to keep working within 'lst' itself):

 list2env(lst, envir=.GlobalEnv)
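One reason to stay within the list is that the same operation can be applied to every site in a single call. A minimal sketch follows, using a made-up two-element list (lst_demo) in place of the real 'lst':

```r
# Made-up stand-in for 'lst': a named list of two small data frames.
lst_demo <- list(
  site1 = data.frame(a = c(1, 2, 3), b = c(4, 5, 6)),
  site2 = data.frame(c = c(0, 0, 3), d = c(2, 2, 2))
)

# Apply the same summary (sum of all values) to every site at once.
totals <- vapply(lst_demo, function(d) sum(colSums(d)), numeric(1))
print(totals)

# A single site can still be pulled out by name when needed.
print(head(lst_demo[["site2"]], 2))
```

With separate objects in the global environment, the same summary would have to be written out once per site.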

Using a small dataset ('df1') with '8' columns

  lst1 <- setNames(lapply(split(1:ncol(df1), as.numeric(gl(ncol(df1), 
         2, ncol(df1)))), function(i) df1[,i]), paste0('site', 1:4))
  list2env(lst1, envir=.GlobalEnv)

  head(site1,3)
  #  V1 V2
  #1  6 12
  #2  4  7
  #3 14 14

 head(site4,3)
 #  V7 V8
 #1 10  2
 #2  5  4
 #3  5  0

data

set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8*10, replace=TRUE), ncol=8))


Source: https://stackoverflow.com/questions/29101214/selecting-column-sequences-and-creating-variables
