colnames() function in R - Treating table values as independent objects/variables

这一生的挚爱, submitted on 2020-02-25 05:17:49

Question


I have a list of values which I would like to use as names for separate tables scraped from separate URLs on a certain website.

> Fac_table
[[1]]
[1] "fulltime_fac_table"

[[2]]
[1] "parttime_fac_table"

[[3]]
[1] "honorary_fac_table"

[[4]]
[1] "retired_fac_table"

I would like to loop through the list to automatically generate 4 tables with the respective names.

The result should look like this:

> fulltime_fac_table
    職稱          
V1  "教授兼系主任"
V2  "教授"        
V3  "教授"        
V4  "教授"        
V5  "特聘教授"    

> parttime_fac_table
    職稱       姓名    
V1  "教授"     "XXX"
V2  "教授"     "XXX"
V3  "教授"     "XXX"
V4  "教授"     "XXX"
V5  "教授"     "XXX"
V6  "教授"     "XXX"

I have another list, named 'headers', containing column headings of the respective tables online.

> headers
[[1]]
[1] "職稱"             "姓名"             "    研究領域"
[4] "聯絡方式"        

[[2]]
[1] "職稱"     "姓名"     "研究領域" "聯絡方式"

I was able to assign values to the respective tables with this code:

> assign(eval(parse(text="Fac_table[[i]]")), as_tibble(matrix(fac_data,
+   nrow = length(headers[[i]]))))

This results in a populated table, without column headings, like this one:

> honorary_fac_table
    [,1]       [,2]    
V1  "名譽教授" "XXX"
V2  "名譽教授" "XXX"
V3  "名譽教授" "XXX"
V4  "名譽教授" "XXX"

But was unable to assign column names to each table.

Neither of the code below worked:

> assign(colnames(eval(parse(text="Fac_table[1]"))), c(gsub("\\s", "", headers[[1]])))
Error in assign(colnames(eval(parse(text = "Fac_table[1]"))), c(gsub("\\s",  : 
  invalid first argument

> colnames(eval(parse(text="Fac_table[i]"))) <- c(gsub("\\s", "", headers[[i]]))
Error in colnames(eval(parse(text = "Fac_table[i]"))) <- c(gsub("\\s",  : 
  target of assignment expands to non-language object

> do.call("<-", colnames(eval(parse(text="Fac_table[i]"))), c(gsub("\\s", "", headers[[i]])))
Error in do.call("<-", colnames(eval(parse(text = "Fac_table[i]"))), c(gsub("\\s",  : 
  second argument must be a list

To simplify the issue, a reproducible example is as follows:

> varNamelist <- list(c("tbl1","tbl2","tbl3","tbl4"))
> colHeaderlist <- list(c("col1","col2","col3","col4"))
> tableData <- matrix(1:12, ncol=4)

This works:

> assign(eval(parse(text="varNamelist[[1]][1]")), matrix(tableData,
+   ncol = length(colHeaderlist[[1]])))

But this doesn't:

> colnames(as.name(varNamelist[[1]][1])) <- colHeaderlist[[1]]
Error in `colnames<-`(`*tmp*`, value = c("col1", "col2", "col3", "col4" : 
  attempt to set 'colnames' on an object with less than two dimensions

It seems that the colnames() function in R cannot treat a string such as the one stored in Fac_table[[i]] as the name of a variable that holds its own data (separate from Fac_table).

> colnames(as.name(Fac_table[[1]])) <- headers[[1]]
Error in `colnames<-`(`*tmp*`, value = c("a", "b", "c",  : 
  attempt to set 'colnames' on an object with less than two dimensions

Substituting for 'fulltime_fac_table' directly works fine.

> colnames(fulltime_fac_table) <- headers[[1]]

Is there any way around this issue?

Thanks!


Answer 1:


There is a solution to this, but if I understand correctly, the current setup may be more complex than necessary, so I'll try to make the task easier.
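
For reference, the direct fix can be sketched roughly as follows. colnames(x) <- value needs x to be a variable it can modify, whereas as.name() only returns a symbol, which has no dimensions. One way around that, assuming the tables live in the current environment, is to fetch each object with get(), set the names on a temporary copy, and write the copy back with assign(). The names nm and tmp are placeholders chosen for this sketch only.

```r
# Data from the reproducible example in the question
varNamelist <- list(c("tbl1", "tbl2", "tbl3", "tbl4"))
colHeaderlist <- list(c("col1", "col2", "col3", "col4"))
tableData <- matrix(1:12, ncol = 4)

for (nm in varNamelist[[1]]) {
  assign(nm, tableData)                # create the variable named by the string
  tmp <- get(nm)                       # fetch the object behind that name
  colnames(tmp) <- colHeaderlist[[1]]  # set the names on the temporary copy
  assign(nm, tmp)                      # write the modified copy back
}

colnames(tbl1)
```

This works, but every table needs a get()/assign() round trip, which is part of why the approach below avoids it.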

If you're working with one-dimensional data, I'd recommend using vectors, as they're more appropriate than lists for that purpose. So for this project, I'd begin by storing the names of tables and headers, like this:

varNamelist <- c("tbl1","tbl2","tbl3","tbl4")
colHeaderlist <- c("col1","col2","col3","col4")
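
To illustrate the difference, here is a small comparison of the two ways of storing the names. The variable names as_list and as_vector are made up for this illustration.

```r
as_list <- list(c("tbl1", "tbl2", "tbl3", "tbl4"))  # a list holding one vector
as_vector <- c("tbl1", "tbl2", "tbl3", "tbl4")      # a plain character vector

length(as_list)    # 1: the list has a single element (the whole vector)
length(as_vector)  # 4: one element per name

as_list[[1]][2]    # "tbl2", reached through two levels of indexing
as_vector[2]       # "tbl2", reached through one level of indexing
```

With the vector there is no need for the extra [[1]] that appears throughout the original code.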

It's still difficult to tell from your question what format the data for these tables is in and where it comes from, but in general a data frame can be easier to work with than a matrix, as long as you're not working with Big Data. The assign function is also typically not necessary for this sort of step. Instead, when setting up a data frame, we can supply the name of the data frame, the names of the columns, and the data contents all at once, like this:

tbl1 <- data.frame("col1"=c(1,2,3),
                   "col2"=c(4,5,6),
                   "col3"=c(7,8,9),
                   "col4"=c(10,11,12))

Again, we're using vectors, noted by the c() instead of list(), to fill each column, since each column is its own single dimension.

To check the output of tbl1, we can then use print():

print(tbl1)

  col1 col2 col3 col4
1    1    4    7   10
2    2    5    8   11
3    3    6    9   12

If it's an option to create the tables closer to the way shown here, that might be easier than using so many lists and assign calls, which quickly becomes overly complicated.

But if you want to store all the tables in a single place at the end, you could put them in a list (assuming tbl2 through tbl4 were created the same way as tbl1):

tableList <- list(tbl1=tbl1, tbl2=tbl2, tbl3=tbl3, tbl4=tbl4)

str(tableList)
List of 4
 $ tbl1:'data.frame':   3 obs. of  4 variables:
  ..$ col1: num [1:3] 1 2 3
  ..$ col2: num [1:3] 4 5 6
  ..$ col3: num [1:3] 7 8 9
  ..$ col4: num [1:3] 10 11 12
 $ tbl2:'data.frame':   3 obs. of  4 variables:
  ..$ col1: num [1:3] 1 2 3
  ..$ col2: num [1:3] 4 5 6
  ..$ col3: num [1:3] 7 8 9
  ..$ col4: num [1:3] 10 11 12
 $ tbl3:'data.frame':   3 obs. of  4 variables:
  ..$ col1: num [1:3] 1 2 3
  ..$ col2: num [1:3] 4 5 6
  ..$ col3: num [1:3] 7 8 9
  ..$ col4: num [1:3] 10 11 12
 $ tbl4:'data.frame':   3 obs. of  4 variables:
  ..$ col1: num [1:3] 1 2 3
  ..$ col2: num [1:3] 4 5 6
  ..$ col3: num [1:3] 7 8 9
  ..$ col4: num [1:3] 10 11 12
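
If the tables need to be generated in a loop, as in the question, the same named list could be built without assign at all. The following is only a sketch: it reuses the example names and the 3 x 4 matrix from the question as stand-in data, and build_table is a hypothetical helper name.

```r
varNamelist <- c("tbl1", "tbl2", "tbl3", "tbl4")
colHeaderlist <- c("col1", "col2", "col3", "col4")
tableData <- matrix(1:12, ncol = 4)  # stand-in for the scraped values

# Hypothetical helper: turn one block of data into a data frame with headers
build_table <- function(data, headers) {
  df <- as.data.frame(data)
  colnames(df) <- headers
  df
}

# One data frame per table name, collected in a single list
tableList <- lapply(varNamelist, function(nm) build_table(tableData, colHeaderlist))
names(tableList) <- varNamelist

tableList[["tbl2"]]  # look a table up by its name
```

Each table is then reached through its name as a string, which is what the original assign-based code was trying to achieve with separate variables.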



Answer 2:


I've found a workaround based on @Ryan's recommendation, given by this code:

library(rvest) # provides read_html(), html_nodes(), html_text() and %>%

# url (the list of page URLs) and headers are assumed to be defined beforehand
for (i in seq_along(url)) {

  webpage <- read_html(url[i]) # read the HTML of each URL in turn

  fac_data <- html_nodes(webpage, '.tableunder') %>% html_text()
  fac_data1 <- html_nodes(webpage, '.tableunder1') %>% html_text()
  fac_data <- c(fac_data, fac_data1) # combine the table cells found on this page

  # reshape the cells into a matrix with one column per header
  x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow = TRUE)

  for (j in seq_along(headers[[i]])) {
    y <- cbind(x[, j]) # extract column j as a one-column matrix
    colnames(y) <- as.character(headers[[i]][j]) # add the column name
    # print each column in sequence; storing it with 'z <- cbind(y)' instead
    # would overwrite z on every iteration
    print(cbind(y))
  }
}

I am now able to print out all values, complete with headers of the data in question.
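
One possible way to keep the results instead of only printing them is to set all column names on the matrix at once and store each matrix in a named list. The sketch below is not the code I actually ran: it uses made-up placeholder values in scraped in place of the html_text() output, and table_names, scraped and tables are names chosen for the illustration.

```r
table_names <- c("fulltime_fac_table", "parttime_fac_table")
headers <- list(c("title", "  name"), c("title", "name", "field"))

# Placeholder cell values standing in for the scraped text of each page
scraped <- list(
  c("Professor", "A", "Professor", "B"),
  c("Professor", "C", "X", "Professor", "D", "Y")
)

# Pre-allocate a list with one named slot per table
tables <- vector("list", length(table_names))
names(tables) <- table_names

for (i in seq_along(table_names)) {
  x <- matrix(scraped[[i]], ncol = length(headers[[i]]), byrow = TRUE)
  colnames(x) <- gsub("\\s", "", headers[[i]])  # strip stray whitespace from headers
  tables[[i]] <- x                              # keep the table; nothing is overwritten
}

tables$parttime_fac_table
```

Because every table gets its own slot in the list, the results from earlier iterations are still available after the loop ends.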

Follow-up questions have been posted here.


The final code solved this problem as well.



Source: https://stackoverflow.com/questions/46059697/colnames-function-in-r-treating-table-values-as-independant-objects-variable
