write.table inside a function applied to a list of data frames overwrites outputs

心不动则不痛 submitted on 2019-12-29 09:38:10

Question


I have almost finished some messy code that applies several statistical methods/tests to 11 data frames from different watersheds, with physico-chemical parameters as variables. I reached my goal, but I need to make the code functional. To start, I wrote a function that computes correlations and saves the results as .txt tables and .pdf images. It works great when I run the function on one data frame at a time (for that you have to import each data frame separately using read.table, which is not shown in the code below). Since I want it to be functional, I made a list of the 11 data frames and used lapply to run the function on each one. It works in the sense that it gives me one list (corr) containing the correlation results for each data frame.

Here come the issues:

  1. The list corr with the correlation results for each data frame looks like it holds values instead of data frames, so I don't know how to access or save them (see the corr list in the Environment/Data window). Well, up to this point it at least looks like the correlation results exist somewhere.
  2. The second problem is that corr<-lapply(PQ_data, cor_PQ) runs a function that has lines to save the outputs as tables (.txt) and images (.pdf), using part of the name of the original data frame (e.g. the first element of PQ_data is "AgIX_E_PQ", so the table and plot from cor_PQ(PQ_data[["AgIX_E_PQ"]]) should be named "mCorAgIX_E_PQ.txt" and "CorAgIX_E_PQ.pdf" respectively), but I am getting just one pair of outputs (mCorX[[i]].txt and CorX[[i]].pdf) holding the correlation result of the last data frame. That is, the tables and images for each data frame are overwritten in these generic mCorX[[i]].txt and CorX[[i]].pdf files.

Now I guess I have to define 'i' or something to avoid this. Should I define the cor_PQ function for PQ_data instead of X?

If anyone can see where I'm going wrong, I would appreciate any help to solve this.

My data: PQ_data (save it in your workspace and point setwd to it). My code:

rm(list=ls(all=TRUE))
cat("\014")

setwd("C:/Users/Sol/Documents/ProyectoTítulo/CalidadAgua/Matrices/Regs") #my workspace

PQ_files<-list.files(path="C:/Users/Sol/Documents/ProyectoTítulo/CalidadAgua/Matrices/Regs",
                     pattern="\\_PQ.txt") #my list of 14 dataframes in my workspace.
PQ_data<-lapply(PQ_files, read.table) #read tables of the 14 dataframes in the list.
names(PQ_data)<-gsub("\\_PQ.txt","", PQ_files) #name the 14 dataframes with their original names.

#FUNCTION TO COMPUTE CORRELATIONS, SAVE TABLES AND PLOTS.
cor_PQ<-function(X) {
  corPQ<-cor(X, use="pairwise.complete.obs")
  outputname.txt<-paste0("mCor",deparse(substitute(X)),".txt")
  write.table(corPQ, file=outputname.txt)
  outputname.pdf<-paste0("Cor",deparse(substitute(X)),".pdf")
  pdf(outputname.pdf)
  plot(X)
  dev.off()
  return(corPQ)
}

corr<-lapply(PQ_data, cor_PQ)

After this, as I said, I get a list called "corr" with 11 elements containing the correlation results for each data frame in my list (PQ_data), but I can't access them as tables when I open the "corr" list in my Environment/Data window (they don't show the blue R arrow to expand the element). And I get only 2 output files, called mCorX[[i]].txt and CorX[[i]].pdf, showing only the correlation result of the last data frame, because write.table and pdf overwrite the results of the 10 previous calculations. Again, I would appreciate any help. I really need a push to get the idea. Thanks!!!


Answer 1:


lapply doesn't pass the names of the list to the function; it passes only the elements. So although the function works for an individual data frame, it doesn't work with a list of them: inside lapply, deparse(substitute(X)) evaluates to the same string, "X[[i]]", for every element. All the generated files are therefore given the same name, each new file overwrites the previous one, and in the end you are left with a single output that holds the last element of your list. You can use the function below, which takes the name as a separate parameter and uses it to name the files.

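# X: a data frame; Y: the name used to label the output files.
cor_PQ<-function(X, Y) {
  corPQ<-cor(X, use="pairwise.complete.obs")
  # Build the file names from the name passed in, not from the argument expression.
  outputname.txt<-paste0("mCor",Y,".txt")
  write.table(corPQ, file=outputname.txt)
  outputname.pdf<-paste0("Cor",Y,".pdf")
  pdf(outputname.pdf)
  plot(X)
  dev.off()
  return(corPQ)
}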

Now use Map to apply the same function.

Map(cor_PQ, PQ_data, names(PQ_data))
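
A small self-contained check of the explanation above, using toy data rather than the watershed files. The list `toy`, its element names `siteA`/`siteB`, and the helper functions `label_from_arg` and `label_from_name` are made up for illustration. The sketch compares the label that the original function would build inside lapply with the label built when Map passes each name explicitly.

```r
# Toy stand-in for PQ_data; names and values are invented for illustration.
toy <- list(siteA = data.frame(x = 1:5, y = c(2, 4, 5, 4, 6)),
            siteB = data.frame(x = 1:5, y = c(9, 7, 6, 4, 1)))

# What the original function used to build its file names.
label_from_arg <- function(X) deparse(substitute(X))
labels_lapply <- unlist(lapply(toy, label_from_arg), use.names = FALSE)
# Every element gets the same label, so every file name collides.

# Passing the names as a second argument gives one label per element.
label_from_name <- function(X, Y) paste0("mCor", Y, ".txt")
labels_map <- unlist(Map(label_from_name, toy, names(toy)), use.names = FALSE)

print(labels_lapply)
print(labels_map)
```

Printing both vectors side by side makes the collision visible without writing any files.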

We can also use imap from purrr to apply this function.

purrr::imap(PQ_data, cor_PQ)
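
On the first issue in the question (how to get at the results): cor() returns a numeric matrix, so the list holds matrices rather than data frames. A minimal sketch, assuming the result of the Map call is stored in a variable; the toy data, the function name `cor_to_file`, and the output folder under tempdir() are hypothetical and only there to keep the example runnable. The pdf step is left out for brevity.

```r
# Toy stand-in for PQ_data; names and values are invented for illustration.
toy <- list(siteA = data.frame(x = 1:5, y = c(2, 4, 5, 4, 6)),
            siteB = data.frame(x = 1:5, y = c(9, 7, 6, 4, 1)))

# Hypothetical output folder, so the example does not touch the workspace.
out_dir <- file.path(tempdir(), "cor_demo")
dir.create(out_dir, showWarnings = FALSE)

# Same idea as cor_PQ(X, Y), reduced to the table output.
cor_to_file <- function(X, Y) {
  m <- cor(X, use = "pairwise.complete.obs")
  write.table(m, file = file.path(out_dir, paste0("mCor", Y, ".txt")))
  m
}

# Map keeps the names of its first list argument on the result.
corr <- Map(cor_to_file, toy, names(toy))

# Access one result by name and convert it to a data frame if needed.
siteA_matrix <- corr[["siteA"]]
siteA_df <- as.data.frame(siteA_matrix)

# One file per list element should now exist in the output folder.
written <- sort(list.files(out_dir, pattern = "^mCor.*\\.txt$"))
print(names(corr))
print(written)
```

Elements can also be reached by position, e.g. corr[[1]], and as.data.frame is only needed when a later step requires a data frame rather than a matrix.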


Source: https://stackoverflow.com/questions/59376151/write-table-inside-a-function-applied-to-a-list-of-data-frames-overwrite-outputs
