R - create a scatter plot from a number of csv files automatically after filtering unwanted data

Posted by 此生再无相见时 on 2019-12-24 14:34:36

Question


I have attached a sample CSV file; I have about 100 files like it.

Some of its columns and its first 19 rows should be filtered out, and the rest of the data should be used to plot graphs and export them in a more usable format.

The data are generated by NetLogo BehaviorSpace. I could not find a good way to visualize the output directly, so each experiment leaves me with many CSV files.

Data file: https://www.dropbox.com/s/nj243qcs6sx6fu8/Rates.csv
Sample output: https://www.dropbox.com/s/suomh0vwsfzisj4/SampleGraph.jpg

For example, the bold columns are the ones I need to plot:

x,y,color,pen down?,x,y,color,pen down?,x,y,color,pen down?,x,y,color,pen down?

thanks ;)


Answer 1:


Here's an answer for this one plot:

dat <- read.csv('Rates.csv', stringsAsFactors = FALSE, skip = 19)
colnames(dat)[which(names(dat) %in% c("y", "y.1", "y.2", "y.3"))] <- c("Age", "Revenge", "Homicide", "Hunger")

require(reshape2)
tmp <- melt(dat, id.vars = "x", measure.vars = c("Age", "Revenge", "Homicide", "Hunger"))
require(ggplot2)
ggplot(tmp,aes(x, value)) + 
        geom_point(aes(colour = factor(variable))) +
        xlab("time") +
        ylab("units") +
        ggtitle("My CSV file") +
        labs(colour = "my variables")
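
The names "y.1", "y.2" and "y.3" used above are not in the file itself: the header repeats x,y,color,pen down? several times, and read.csv (with its default check.names = TRUE) makes the names syntactic and unique. A minimal sketch of that behaviour, using a made-up in-memory CSV rather than the real Rates.csv:

```r
# Made-up two-row sample with the same repeated header pattern as the NetLogo export
csv_text <- "x,y,color,pen down?,x,y,color,pen down?
0,1.5,5,true,0,2.5,15,true
1,1.7,5,true,1,2.9,15,true"

demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Duplicated headers get numeric suffixes, and "pen down?" is made syntactic
print(names(demo))

# The second "y" column is therefore addressed as y.1
print(demo$y.1)
```

Checking names() on one real file before writing the loop is a cheap way to confirm which suffixes your own exports produce.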

And here's how you might use it with your 100s of CSV files...

files <- dir("C:/my-csv-files", recursive = TRUE, full.names = TRUE, pattern = "\\.(csv|CSV)$")
listcsvs <- lapply(files, function(f) read.csv(f, stringsAsFactors = FALSE, skip = 19))
names(listcsvs) <- files
library(reshape2)
library(ggplot2)
for (i in seq_along(listcsvs)) {
  # melt the i-th data frame from the list (not the single `dat` from the first example)
  tmp <- melt(listcsvs[[i]], id.vars = "x", measure.vars = c("y", "y.1", "y.2", "y.3"))
  # inside a loop a ggplot object must be printed explicitly to be drawn
  print(ggplot(tmp, aes(x, value)) +
    geom_point(aes(colour = factor(variable))) +
    xlab("time") +
    ylab("units") +
    ggtitle(names(listcsvs)[i]))
}



Answer 2:


Without knowing the naming structure of the files, I can't be sure this will pick up exactly the files you want. The following code takes all .csv files in the current directory and plots each one, writing a .png of the plot to the same directory under the same filename as the source .csv (apart from the extension).

# get list of .csv files
files <- dir(".", pattern = "\\.csv$", full.names = TRUE, ignore.case = TRUE)
# we'll use ggplot2 for plotting and reshape2 to get the data in shape
library(ggplot2)
library(reshape2)
# loop through the files
for (file in files) {
    # load the data, skipping the first 19 lines
    df <- read.csv(file, as.is = TRUE, skip = 19)
    # keep only the columns we want
    df <- df[,c(1,2,6,10,14)]
    # put the names back in 
    names(df) <- c('x','Age','Revenge','Homicide','Hunger')
    # convert to long format
    df <- melt(df, id=c('x'))
    # create the png name by swapping only the file extension
    name <- sub("\\.csv$", ".png", file, ignore.case = TRUE)
    # begin png output
    png(name, width=400, height=300)
    # plot
    p <- ggplot(df, aes(x=x, y=value, colour=variable)) +
            # line plot
            geom_line() +
            # use the black and white theme
            theme_bw() +
            xlab('x label') +
            ylab('y label')
    # we have to explicitly print to png
    print(p)
    # finish output to png
    dev.off()
}
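
Since the loop writes each plot next to its source file, the output name should differ from the input name only in its extension. One possible way to derive it, sketched here with a hypothetical helper csv_to_png (not part of the original answer) built on tools::file_path_sans_ext from base R:

```r
# Hypothetical helper: swap whatever extension the path has for ".png"
csv_to_png <- function(path) {
  paste0(tools::file_path_sans_ext(path), ".png")
}

# Only the extension changes, even if "csv" also appears in a directory name
print(csv_to_png("./csv-runs/Rates.csv"))

# Upper-case extensions are handled the same way
print(csv_to_png("A.CSV"))
```

This matters when files are matched case-insensitively: an output name that accidentally equals the input name would make png() overwrite the CSV.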




Answer 3:


This should get you started:

Df <- read.csv('https://dl.dropboxusercontent.com/s/nj243qcs6sx6fu8/Rates.csv?dl=1&token_hash=AAEvqZvmuesLhKJSrYHiasj-h0ULrABzbU0Q39bU6FJSCQ', 
           skip=19)

X <- as.matrix(Df[,grep("x",names(Df))])
Y <- as.matrix(Df[,grep("y",names(Df))])

matplot(X, Y, type="l", lty=1)

If you put this into a loop you can produce graphs for all your files.
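
A rough sketch of such a loop, using only base R. The helper name plot_rates_file, the directory "my-csv-files", and the PNG size and axis labels are assumptions for illustration; the column selection follows the matplot code above.

```r
# Hypothetical helper: plot one NetLogo export with matplot() and save it as a PNG
plot_rates_file <- function(csv_path, skip = 19) {
  Df <- read.csv(csv_path, skip = skip)
  # Columns named x, x.1, ... and y, y.1, ... (names as produced by read.csv)
  X <- as.matrix(Df[, grep("^x", names(Df)), drop = FALSE])
  Y <- as.matrix(Df[, grep("^y", names(Df)), drop = FALSE])
  png_path <- paste0(tools::file_path_sans_ext(csv_path), ".png")
  png(png_path, width = 400, height = 300)
  on.exit(dev.off())  # close the device even if plotting fails
  matplot(X, Y, type = "l", lty = 1, xlab = "x", ylab = "y",
          main = basename(csv_path))
  invisible(png_path)
}

# Assumed directory name; replace with wherever the exports live
files <- list.files("my-csv-files", pattern = "\\.csv$",
                    full.names = TRUE, ignore.case = TRUE)
for (f in files) plot_rates_file(f)
```

Wrapping the per-file work in a function keeps the loop to one line and makes it easy to try the plot on a single file first.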



Source: https://stackoverflow.com/questions/19855261/r-create-a-scatter-plot-from-a-number-of-csv-files-automatically-after-filteri
