Question
I have a CSV file with 1.5 million rows and two columns, name and email. I want to write a program in R that reads the file and splits it into separate CSV files of 5000 rows each.
Maybe I can do this with a loop: write rows 1 to 5000 to project1.csv, rows 5001 to 10000 to project2.csv, rows 10001 to 15000 to project3.csv, and so on, all in my working directory. Any suggestions?
Answer 1:
Assuming 'df1' is the data.frame that needs to be segmented every 5000 rows, with each segment saved to a new file: we build a grouping index from the sequence of row numbers and use it to split the dataset into a list (lst). We then loop over the sequence of list elements with lapply(...) and write each element to a new file with write.csv.
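n <- 5000  # rows per output file
# Grouping index: rows 1..n get 1, rows n+1..2n get 2, and so on
lst <- split(df1, (seq_len(nrow(df1)) - 1) %/% n + 1L)
# Write each chunk to project1.csv, project2.csv, ... in the working directory
invisible(lapply(seq_along(lst), function(i)
  write.csv(lst[[i]], file = paste0('project', i, '.csv'), row.names = FALSE)))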
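To see what the grouping index does, here is a small sketch on a toy data.frame. The 7 rows, the chunk size of 3, and the example.com addresses are made up for illustration; only the index expression comes from the answer.

```r
# Toy data standing in for the real 1.5 million row file (hypothetical values)
df1 <- data.frame(name  = letters[1:7],
                  email = paste0(letters[1:7], "@example.com"))
n <- 3  # small chunk size so the grouping is easy to read

# Same index expression as in the answer: integer division of (row - 1) by n
idx <- (seq_len(nrow(df1)) - 1) %/% n + 1L
print(idx)  # 1 1 1 2 2 2 3

# split() turns the index into one data.frame per chunk
lst <- split(df1, idx)
print(sapply(lst, nrow))  # chunk sizes: 3, 3, 1
```

The last chunk simply holds whatever rows are left over, so the row count does not have to be a multiple of n.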
Answer 2:
An answer using purrr and readr (the %>% pipe comes from magrittr, so that package or dplyr has to be loaded):
library(magrittr)  # provides %>%
n <- 5000
split(df1, (seq_len(nrow(df1)) - 1) %/% n + 1L) %>%
  purrr::iwalk(~ readr::write_csv(.x, paste0("project", .y, ".csv")))
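Both answers assume the whole file is already loaded as df1. If loading 1.5 million rows at once is a concern, one possible alternative (not from either answer) is to read and write in chunks with readr::read_csv_chunked. The sketch below is a rough outline under that assumption: the function name split_csv_chunked and the out_dir argument are hypothetical, and all columns are read as character because name and email are text.

```r
library(readr)

# Hypothetical helper: stream input_file in chunks of n rows and write each
# chunk to out_dir as project1.csv, project2.csv, ...
split_csv_chunked <- function(input_file, n = 5000, out_dir = ".") {
  chunk_no <- 0
  write_chunk <- function(chunk, pos) {
    # Called once per chunk; a counter is used for the file name instead of pos
    chunk_no <<- chunk_no + 1
    write_csv(chunk, file.path(out_dir, paste0("project", chunk_no, ".csv")))
  }
  read_csv_chunked(input_file,
                   callback   = SideEffectChunkCallback$new(write_chunk),
                   chunk_size = n,
                   col_types  = cols(.default = col_character()))
  invisible(chunk_no)  # number of files written
}
```

Calling split_csv_chunked("input.csv"), where input.csv is a placeholder for the real file name, would write the pieces into the working directory.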
Source: https://stackoverflow.com/questions/33054315/write-multiple-csv-files-in-a-loop