Question
I have a bunch of .csv.bz2 files, which I have to download, extract, and read in R.
I downloaded one of the files and want to extract it to the current working directory, then read it. I tried:
unz(filename,filename.csv)
but it does not seem to work. How can I do that?
I heard somewhere that bz2 files can be read directly without decompressing. How can I do that?
Answer 1:
You can use either of these two commands:
read.csv(): with this command you can directly supply the name of the compressed file containing the CSV.
read.csv("file.csv.bz2")
read.table(): this command is the generic version of read.csv(). You can set the delimiter and the other options that read.csv() sets automatically.
read.table("file.csv.bz2", header = TRUE, sep = ",", quote = "\"",...)
You don't need to uncompress the file separately. Both commands do it automatically for you.
Answer 2:
Like this:
readcsvbz2file <- read.csv(bzfile("file.csv.bz2"))
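To see how the bzfile() connection relates to passing the file name directly, here is a minimal round-trip sketch. The temporary file and the toy data frame are made up for illustration; the only assumption is that base R detects the bzip2 compression when given a plain path, as the other answers describe.

```r
# Write a small data frame to a bzip2-compressed CSV in a temporary location.
df <- data.frame(id = 1:3, name = c("a", "b", "c"))
path <- tempfile(fileext = ".csv.bz2")  # hypothetical file, for illustration only
con <- bzfile(path, open = "w")
write.csv(df, con, row.names = FALSE)
close(con)

# Read it back through an explicit bzfile() connection ...
via_connection <- read.csv(bzfile(path))

# ... and by passing the path directly, letting R detect the compression.
via_path <- read.csv(path)

# Compare the two results.
same <- identical(via_connection, via_path)
```

Both calls decompress on the fly, so nothing is extracted to disk.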
Answer 3:
On Linux systems you can make use of the very fast fread() from data.table, decompressing with bzcat and stripping NUL bytes with tr:
require(data.table)
fread(sprintf("bzcat %s | tr -d '\\000'", "file.csv.bz2"))
Reference: https://gist.github.com/wush978/93c0f96b68f529678e2d
Answer 4:
Basically, you need to type:
library(R.utils)
bunzip2("dataset.csv.bz2", "dataset.csv", remove = FALSE, skip = TRUE)
dataset <- read.csv("dataset.csv")
See the documentation for bunzip2 {R.utils}.
Answer 5:
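According to the read.table documentation, a compressed file can be read directly. For a CSV file, set the header and separator explicitly, since read.table() defaults to whitespace-separated input without a header:
read.table("file.csv.bz2", header = TRUE, sep = ",")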
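Since the question mentions a whole bunch of files, the same direct read can be applied across a directory. This is only a sketch: the demo directory, the file names, and the toy data are hypothetical, and it assumes every file has the same columns so the pieces can be stacked.

```r
# Create two small compressed CSV files in a temporary directory (illustrative data).
demo_dir <- file.path(tempdir(), "bz2_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:2) {
  con <- bzfile(file.path(demo_dir, sprintf("part%d.csv.bz2", i)), open = "w")
  write.csv(data.frame(part = i, value = 1:3), con, row.names = FALSE)
  close(con)
}

# Find every .csv.bz2 file and read each one directly, without extracting it first.
files <- list.files(demo_dir, pattern = "\\.csv\\.bz2$", full.names = TRUE)
tables <- lapply(files, read.table, header = TRUE, sep = ",")

# Stack the pieces into one data frame (assumes identical columns in every file).
combined <- do.call(rbind, tables)
```

If the files still have to be fetched first, download.file() can save each one locally before this step.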
Source: https://stackoverflow.com/questions/25948777/extract-bz2-file-in-r