Extract bz2 file in R

Submitted by …衆ロ難τιáo~ on 2019-11-26 19:35:13

Question


I have a bunch of .csv.bz2 files that I need to download, extract, and read in R. I downloaded one file and want to extract it to the current working directory, then read it. I tried unz(filename, filename.csv), but it does not seem to work. How can I do that?

I also heard somewhere that bz2 files can be read directly without decompressing them first. How can I do that?


Answer 1:


You can use either of these two commands:

  1. read.csv(): pass the compressed file name directly. R detects the bzip2 compression and decompresses the data while reading.

    read.csv("file.csv.bz2")

  2. read.table(): the generic version of read.csv(). It lets you set the delimiter and the other options that read.csv() sets automatically. You do not need to decompress the file separately; this command also does it for you.

    read.table("file.csv.bz2", header = TRUE, sep = ",", quote = "\"")
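To check this behaviour without a real download, here is a minimal round-trip sketch in base R. The temporary file and the toy data frame are made up for illustration; only bzfile(), write.csv(), and read.csv() come from R itself.

```r
# Hypothetical sample data written to a temporary .csv.bz2 file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")          # text-mode connection that compresses on write
write.csv(data.frame(id = 1:3, value = c("a", "b", "c")), con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects bzip2 compression itself
df <- read.csv(tmp)
print(df)
```

The same call works on a downloaded file as long as the path points to a valid bzip2-compressed CSV.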




Answer 2:


Like this:

readcsvbz2file <- read.csv(bzfile("file.csv.bz2"))
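Since the question mentions a whole batch of files, the bzfile() approach can be applied to every file in a directory. The sketch below is one possible way to do it, assuming all files share the same columns; the directory, file names, and contents are hypothetical and created only so the example runs on its own.

```r
# Hypothetical setup: two small compressed CSVs in a temporary directory
data_dir <- tempfile("bz2_demo_")
dir.create(data_dir)
for (i in 1:2) {
  con <- bzfile(file.path(data_dir, sprintf("part%d.csv.bz2", i)), open = "wt")
  write.csv(data.frame(part = i, x = 1:3), con, row.names = FALSE)
  close(con)
}

# Find every .csv.bz2 file and read each one through a bzfile() connection
paths <- list.files(data_dir, pattern = "\\.csv\\.bz2$", full.names = TRUE)
tables <- lapply(paths, function(p) read.csv(bzfile(p)))

# Stack the pieces into one data frame (assumes identical column names)
combined <- do.call(rbind, tables)
```

read.csv() opens and closes each unopened connection itself, so no explicit close() is needed in the lapply() call.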



Answer 3:


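On Linux systems, you can use the very fast fread() from data.table and feed it the output of bzcat:

library(data.table)
# bzcat decompresses to stdout; tr strips NUL bytes before fread parses the text.
# The cmd = argument is available in recent data.table versions (1.11.6 or later, as far as I know).
fread(cmd = sprintf("bzcat %s | tr -d '\\000'", "file.csv.bz2"))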

Reference: https://gist.github.com/wush978/93c0f96b68f529678e2d




Answer 4:


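To extract the file to disk first, use bunzip2() from the R.utils package, then read the resulting CSV:

library(R.utils)
# remove = FALSE keeps the original .bz2 file;
# skip = TRUE leaves an already existing dataset.csv untouched
bunzip2("dataset.csv.bz2", "dataset.csv", remove = FALSE, skip = TRUE)

dataset <- read.csv("dataset.csv")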

See the documentation for bunzip2 {R.utils}.
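If installing R.utils is not an option, the same extraction can be sketched in base R by copying bytes from a bzfile() connection to a plain file. This is an alternative technique, not what bunzip2() does internally; the function name decompress_bz2 and its chunk parameter are hypothetical.

```r
# Hypothetical helper: stream-decompress src into dest in fixed-size chunks
decompress_bz2 <- function(src, dest = sub("\\.bz2$", "", src), chunk = 1000000L) {
  input <- bzfile(src, open = "rb")      # binary read, decompressing on the fly
  on.exit(close(input), add = TRUE)
  output <- file(dest, open = "wb")      # plain uncompressed output file
  on.exit(close(output), add = TRUE)
  repeat {
    bytes <- readBin(input, what = "raw", n = chunk)
    if (length(bytes) == 0) break        # end of the compressed stream
    writeBin(bytes, output)
  }
  invisible(dest)
}

# Usage with a made-up temporary file
src <- tempfile(fileext = ".csv.bz2")
con <- bzfile(src, open = "wt")
writeLines(c("a,b", "1,2"), con)
close(con)
csv_path <- decompress_bz2(src)
dataset <- read.csv(csv_path)
```

Reading in chunks keeps memory use bounded for large files, at the cost of a little more code than a single readBin() call.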




Answer 5:


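According to the read.table documentation, a compressed file can be read directly. For a CSV file, set the header and separator explicitly, because read.table() defaults to whitespace-separated data without a header row.

read.table("file.csv.bz2", header = TRUE, sep = ",")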


Source: https://stackoverflow.com/questions/25948777/extract-bz2-file-in-r
