bzip2

How to protect myself from a gzip or bzip2 bomb?

和自甴很熟 submitted on 2019-11-28 23:15:58
This is related to the question about zip bombs, but with gzip or bzip2 compression in mind, e.g. a web service accepting .tar.gz files. Python provides a handy tarfile module that is convenient to use, but it does not seem to provide protection against zip bombs. In Python code using the tarfile module, what would be the most elegant way to detect zip bombs, preferably without duplicating too much logic (e.g. the transparent decompression support) from the tarfile module? And, just to make it a bit less simple: no real files are involved; the input is a file-like object (provided by the web…
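A minimal sketch of one common defense, assuming the service can fix an upper bound on decompressed size (the MAX_TOTAL name and the 100 MB figure below are illustrative, not from the question): scan the member headers first, since tarfile exposes each member's decompressed size before any data is extracted.

```python
import tarfile

MAX_TOTAL = 100 * 1024 * 1024  # illustrative cap on total decompressed bytes

def check_archive(fileobj):
    """Reject a (transparently compressed) tar stream whose members
    declare more than MAX_TOTAL bytes in total, before extracting anything.

    Requires a seekable file-like object; mode "r:*" lets tarfile
    auto-detect gzip/bzip2 compression.
    """
    total = 0
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            total += member.size
            if total > MAX_TOTAL:
                raise ValueError("archive exceeds decompression budget")
    fileobj.seek(0)  # rewind so the caller can still extract afterwards
```

Because tarfile reads member data in bounded chunks, walking the headers this way does not itself inflate the whole archive into memory.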

How to build boost iostreams with gzip and bzip2 support on Windows

两盒软妹~` submitted on 2019-11-28 17:14:08
Question: How do I build Boost's iostreams library with gzip and bzip2 support? Answer 1: I am no expert, but this worked for me. Option 1 (straight from source): Download the source files for zlib and for bzip2. Extract the downloads to directories and move the directories somewhere you like. I had to avoid C:\Program Files (x86)\ as I couldn't get it to work with spaces in the directory name, so I created C:\Sys\ and used that. Open a command prompt with elevated privileges (run as administrator), go to your…

Python CRC-32 woes

久未见 submitted on 2019-11-28 11:32:57
I'm writing a Python program to extract data from the middle of a 6 GB bz2 file. A bzip2 file is made up of independently decompressible blocks of data, so I only need to find a block (they are delimited by magic bits), create a temporary one-block bzip2 file from it in memory, and finally pass that to the bz2.decompress function. Easy, no? The bzip2 format has a CRC-32 checksum for the file at the end. No problem, binascii.crc32 to the rescue. But wait. The data to be checksummed does not necessarily end on a byte boundary, and the crc32 function operates on a whole number of bytes. My plan:…
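One hedged sketch of the bit-level fallback (not validated against libbzip2): bzip2's CRC-32 uses the same 0x04C11DB7 polynomial as zlib but in the MSB-first, non-reflected orientation, so binascii.crc32 does not match it directly; a bit-at-a-time update also handles inputs that end mid-byte for free.

```python
def bz2_crc32_bits(bits, crc=0xFFFFFFFF):
    """Bit-at-a-time CRC-32 with the polynomial bzip2 uses (0x04C11DB7,
    MSB-first, non-reflected -- unlike the zlib/binascii convention).
    `bits` is an iterable of 0/1 integers, so the input does not need
    to end on a byte boundary. Illustrative sketch only."""
    for bit in bits:
        top = (crc >> 31) & 1
        crc = (crc << 1) & 0xFFFFFFFF
        if top ^ bit:
            crc ^= 0x04C11DB7
    return crc ^ 0xFFFFFFFF
```

A byte string expands to MSB-first bits with, e.g., bits = ((b >> (7 - i)) & 1 for b in data for i in range(8)).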

How to compress a directory with libbz2 in C++

孤街醉人 submitted on 2019-11-28 03:04:17
Question: I need to create a tarball of a directory and then compress it with bz2 in C++. Is there any decent tutorial on using libtar and libbz2? Answer 1: Okay, I worked up a quick example for you. No error checking and various arbitrary decisions, but it works. libbzip2 has fairly good web documentation. libtar, not so much, but there are manpages in the package, an example, and a documented header file. The code below can be built with g++ C++TarBz2.cpp -ltar -lbz2 -o C++TarBz2.exe: #include <sys/types.h>…
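As a point of comparison, and not part of the original C++ answer: the same tar-then-bzip2 job takes a few lines with Python's standard tarfile module, which applies bz2 compression when opened with mode "w:bz2".

```python
import tarfile

# Pack the directory tree rooted at "mydir" into mydir.tar.bz2.
with tarfile.open("mydir.tar.bz2", mode="w:bz2", compresslevel=9) as tar:
    tar.add("mydir")
```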

Utilizing multi core for tar+gzip/bzip compression/decompression

生来就可爱ヽ(ⅴ<●) submitted on 2019-11-28 02:32:34
I normally compress using tar zcvf and decompress using tar zxvf (using gzip out of habit). I've recently gotten a quad-core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression. Is there any way I can utilize the unused cores to make it faster? Mark Adler: You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz: tar cf - paths-to-archive | pigz > archive.tar.gz By default, pigz uses the number of available cores, or eight if it…
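If that pipeline needs to be driven from a script rather than an interactive shell, a sketch using Python's subprocess module (assuming pigz is on the PATH; pbzip2 or lbzip2 slot in the same way for bzip2 output):

```python
import subprocess

# Equivalent of: tar cf - paths-to-archive | pigz > archive.tar.gz
with open("archive.tar.gz", "wb") as out:
    tar = subprocess.Popen(["tar", "cf", "-", "paths-to-archive"],
                           stdout=subprocess.PIPE)
    subprocess.run(["pigz"], stdin=tar.stdout, stdout=out, check=True)
    tar.stdout.close()  # let tar see a closed pipe if pigz exited early
    tar.wait()
```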

Uncompress BZIP2 archive

耗尽温柔 submitted on 2019-11-27 21:16:08
I can uncompress zip, gzip, and rar files, but I also need to uncompress bzip2 files as well as unarchive them (.tar). I haven't come across a good library to use. I am using Java along with Maven, so ideally I'd like to include it as a dependency in the POM. What libraries do you recommend? The best option I can see is Apache Commons Compress with this Maven dependency: <dependency> <groupId>org.apache.commons</groupId> <artifactId>commons-compress</artifactId> <version>1.0</version> </dependency> From the examples: FileInputStream in = new FileInputStream("archive.tar.bz2");…
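Side note, not from the original answer: if Java is not a hard requirement elsewhere in the stack, Python's standard library reads the same format with no extra dependency.

```python
import tarfile

# Decompress and unpack archive.tar.bz2 in one pass.
with tarfile.open("archive.tar.bz2", mode="r:bz2") as tar:
    tar.extractall(path="out")  # trusted archives only; see the zip-bomb entry above
```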

Best splittable compression for Hadoop input = bz2?

浪尽此生 submitted on 2019-11-27 20:37:10
We've realized a bit too late that archiving our files in gzip format for Hadoop processing isn't such a great idea. Gzip isn't splittable, and for reference, here are the problems, which I won't repeat: "Very basic question about Hadoop and compressed input files"; "Hadoop gzip compressed files"; "Hadoop gzip input file using only one mapper"; "Why can't hadoop split up a large text file and then compress the splits using gzip?". My question is: is bzip2 the best archival compression that will allow a single archive file to be processed in parallel by Hadoop? Gzip is definitely not, and from my reading…

Missing Python bz2 module

北慕城南 submitted on 2019-11-27 12:04:38
I have installed Python in my home directory. [spatel@~ dev1]$ /home/spatel/python-2.7.3/bin/python -V Python 2.7.3 I am trying to run one script which requires Python 2.7.x, and I am getting a missing-bz2 error: [spatel@~ dev1]$ ./import_logs.py Traceback (most recent call last): File "./import_logs.py", line 13, in <module> import bz2 ImportError: No module named bz2 I have tried to install the bz2 module, but I got lots of errors: [spatel@dev1 python-bz2-1.1]$ /home/spatel/python-2.7.3/bin/python setup.py install ... ... ... bz2.c:1765: error: 'BZ_FINISH_OK' undeclared (first use in this function)…
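The usual cause is that the interpreter was compiled before the bzip2 development headers (bzip2-devel on Red Hat-style systems) were installed, so CPython's own build skipped the module. After installing the headers and rebuilding Python, a quick round-trip confirms support (a hedged check, not from the original thread):

```python
# Run under /home/spatel/python-2.7.3/bin/python after rebuilding.
import bz2

payload = b"hello, bz2" * 1000
packed = bz2.compress(payload)
assert bz2.decompress(packed) == payload
print("bz2 OK: %d -> %d bytes" % (len(payload), len(packed)))
```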
