Question
I have a fairly large BZ2 file with several text files in it. Is it possible to use Java to decompress certain files inside the BZ2 file and parse the data on the fly? Let's say a 300 MB BZ2 file contains 1 GB of text. Ideally, I'd like my Java program to read, say, 1 MB of the BZ2 file, decompress it on the fly, act on it, and keep reading the BZ2 file for more data. Is that possible?
Thanks
Answer 1:
The Apache Commons Compress library is pretty good. Here's its examples page: http://commons.apache.org/proper/commons-compress/examples.html
Here's the Maven snippet (1.10 was the latest version at the time of writing):
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-compress</artifactId>
    <version>1.10</version>
</dependency>
And here's my util method:
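import java.io.*;
import org.apache.commons.compress.compressors.*;

public static BufferedReader getBufferedReaderForCompressedFile(String fileIn) throws FileNotFoundException, CompressorException {
    FileInputStream fin = new FileInputStream(fileIn);
    // The factory needs mark/reset support to detect the compression format, hence the buffered wrapper.
    BufferedInputStream bis = new BufferedInputStream(fin);
    CompressorInputStream input = new CompressorStreamFactory().createCompressorInputStream(bis);
    // Data is decompressed lazily as the reader is consumed; the caller is responsible for closing it.
    BufferedReader br2 = new BufferedReader(new InputStreamReader(input));
    return br2;
}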
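To act on the data while it is still being decompressed, the reader returned by the util method can be consumed one line at a time. The following is a minimal sketch of that loop, not part of Commons Compress: the class name StreamingLines and the method forEachLine are hypothetical names chosen for illustration, and the code assumes the text is line-oriented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.function.Consumer;

// Hypothetical helper: the names here are illustrative, not from any library.
final class StreamingLines {
    /**
     * Reads the reader line by line, passing each line to the handler as soon
     * as it is available, and returns the number of lines processed.
     * Only one line is held in memory at a time. The reader is closed on exit.
     */
    static long forEachLine(BufferedReader reader, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader r = reader) {
            String line;
            while ((line = r.readLine()) != null) {
                handler.accept(line); // act on the line, then move on
                count++;
            }
        }
        return count;
    }
}
```

A call might look like StreamingLines.forEachLine(getBufferedReaderForCompressedFile("data.bz2"), line -> process(line)), where data.bz2 and process are placeholders. Because the underlying stream decompresses only as bytes are requested, the 1 GB of text is never fully materialized.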
Answer 2:
The Ant project contains a bzip2 library, which has an org.apache.tools.bzip2.CBZip2InputStream class. You can use this class to decompress the bzip2 file on the fly; it simply extends the standard Java InputStream class.
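One detail worth checking against your Ant version: as far as I recall from the Javadoc, the CBZip2InputStream constructor expects the stream to be positioned just after the two-byte "BZ" signature, so the caller has to consume those bytes first. A minimal sketch of that step follows; Bzip2Magic and skipMagic are hypothetical names, and the helper itself uses only the JDK.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: consumes the "BZ" signature before handing the stream to Ant.
final class Bzip2Magic {
    /**
     * Reads and verifies the two-byte "BZ" signature, then returns the same
     * stream, now positioned on the byte that follows the signature.
     */
    static InputStream skipMagic(InputStream in) throws IOException {
        int b = in.read();
        int z = in.read();
        if (b != 'B' || z != 'Z') {
            throw new IOException("Not a bzip2 stream: missing 'BZ' signature");
        }
        return in;
    }
}
```

Under that assumption, the decompressing stream would be built as new CBZip2InputStream(Bzip2Magic.skipMagic(new BufferedInputStream(new FileInputStream(path)))), where path is a placeholder, and then wrapped in an InputStreamReader and BufferedReader to read text as it is decompressed.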
Source: https://stackoverflow.com/questions/4834721/java-read-bz2-file-and-uncompress-parse-on-the-fly