Good design: How to pass InputStreams as argument?

Question

I've got a big file on which I'm opening a FileInputStream. This file contains some files each having an offset from the beginning and a size. Furthermore, I've got a parser that should evaluate such a contained file.

File file = ...; // the big file
long offset = 1734; // a contained file's offset
long size = 256; // a contained file's size
FileInputStream fis = new FileInputStream(file);
fis.skip(offset);
parse(fis, size);

public void parse(InputStream is, long size) {
   // parse stream data and ensure we don't read more than size bytes
   is.close();
}

I feel like this is not good practice. Is there a better way to do this, maybe using buffering?

Furthermore, I feel like the skip() method slows the reading process a lot.

Answer 1:

It sounds like what you really want is a sort of "partial" input stream - one a bit like the ZipInputStream, where you've got a stream within a stream.

You could write this yourself, proxying all InputStream methods to the original input stream, making suitable adjustments for the offset, and checking for reads past the end of the subfile.
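
A minimal sketch of such a "partial" stream, assuming the caller has already positioned the underlying stream at the subfile's offset. The class name BoundedInputStream is hypothetical (it is not a JDK class), and making close() a no-op is a design assumption so that the code that opened the big file stays responsible for closing it.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the JDK: exposes at most `size` bytes
// of an already-positioned stream as if they were a stream of their own.
class BoundedInputStream extends FilterInputStream {
    private long remaining; // bytes of the subfile not yet consumed

    BoundedInputStream(InputStream in, long size) {
        super(in);
        this.remaining = size;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // end of the subfile, even if the big file continues
        }
        int b = super.read();
        if (b >= 0) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        // Never ask the underlying stream for more than the subfile has left.
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        if (n <= 0) {
            return 0;
        }
        long skipped = super.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override
    public int available() throws IOException {
        return (int) Math.min(super.available(), remaining);
    }

    // mark/reset would need extra bookkeeping for `remaining`, so this sketch opts out.
    @Override
    public boolean markSupported() {
        return false;
    }

    @Override
    public void mark(int readlimit) {
        // not supported
    }

    @Override
    public void reset() throws IOException {
        throw new IOException("mark/reset not supported");
    }

    @Override
    public void close() {
        // Assumption: intentionally a no-op; the owner of the big file closes it.
    }
}
```

With something like this, the parser could take a plain InputStream and read until end of stream, without needing a separate size argument.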

Is that the sort of thing you're talking about?

Answer 2:

First, FileInputStream.skip() has a quirk (arguably a bug): it can skip past the end of the underlying file without reporting an error, so be wary of it.

I've personally found working with Input/OutputStreams to be a pain compared to using FileReader and FileWriter, and your code shows the main issue I have with them: the need to close the streams after use. You can never be sure you've released all the resources properly unless you make the code a bit overcautious, like this:

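import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public void parse(File in, long size) throws IOException {
    // Declared outside the try block so the finally block can see it.
    FileInputStream fis = new FileInputStream(in);
    try {
        // do file content handling here
    } finally {
        fis.close(); // runs even if the handling code throws
    }
    // do parsing here
}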

This is of course bad in that it creates new stream objects all the time, which may end up eating a lot of resources. The upside is that the stream gets closed even if the file handling code throws an exception.

Answer 3:

This sounds like a typical nested file aka "zip" file problem.

A common way to handle this is to have a separate InputStream instance for each nested logical stream. These perform the necessary operations on the underlying physical stream, and buffering can happen on the underlying stream, the logical stream, or both, depending on which suits best. This means the logical stream encapsulates all the information about its placement in the underlying stream.

You could, for instance, have a kind of factory method with a signature like this:

List<InputStream> getStreams(File inputFile)

You could do the same with OutputStreams.

There are some details to this, but this may be enough for you?

Answer 4:

In general, the code that opens the file should close the file -- the parse() function should not close the input stream, since it is of the utmost arrogance for it to assume that the rest of the program won't want to continue reading other files contained in the big one.

You should decide whether the interface to parse() should be just stream and length (with the function able to assume that the file is correctly positioned) or whether the interface should include the offset (so the function first positions and then reads). Both designs are feasible. I'd be inclined to let the parse() do the positioning, but it is not a clear-cut decision.
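
As a rough sketch of the first design (the caller positions the stream, parse() only reads), something like the following could work. The names ContainedFileReader, skipFully and readEntry are hypothetical, and the size is an int here only so the result fits in a byte array. skipFully loops because skip() is allowed to skip fewer bytes than requested.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical holder class for the sketch.
class ContainedFileReader {

    // Skips exactly n bytes or throws; a single skip() call may skip fewer.
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped > 0) {
                n -= skipped;
            } else if (in.read() < 0) {
                throw new EOFException("offset is past the end of the stream");
            } else {
                n--; // skip() made no progress, but one byte was read
            }
        }
    }

    // Reads exactly `size` bytes from an already-positioned stream.
    // It does not close the stream: the code that opened it does that.
    static byte[] parse(InputStream in, int size) throws IOException {
        byte[] data = new byte[size];
        new DataInputStream(in).readFully(data); // throws EOFException if short
        return data; // a real parser would interpret the bytes here
    }

    // The opener owns the stream: it positions, delegates, and closes.
    static byte[] readEntry(File file, long offset, int size) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            skipFully(in, offset);
            return parse(in, size);
        }
    }
}
```

For the second design, parse() would take the offset as well and call skipFully itself; either way the try-with-resources block in the opener is what closes the file.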

Answer 5:

You could use a wrapper class on a RandomAccessFile - try this

You could also try wrapping that in a BufferedInputStream and see if the performance improves.
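
One possible shape for such a wrapper, as a sketch only: the class name RandomAccessSliceInputStream is hypothetical, and the code assumes single-threaded use. It seeks before every read, so several slices can share one RandomAccessFile, and skip() just advances a counter instead of touching the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

// Hypothetical wrapper: presents bytes [offset, offset + size) of a
// RandomAccessFile as an InputStream. Not thread-safe.
class RandomAccessSliceInputStream extends InputStream {
    private final RandomAccessFile raf;
    private final long end;  // absolute position one past the last byte
    private long position;   // absolute position of the next byte to read

    RandomAccessSliceInputStream(RandomAccessFile raf, long offset, long size) {
        this.raf = raf;
        this.position = offset;
        this.end = offset + size;
    }

    @Override
    public int read() throws IOException {
        if (position >= end) {
            return -1;
        }
        raf.seek(position); // other slices may have moved the file pointer
        int b = raf.read();
        if (b >= 0) {
            position++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (position >= end) {
            return -1;
        }
        raf.seek(position);
        int n = raf.read(buf, off, (int) Math.min(len, end - position));
        if (n > 0) {
            position += n;
        }
        return n;
    }

    @Override
    public long skip(long n) {
        if (n <= 0) {
            return 0;
        }
        long k = Math.min(n, end - position);
        position += k; // no I/O needed; the next read seeks to the new position
        return k;
    }

    // close() is inherited and does nothing, so the caller that opened
    // the RandomAccessFile remains responsible for closing it.
}
```

Wrapping an instance in a BufferedInputStream, for example new BufferedInputStream(new RandomAccessSliceInputStream(raf, offset, size)), should cut down the number of seek and read calls when the parser reads byte by byte.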

Source: https://stackoverflow.com/questions/366897/good-design-how-to-pass-inputstreams-as-argument

Tags

java

inputstream