Question:
I have a scenario where I receive a large amount of data as an input stream. The stream contains a delimiter; I want to split on it and process each piece, and I want to do this entirely in memory if possible. Right now I am achieving this with a Scanner, as shown in the code below:
package chap5_questions;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ScannerTest {
    public static void main(String[] args) {
        FileInputStream fin = null;
        try {
            fin = new FileInputStream(new File("E:\\Project\\Journalling\\docs\\readFile.txt"));
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        Scanner scanner = new Scanner(fin, "UTF-8").useDelimiter("--AABBCCDDEEFFGGHHIIaabbccdd");
        while (scanner.hasNext()) {
            String theString = scanner.next();
            System.out.println(theString);
            functionToProcessStreams(theString); // This will actually do the processing.
        }
        scanner.close();
    }
}
However, I am not sure this is the most efficient way to do it. Another idea that comes to mind is to call read(b, off, len) on the input stream and then process each byte array; but for that I would need to know the indices of the delimiters, which might again mean reading the entire stream.
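(For what it's worth, that manual approach would look roughly like the sketch below. The class and method names, the sample data, and the `--SEP--` delimiter are all made up for illustration; a character `Reader` is used on top of the stream instead of raw `read(b, off, len)` so that multi-byte UTF-8 characters are not split across chunk boundaries.)

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class ManualSplitSketch {

    // Reads the stream chunk by chunk, buffering only the bytes since the
    // last delimiter, and emits each delimiter-separated segment as soon as
    // it is complete -- a single pass over the input.
    static void split(InputStream in, String delimiter) throws IOException {
        InputStreamReader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder pending = new StringBuilder();
        char[] buf = new char[8192];
        int n;
        while ((n = reader.read(buf, 0, buf.length)) != -1) {
            pending.append(buf, 0, n);
            int idx;
            // Emit every complete segment currently in the buffer. A delimiter
            // split across two chunks is handled naturally: it simply is not
            // found until the next chunk arrives.
            while ((idx = pending.indexOf(delimiter)) != -1) {
                process(pending.substring(0, idx));
                pending.delete(0, idx + delimiter.length());
            }
        }
        if (pending.length() > 0) {
            process(pending.toString()); // trailing segment after the last delimiter
        }
    }

    static void process(String segment) {
        System.out.println(segment);
    }

    public static void main(String[] args) throws IOException {
        String data = "one--SEP--two--SEP--three"; // sample data, not from the question
        split(new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8)), "--SEP--");
    }
}
```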
Please suggest a better way to do this, if there is one.
Answer 1:
Using Scanner with useDelimiter() is efficient: it compiles the delimiter into a regular expression and reads your input only once.
On a side note: even if it cost a bit of efficiency, it is always a good idea to write legible code. That lets you adapt your code faster and make fewer mistakes. Premature optimization is the root of all evil.
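To make the single-pass behavior concrete, here is a minimal, self-contained sketch of the Scanner approach against an in-memory stream (the sample data and `--SEP--` delimiter are invented for the demo; in your code you would keep your file stream and your real delimiter string):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class DelimiterDemo {
    public static void main(String[] args) {
        // Simulated input: three records separated by a delimiter
        String data = "record1--SEP--record2--SEP--record3";
        InputStream in = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));

        // Scanner streams through the input once, yielding each record
        // as it is encountered; the whole input is never held at once.
        try (Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("--SEP--")) {
            while (scanner.hasNext()) {
                System.out.println(scanner.next()); // process each record here
            }
        }
    }
}
```

Note that useDelimiter() treats its argument as a regular expression, so a delimiter containing regex metacharacters would need to be escaped (e.g. with Pattern.quote()); the delimiter in the question contains none, so it is safe as-is.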
Source: https://stackoverflow.com/questions/32047997/reading-input-stream-and-splitting-based-on-a-delimiter