Question
I'm new to concurrent programming in Java.
I need to read, analyze, and process an extremely fast-growing log file, so speed matters. My idea was to read the file line by line and, whenever a line is relevant, pass it to a separate thread that does further processing on it. I called these threads "IOThread" in the example code below.
My problem is that BufferedReader.readLine() in IOThread.run() apparently never returns. What is a working way to read the stream inside the thread? Are there better approaches than the one below?
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class IOThread extends Thread {
    private InputStream is;
    private int t;

    public IOThread(InputStream is, int t) {
        this.is = is;
        this.t = t;
        System.out.println("iothread<" + t + ">.init");
    }

    public void run() {
        try {
            System.out.println("iothread<" + t + ">.run");
            String line;
            BufferedReader streamReader = new BufferedReader(new InputStreamReader(is));
            while ((line = streamReader.readLine()) != null) {
                System.out.println("iothread<" + t + "> got line " + line);
            }
            System.out.println("iothread " + t + " end run");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

public class Stm {
    public Stm(String filePath) {
        System.out.println("start");
        try {
            BufferedReader reader = new BufferedReader(new FileReader(filePath));
            PipedOutputStream po1 = new PipedOutputStream();
            PipedOutputStream po2 = new PipedOutputStream();
            PipedInputStream pi1 = new PipedInputStream(po1);
            PipedInputStream pi2 = new PipedInputStream(po2);
            IOThread it1 = new IOThread(pi1, 1);
            IOThread it2 = new IOThread(pi2, 2);
            it1.start();
            it2.start();
            // it1.join();
            // it2.join();
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println("got line " + line);
                if (line.contains("aaa")) {
                    System.out.println("passing to thread 1: " + line);
                    po1.write(line.getBytes());
                } else if (line.contains("bbb")) {
                    System.out.println("passing to thread 2: " + line);
                    po2.write(line.getBytes());
                }
            }
            reader.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new Stm(args[0]);
    }
}
An example input file would be:
line 1
line 2
line 3 aaa ...
line 4
line 5 bbb ...
line 6 aaa ...
line 7
line 8 bbb ...
line 9 bbb ...
line 10
Call the above code with the filename of the input file as argument.
Answer 1:
IMHO you have it backwards. Create multiple threads for processing the data, not for reading it from the file. Reading from a file is bottlenecked by I/O anyway, so extra reader threads won't make any difference. The simplest solution is to read lines as fast as you can in a single thread and store them in a shared queue. Any number of worker threads can then take lines from that queue and do the relevant processing.
This way the processing runs concurrently while the reader thread is busy reading or waiting for data. If possible, keep the logic in the reader thread to a minimum: just read the lines and let the worker threads do the heavy lifting (pattern matching, further processing, etc.). Use a thread-safe queue and you should be fine.
EDIT: Use some implementation of BlockingQueue, either array-based (ArrayBlockingQueue) or linked-list-based (LinkedBlockingQueue).
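The shared-queue design described above could be sketched roughly as follows. This is only an illustration under some assumptions: the class name QueueDemo, the queue capacity of 1024, the worker count, and the "poison pill" sentinel used to stop the workers are all made up for the example, and the "processing" is reduced to collecting lines that contain "aaa" or "bbb".

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueDemo {
    // Hypothetical sentinel ("poison pill"). It is compared by identity,
    // so no line read from a file can ever be mistaken for it.
    private static final String POISON = new String("__EOF__");

    // Feeds lines from the calling thread into a bounded queue, lets `workers`
    // threads consume them, and returns the relevant lines (in no particular order).
    public static List<String> process(Iterator<String> lines, int workers) {
        // Bounded queue: put() blocks the reader if the workers fall behind.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        List<String> matched = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String line = queue.take(); // blocks until a line is available
                        if (line == POISON) {
                            return;                 // sentinel received: stop this worker
                        }
                        if (line.contains("aaa") || line.contains("bbb")) {
                            matched.add(line);      // stand-in for the real processing
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }
        try {
            while (lines.hasNext()) {
                queue.put(lines.next());
            }
            for (int i = 0; i < workers; i++) {
                queue.put(POISON);                  // one pill per worker
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while feeding workers", e);
        }
        return matched;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java QueueDemo <logfile>");
            return;
        }
        try (BufferedReader reader = new BufferedReader(new FileReader(args[0]))) {
            for (String line : process(reader.lines().iterator(), 4)) {
                System.out.println("processed: " + line);
            }
        }
    }
}
```

Note that with several workers on one queue, the order in which lines are processed is not guaranteed. If order matters per category, one queue per consumer is probably the better fit.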
Answer 2:
The reader in your IOThread stays stuck at the head of the first iteration of its while loop for the following reason: you pass the content of each line from the Stm thread, but you do not append a newline character (\n). BufferedReader.readLine() strips the line terminator from the string it returns, and it blocks until it sees a line terminator or the end of the stream, so the reader waits forever. You could modify your code like this:
if (line.contains("aaa")) {
    System.out.println("passing to thread 1: " + line);
    byte[] payload = (line + "\n").getBytes();
    po1.write(payload);
} else if (line.contains("bbb")) {
    System.out.println("passing to thread 2: " + line);
    byte[] payload = (line + "\n").getBytes();
    po2.write(payload);
}
That said, this is not an elegant solution. You could instead use a blocking queue or something similar to supply your IOThreads with content. That way you avoid converting your input from strings to bytes and back to strings, and you get rid of all the streams as well.
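As a rough sketch of that suggestion, the piped streams could be replaced by one queue per consumer thread, which keeps the original routing ("aaa" lines to thread 1, "bbb" lines to thread 2). The names RoutedQueues, LineConsumer, inbox, and the sentinel used to signal the end of input are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class RoutedQueues {
    // Hypothetical end-of-input marker, compared by identity.
    private static final String POISON = new String("__EOF__");

    // Stand-in for IOThread: takes lines from its own queue until the marker arrives.
    static class LineConsumer extends Thread {
        final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        final List<String> received = new ArrayList<>(); // read by others only after join()

        @Override
        public void run() {
            try {
                String line;
                while ((line = inbox.take()) != POISON) {
                    received.add(line); // stand-in for the real processing
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Routes each relevant line to the matching consumer and returns what each one received.
    public static List<List<String>> route(Iterable<String> lines) {
        LineConsumer c1 = new LineConsumer();
        LineConsumer c2 = new LineConsumer();
        c1.start();
        c2.start();
        for (String line : lines) {
            if (line.contains("aaa")) {
                c1.inbox.add(line); // Strings are passed as-is: no bytes, no newline handling
            } else if (line.contains("bbb")) {
                c2.inbox.add(line);
            }
        }
        c1.inbox.add(POISON); // tell both consumers that no more input will come
        c2.inbox.add(POISON);
        try {
            c1.join();
            c2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for consumers", e);
        }
        return List.of(c1.received, c2.received);
    }

    public static void main(String[] args) {
        List<List<String>> out = route(List.of(
                "line 1", "line 3 aaa ...", "line 5 bbb ...", "line 6 aaa ..."));
        System.out.println("consumer 1: " + out.get(0));
        System.out.println("consumer 2: " + out.get(1));
    }
}
```

Because each consumer has its own queue and a single producer, lines arrive in the order they were read. The unbounded LinkedBlockingQueue is chosen here only for brevity; for a fast-growing log a bounded queue with put() may be safer so the reader cannot outrun the consumers indefinitely.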
Source: https://stackoverflow.com/questions/12818341/bufferedreader-readline-when-reading-in-a-thread