Question

I've been searching for a justification of why you should not call a thread's start method inside a class's constructor. Consider the following code:

class SomeClass
{
    public ImportantData data = null;
    public Thread t = null;

    public SomeClass(ImportantData d)
    {
        t = new MyOperationThread();

        // t.start(); // Footnote 1

        data = d;

        t.start();    // Footnote 2
    }
}

ImportantData is some generic box of stuff (presumably important) and MyOperationThread is a subclass of Thread that knows how to handle SomeClass instances.

Footnotes:

1. I totally understand why this is unsafe. If the MyOperationThread tries to access SomeClass.data before the following statement finishes (and data is initialized) I'll get an exception that I was otherwise unprepared for. Or maybe I won't. You can't always tell with threads. In any case, I'm setting myself up for weird, unexpected behavior later.
2. I don't understand why doing it this way is forbidden territory. At this point, all of SomeClass' members have been initialized, no other member functions that change state have been called, and construction is thus effectively finished.

From what I understand, the reason it's considered bad practice to do this is that you can "leak a reference to an object that has not yet been fully constructed." But the object has been fully constructed, the constructor has nothing left to do but return. I have searched other questions looking for a more concrete answer to this question, and have looked into referenced material as well, but haven't found anything that says "you shouldn't because such and such undesirable behavior," only things that say "you shouldn't."

How would starting a thread in the constructor be conceptually different from this situation:

class SomeClass
{
    public ImportantData data = null;

    public SomeClass(ImportantData d)
    {
        // OtherClass.someExternalOperation(this); // Not a good idea

        data = d;

        OtherClass.someExternalOperation(this);    // Usually accepted as OK
    }
}

As another aside, what if the class was final?

final class SomeClass // like this
{
    ...

I saw plenty of questions asking about this and answers that you shouldn't, but none offered explanations, so I figured I'd try to add one that has a few more details.

Answer 1:

"But the object has been fully constructed, the constructor has nothing left to do but return"

Yes and no. The problem is that, according to the Java memory model, the compiler is able to reorder the constructor's operations, so the object's fields may actually finish being initialized after the constructor has returned. volatile or final fields are guaranteed to be initialized before the constructor finishes, but there is no guarantee that (for example) your ImportantData data field will be properly initialized by the time the constructor finishes.

However, as @meriton pointed out in the comments, there is a happens-before relationship between a thread and the thread that started it. In case #2 you are fine, because data has to be fully assigned before the thread is started. This is guaranteed by the Java memory model.
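
A minimal sketch of that guarantee, assuming a hypothetical Holder class (none of these names come from the question): the field is neither final nor volatile, yet the worker thread still sees the value, because the write precedes the start() call that begins the thread.

```java
// Hypothetical illustration class; not part of the original question.
class Holder {
    String data;                  // deliberately neither final nor volatile
    private String seenByWorker;  // written by the worker, read only after join()
    private final Thread worker;

    Holder(String d) {
        data = d;                                        // plain write before start()
        worker = new Thread(() -> seenByWorker = data);  // worker reads the field
        worker.start();                                  // last statement in the constructor
    }

    // Waits for the worker to finish, then reports what it observed.
    // join() makes the worker's write to seenByWorker visible to the caller.
    String seenByWorker() {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenByWorker;
    }
}
```

Calling new Holder("payload").seenByWorker() returns "payload". The guarantee only covers writes made before start(); anything assigned afterwards is not covered by it.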

That said, it is considered bad practice to "leak" a reference to an object from its constructor to another thread: if any lines were later added to the constructor after t.start(), whether the thread sees the object fully constructed or not would become a race condition.

Here's some more reading:

Here's a good question to read: calling thread.start() within its own constructor
Doug Lea's memory model page talks about instruction reordering and constructors.
Here's a great piece about safe constructor practices which talks about this more.
This is also the reason why the "double-checked locking" idiom has problems.
My answer to this question is relevant: Is this a safe publication of object?

Answer 2:

Rational, Fact-Oriented Argument Against This Practice

Consider the following situation. You have a class which runs a scheduler thread that queues up tasks to a database, coded in a similar way to the following:

class DBEventManager
{
    private Thread t;
    protected Database db; // protected so that a subclass can reassign it
    private LinkedBlockingQueue<MyEvent> eventqueue;

    public DBEventManager()
    {
        this("127.0.0.1:31337");
    }

    public DBEventManager(String hostname)
    {
        db = new OracleDatabase(hostname);
        t = new DBJanitor(this);

        eventqueue = new LinkedBlockingQueue<MyEvent>();
        eventqueue.add(new MyEvent("Hello Database!")); // add() avoids put()'s checked InterruptedException

        t.start();
    }

    // getters for db and eventqueue
}

Database is some sort of database abstraction interface, MyEvents are generated by anything that needs to signal a change to the database, and DBJanitor is a subclass of Thread that knows how to apply MyEvents to a Database. As we can see, this implementation uses the made up OracleDatabase class as the Database implementation.

That's all well and good, but now your project requirements have changed. Your new plugin must be able to use the existing codebase but must also be able to connect to a Microsoft Access database. You decide to solve this with a subclass:

class AccessDBEventManager extends DBEventManager
{
    public AccessDBEventManager(String filename)
    {
        super();
        db = new MSAccessDatabase(filename);
    }
}

Lo and behold, however, our decision to start the thread in the constructor is now coming back to haunt us. Running on the client's crappy 700MHz single-core Pentium II, this code has a race condition: once every few startups, creating a database manager will send the "Hello Database!" event to the wrong database.

This happens because the thread gets started at the end of the superclass constructor, but that is not the end of construction. The object is still being initialized by the subclass constructor, which overwrites some of the superclass's members. So when the thread jumps right in and starts dispatching events to the database, it occasionally gets in before the subclass constructor has updated the database reference to point at the correct database.
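
The interleaving described above can be forced deterministically. The following is only a sketch under stated assumptions: BaseManager, AccessManager and the string-valued db field are hypothetical stand-ins for the classes above, and the CountDownLatch exists purely to make the unlucky schedule happen every time rather than once every few startups.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for DBEventManager: starts its worker in the constructor.
class BaseManager {
    protected String db;                 // plain field that the subclass reassigns
    private String dbSeenByWorker;       // written by the worker, read only after join()
    protected final CountDownLatch workerRan = new CountDownLatch(1);
    private final Thread worker;

    BaseManager() {
        db = "oracle";
        worker = new Thread(() -> {
            dbSeenByWorker = db;         // the worker uses whatever db is right now
            workerRan.countDown();
        });
        worker.start();                  // 'this' escapes before any subclass constructor body runs
    }

    // Waits for the worker to finish, then reports which db it used.
    String dbSeenByWorker() {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return dbSeenByWorker;
    }
}

// Hypothetical stand-in for AccessDBEventManager.
class AccessManager extends BaseManager {
    AccessManager() {
        super();                         // the worker is already running at this point
        try {
            workerRan.await();           // test-only: force the worker to win the race
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        db = "access";                   // too late, the worker has already read db
    }
}
```

With this forced schedule, new AccessManager().dbSeenByWorker() returns "oracle" even though the finished object's db field is "access". Without the latch the outcome depends on thread scheduling, which is exactly the intermittent failure described above.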

There are at least 2 solutions to this:

The first solution: you can make your class final, which will prevent subclassing it. If you do this, you can ensure with certainty that your object is fully constructed before exposing it to any other objects (even if it hasn't left the constructor yet), thus ensuring with certainty that odd behavior like this won't happen.

You must also take steps to prevent reordering of assignments in your constructor: you can either declare the fields the thread will access as volatile, or you can access them only inside synchronized blocks. Each of these two options places additional restrictions on what reordering the JIT compiler can do, which ensures that the fields will have been properly assigned when the thread accesses them.

In this situation, you would probably argue with your boss until he let you make a change to the codebase, which would involve changing the constructors of DBEventManager to look something like this:
```
private Thread t; // no getter, doesn't need to be volatile
private volatile Database db;
private volatile LinkedBlockingQueue<MyEvent> eventqueue;

public DBEventManager()
{
    this("127.0.0.1:31337");
}

public DBEventManager(String hostname)
{
    this(new OracleDatabase(hostname));
}

public DBEventManager(Database newdb)
{
    db = newdb;
    t = new DBJanitor(this);

    eventqueue = new LinkedBlockingQueue<MyEvent>();
    eventqueue.add(new MyEvent("Hello Database!")); // add() avoids put()'s checked InterruptedException

    t.start();
}
```
If you'd foreseen this problem earlier in development, you might have added the extra constructor back then. You can then safely create your Microsoft Access backed DBEventManager with new DBEventManager(new MSAccessDatabase("somefile.db"));

The second solution: you can just not do it, and fall back on the otherwise generally accepted approach of a separate start method, optionally with one or more static factory methods that call a constructor and then the start method, like so:

    public void start()
    {
        t.start();
    }

    public static DBEventManager getInstance(String hostname)
    {
        DBEventManager dbem = new DBEventManager(hostname);
        dbem.start();
        return dbem; // return the instance, not the class name
    }
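
For reference, here is a self-contained sketch of the same factory-method idea. It makes several assumptions: EventManager, its String events and the awaitFirstDelivery helper are hypothetical simplifications of the DBEventManager above, chosen so that the example compiles on its own.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical simplification of DBEventManager: the constructor only builds
// state, and the static factory starts the thread once construction is done.
final class EventManager {
    private final String target;  // stands in for the Database
    private final LinkedBlockingQueue<String> events = new LinkedBlockingQueue<>();
    private final CountDownLatch firstDelivered = new CountDownLatch(1);
    private volatile String firstDelivery;
    private final Thread worker;

    private EventManager(String target) {
        this.target = target;
        events.add("Hello Database!");
        worker = new Thread(this::drain);  // created here, but NOT started
        worker.setDaemon(true);            // do not keep the JVM alive in this sketch
    }

    // The only way to obtain an instance: construct fully, then start.
    static EventManager create(String target) {
        EventManager manager = new EventManager(target);
        manager.worker.start();
        return manager;
    }

    // Worker loop: take events and "apply" them to the target.
    private void drain() {
        try {
            while (true) {
                String event = events.take();
                if (firstDelivery == null) {
                    firstDelivery = target + " <- " + event;
                }
                firstDelivered.countDown();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Blocks until the first event has been applied, then describes it.
    String awaitFirstDelivery() {
        try {
            if (!firstDelivered.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not deliver in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return firstDelivery;
    }
}
```

EventManager.create("somefile.db").awaitFirstDelivery() returns "somefile.db <- Hello Database!". Because the constructor is private and the class is final, no caller or subclass can end up with an instance whose thread started before construction finished.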

I'm fairly sure I'm sane, but a second opinion would be nice.

Source: https://stackoverflow.com/questions/11834173/why-shouldnt-i-use-thread-start-in-the-constructor-of-my-class

Tags

java