Safe publication of array/collection/map contents written once

Submitted by 我的梦境 on 2020-01-16 08:08:05

Question


Main question:
What is the best way to perform safe publication of the contents of an array, collection, or map in Java?

Here is what I've tried and some side questions I have:


Side question #1

On thread 1, I'm writing to a HashMap:

Map<K, V> map = new HashMap<>();
map.put(key, value);

My current understanding is that:

  • Collections.unmodifiableMap() does not constitute safe publication of the references stored in the map.

  • By examining these particular implementations of Map.of(), the result of:

    • Map.of() could be documented as thread-safe.
    • Map.of(key, value) could be documented as thread-safe.
    • Map.of(key1, value1, key2, value2) and beyond could not be documented as thread-safe.

Am I wrong?


Side question #2

Now, I want to safely publish the contents of a HashMap which is written to before it is read from. My current understanding is that using ConcurrentHashMap would incur a performance penalty, so I would prefer to avoid it in a scenario like this one.

I came up with this alternative:

class MapSafePublication<K, V> {
    private final Map<K, V> map = new HashMap<>();

    private final ThreadLocal<Map<K, V>> safeMap = ThreadLocal.withInitial(() -> {
        synchronized (MapSafePublication.this) {
            return new HashMap<>(map);
        }
    });

    synchronized void write(K key, V value) {
        map.put(key, value);
    }

    V read(K key) {
        return safeMap.get().get(key);
    }
}

Is it correct? Are there better ways to do this?


Side question #3

On thread 1, I'm writing to a volatile array:

volatile Object[] array = new Object[size];
array[index] = value;

My current understanding is that reading array[index] from another thread isn't safe. To make it safe, one has to use AtomicReferenceArray.

Here is an excerpt from CopyOnWriteArrayList:

final transient Object lock = new Object();
private transient volatile Object[] array;

public E set(int index, E element) {
    synchronized (lock) {
        Object[] es = getArray();
        E oldValue = elementAt(es, index);

        if (oldValue != element) {
            es = es.clone();
            es[index] = element;
        }
        // Ensure volatile write semantics even when oldvalue == element
        setArray(es);
        return oldValue;
    }
}

public E get(int index) {
    return elementAt(getArray(), index);
}

@SuppressWarnings("unchecked")
static <E> E elementAt(Object[] a, int index) {
    return (E) a[index];
}

How can get() read from the array safely, if it doesn't acquire the same lock that was used by set()?

The same applies to the result of com.google.common.collect.ImmutableList.copyOf(), which as per my current understanding, does not constitute safe publication of the references stored in or returned by the parameter object. Am I wrong?


Clarification #1

From a memory visibility perspective, which of the following are safe?

class Test<K, V> {
    private final Test1<K, V> test1;
    private Test2<K, V> test2;

    Test(Test1<K, V> test1, Test2<K, V> test2) {
        this.test1 = test1;
        this.test2 = test2;
    }

    void test(K key) {
        System.out.println(test1.read(key));
        System.out.println(test2.read(key));
    }
}

class Test1<K, V> {
    private Map<K, V> map = new HashMap<>();

    void write(K key, V value) {
        map.put(key, value);
    }

    V read(K key) {
        return map.get(key);
    }
}

class Test2<K, V> {
    private final Map<K, V> map = new HashMap<>();

    void write(K key, V value) {
        map.put(key, value);
    }

    V read(K key) {
        return map.get(key);
    }
}

// Thread 1
Test1<K, V> test1 = new Test1<>();
test1.write(key, value);
Test2<K, V> test2 = new Test2<>();
test2.write(key, value);

// Thread 2
Test<K, V> test = new Test<>(test1, test2);
test.test(key);

Clarification #2

Consider the following example:

// Thread 1
Object obj = new Object();
volatile Object object = obj;

// Thread 2
System.out.println(object);

// Thread 1 or 3
object = obj;

// Thread 2
System.out.println(object);

So in the latter case, there is no happens-before relationship between the reader and writer threads?


Answer 1:


To answer your first part:

Collections.unmodifiableMap() does not constitute safe publication of the references stored in the map.

Correct.

By examining the implementation of Map.of() ...

The only place to examine is the Javadoc. If it does not describe any thread safety guarantees, there are no reliable guarantees. The implementations could be changed either to make things "thread safe", or to remove thread safety.


To answer your third part:

How can get() read from the array safely, if it doesn't acquire the same lock that was used by set()?

The semantics of volatile are such that a write to a volatile field happens before every subsequent read of that same field.

So, this simplified sequence of actions in set:

Object[] es = getArray() /* volatile read of array (not especially relevant) */;
es = es.clone();
es[index] = element;
setArray(es);  // Internally does `array = es;`, so a volatile write.

happens before this, in get.

return elementAt(getArray() /* volatile read of array */, index);

As such, the updated array element is visible to threads calling get.

The synchronized block isn't really relevant to that. It is there to ensure that two threads aren't updating the array at the same time; this creates its own happens-before (between multiple invocations of set), but that is separate from the happens-before between the volatile write and read.
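The volatile write/read pairing can be sketched in isolation. This is a minimal illustration, not JDK code: the class VolatileArrayHolder and its methods are hypothetical, and it assumes a single writer, which is why no lock appears.

```java
final class VolatileArrayHolder {
    // Volatile reference: a write to it happens-before any later read that observes it.
    private volatile Object[] array = new Object[0];

    // Assumes a single writer. The new array is filled while still private,
    // then published with one volatile write.
    void publish(Object... elements) {
        Object[] copy = elements.clone(); // plain writes into an unpublished array
        array = copy;                     // volatile write: publishes the contents
    }

    Object get(int index) {
        Object[] snapshot = array;        // volatile read
        return index < snapshot.length ? snapshot[index] : null;
    }

    // Demo: a reader thread spins until it observes the published element.
    static Object demo() {
        VolatileArrayHolder holder = new VolatileArrayHolder();
        Object[] seen = new Object[1];
        Thread reader = new Thread(() -> {
            Object value;
            while ((value = holder.get(0)) == null) {
                Thread.onSpinWait();      // Java 9+ busy-wait hint
            }
            seen[0] = value;
        });
        reader.start();
        holder.publish("hello");
        try {
            reader.join();                // join makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

The reader never takes a lock. It relies only on reading the volatile field after the writer has assigned it, which is the same reasoning that applies to get() in the excerpt.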




Answer 2:


There is no “best way to perform safe publication”, as the decision for a way of publication depends on the actual use case involving publication.

So it’s not correct to say that Collections.unmodifiableMap(…) is not a safe publication. This method isn’t a publication at all.

When a thread is potentially modifying data after an object’s publication, i.e. when another thread may already be processing the data, there is no safe publication at all.

A ConcurrentHashMap may solve this issue not because it makes the publication of the map safe, but because each modification is a safe publication on its own. This still only makes its use thread safe if the using threads can cope with the fact that no consistent overall state of the map exists when there are modifications while the map is processed, but only individual mappings are consistent. And, each key or value must not be modified after publication.
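As a rough sketch of "each modification is a safe publication on its own": the value is fully built before the put and never touched afterwards. The class name PerEntryPublication is hypothetical; the visibility guarantee relied on is the one stated in the ConcurrentHashMap Javadoc, that an update for a key happens-before any non-null retrieval for that key reporting the updated value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PerEntryPublication {
    private final ConcurrentMap<String, List<Integer>> map = new ConcurrentHashMap<>();

    void write(String key, int... numbers) {
        // Build the value completely before it becomes reachable by other threads.
        List<Integer> values = new ArrayList<>();
        for (int n : numbers) {
            values.add(n);
        }
        // The put publishes this one mapping; the wrapper prevents later modification.
        map.put(key, Collections.unmodifiableList(values));
    }

    List<Integer> read(String key) {
        // May return null if the mapping has not been written yet.
        return map.get(key);
    }
}
```

Readers get per-mapping consistency only. Two read calls for different keys may reflect different moments in the writer's progress.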

When you obey the rule that you must not modify the object(s) after publication, there are tons of possibilities of correct publication. Consider:

HashMap<String, List<Integer>> map = new HashMap<>();
List<Integer> l = new ArrayList<>();
l.add(42);
map.put("foo", l);

Thread t = new Thread() {
    @Override
    public void run() {
        System.out.println(map);
    }
};
t.start();
System.out.println(map);
t.join();

There is a happens-before relationship between the call t.start() and any action performed by the thread. This is sufficient for a safe publication of the HashMap and the contained ArrayList without any additional effort—as long as we obey the “no modification after publication” rule. The fact that both threads may read the map concurrently doesn’t matter.
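Another of those possibilities, not mentioned above and added here only as an illustration, is the final-field guarantee of JLS 17.5. The sketch assumes that the object is never modified after construction and that `this` does not escape from the constructor. The class name FrozenLookup is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FrozenLookup {
    // final: the freeze at the end of the constructor covers this field
    // and everything reachable from it at that point.
    private final Map<String, List<Integer>> entries;

    FrozenLookup(Map<String, List<Integer>> source) {
        Map<String, List<Integer>> copy = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : source.entrySet()) {
            // Defensive copy so later changes by the caller cannot leak in.
            copy.put(e.getKey(),
                    Collections.unmodifiableList(new ArrayList<>(e.getValue())));
        }
        this.entries = copy; // 'this' must not escape before the constructor returns
    }

    List<Integer> get(String key) {
        return entries.get(key); // read-only access; the map is never modified again
    }
}
```

The defensive copy matters: if the constructor kept the caller's map, a later put by the caller would again be a modification after publication.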

So when you have code like

volatile Object[] array = new Object[size];
array[index] = value;

you are right, it is not safe because it violates the “no modification after publication” rule.
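If individual slots really have to be written after the array is shared, the AtomicReferenceArray mentioned in the question gives each element volatile read/write semantics. A minimal sketch, with SlotTable as a hypothetical wrapper name:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

final class SlotTable<T> {
    // final field, so the array object itself is safely reachable once
    // a SlotTable reference is seen after construction.
    private final AtomicReferenceArray<T> slots;

    SlotTable(int size) {
        slots = new AtomicReferenceArray<>(size);
    }

    void set(int index, T value) {
        // Element write with volatile semantics: publishes 'value' as it is now.
        slots.set(index, value);
    }

    T get(int index) {
        // Element read with volatile semantics; null if the slot is still empty.
        return slots.get(index);
    }
}
```

This only covers the reference stored in the slot. The object it points to must still not be modified after the set call.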

But the CopyOnWriteArrayList is different, when you look at the posted code closely:

public E set(int index, E element) {
    synchronized (lock) {
        Object[] es = getArray();
        E oldValue = elementAt(es, index);

        if (oldValue != element) {
            es = es.clone(); // <- create a local copy
            es[index] = element; // <- modifies the copy that has not been published
        }
        // Ensure volatile write semantics even when oldvalue == element
        setArray(es); // <- publishes the copy
        return oldValue;
    }
}

So there’s no violation of the “no modification after publication” rule: modifications are always performed on local copies before they are published, and each volatile write of an entirely new array reference establishes a happens-before relationship with subsequent volatile reads of that reference. The synchronized block only exists to make multiple modifications consistent.

Note that the code you’ve posted differs in one respect from this OpenJDK implementation:

public E set(int index, E element) {
    synchronized (lock) {
        Object[] es = getArray();
        E oldValue = elementAt(es, index);

        if (oldValue != element) {
            es = es.clone();
            es[index] = element;
            setArray(es);
        }
        return oldValue;
    }
}

While both work without problems when being used correctly, the difference

        // Ensure volatile write semantics even when oldvalue == element
        setArray(es); // invoked even when oldvalue == element

shows a wrong mindset that runs straight into the discussed rule. When oldvalue == element, the element being set was already in the list, in other words, already published. So if the element was modified before this redundant set call, it would violate the “no modification after publication” rule, and performing another volatile write here wouldn’t fix that. On the other hand, if no modification was made, the volatile write would be unnecessary. So there is no reason to perform the volatile write when oldvalue == element.

This may help you evaluate your ThreadLocal approach. There is no point in creating a new copy for every thread. Since these copies are never modified, they can be read by an arbitrary number of threads safely. But for the same reason, no thread will ever notice changes made to the original map after its snapshot has been created. If changes happen rarely and you have a lot of readers, an approach similar to CopyOnWriteArrayList would work.
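A copy-on-write variant of the MapSafePublication class from the question could look roughly like this. It is a sketch under the stated assumptions (rare writes, many readers, keys and values not modified after being written); CopyOnWriteMapSketch is a hypothetical name, not an existing JDK class.

```java
import java.util.HashMap;
import java.util.Map;

final class CopyOnWriteMapSketch<K, V> {
    // Readers only ever see fully built maps through this volatile reference.
    // Assumes the holder object itself is handed to other threads safely.
    private volatile Map<K, V> snapshot = new HashMap<>();

    // synchronized keeps concurrent writers from losing each other's updates.
    synchronized void write(K key, V value) {
        Map<K, V> copy = new HashMap<>(snapshot); // private copy, not yet published
        copy.put(key, value);                     // modify only the copy
        snapshot = copy;                          // volatile write publishes it
    }

    V read(K key) {
        // Volatile read, no lock; the map behind it is never modified again.
        return snapshot.get(key);
    }
}
```

Each write copies the whole map, so the cost grows with the map size. That is the trade-off CopyOnWriteArrayList makes as well.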



Source: https://stackoverflow.com/questions/59646146/safe-publication-of-array-collection-map-contents-written-once
