Do I need a concurrent collection for adding elements to a list by many threads?

眉间皱痕 提交于 2019-12-21 17:48:26

问题


static final Collection<String> FILES = new ArrayList<String>(1);    

for (final String s : list) {
    new Thread(new Runnable() {
        public void run() {
            List<String> file2List = getFileAsList(s);
            FILES.addAll(file2List);
        }
    }).start();
}

This collections gets very big, but the code works perfect. I thought I will get a concurrent modifcation exception, because the FILES list has to extend its size, but it has never happened.

is this code 100% threadsafe ?

The code takes a 12 seconds to load up and a few threads are adding elements at the same time.

I tried to first create thread and later run them, but I got same results (both time and correctness)


回答1:


No, the code is not thread-safe. It may or may not throw a ConcurrentModificationException, but you may end up with elements missing or elements being added twice. Changing the list to be a

Collection<String> FILES = Collections.synchronizedList(new ArrayList<String>());

might already be a solution, assuming that the most time-consuming part is the getFilesAsList method (and not adding the resulting elements to the FILES list).

BTW, an aside: When getFileAsList is accessing the hard-drive, you should perform detailed performance tests. Multi-threaded hard-drive accesses may be slower than a single-threaded one, because the hard drive head might have to jump around the drive and not be able to read data as a contiguous block.


EDIT: In response to the comment: This program will "very likely" produce ArrayIndexOutOfBoundsExceptions from time to time:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class ConcurrentListTest
{
    public static void main(String[] args) throws InterruptedException
    {
        for (int i=0; i<1000; i++)
        {
            runTest();
        }
    }

    private static void runTest() throws InterruptedException
    {
        final Collection<String> FILES = new ArrayList<String>(1);
        // With this, it will always work:
        // final Collection<String> FILES = Collections.synchronizedList(new ArrayList<String>(1));

        List<String> list = Arrays.asList("A", "B", "C", "D");
        List<Thread> threads = new ArrayList<Thread>();
        for (final String s : list)
        {
            Thread thread = new Thread(new Runnable()
            {
                @Override
                public void run()
                {
                    List<String> file2List = getFileAsList(s);
                    FILES.addAll(file2List);
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads)
        {
            thread.join();
        }
        System.out.println(FILES.size());
    }

    private static List<String> getFileAsList(String s)
    {
        List<String> list = Collections.nCopies(10000, s);
        return list;
    }

}

But of course, there is no strict guarantee that it will. If it does not create such an exception for you, you should consider playing the lottery, because you must be remarkably lucky.




回答2:


Yes, you need a concurrent list to prevent a ConcurrentModificationException.

Here are some ways to initialize a concurrent list in Java:

Collections.newSetFromMap(new ConcurrentHashMap<>());
Collections.synchronizedList(new ArrayList<Object>());
new CopyOnWriteArrayList<>();



回答3:


is this code 100% threadsafe ?

This code is 0% threadsafe, even by the weakest standard of interleaved operation. You are mutating shared state under a data race.

You most definitely need some kind of concurrent control; it is not obvious whether a concurrent collection is the right choice, though. A simple synchronizedList might fit the bill even better because you have a lot of processing and then a quick transfer to the accumulator list. The lock will not be contended much.




回答4:


It is not thread safety at all even if you only add elements. In case that you only increase your FILES there is also some multi access problem if your collection is not thread safe.

When collection exceeds their size it has to be copied to new space and in that moment you can have problems with concurrent access, because in the moment that one thread will do the copy stuff... another can be trying add at the same time element to that collection, resizing is done by internal arraylist implementation but it is not thread-safe at all.

Check that code and lets assume that more than one thread execute it when collection capacity is full.

 private int size; //it is not thread safe value!
 public boolean add(E e) {
             ensureCapacityInternal(size + 1);  // Increments modCount!!
             elementData[size++] = e;   //size is not volatile value it might be cached by thread
             return true;
         }

  public void add(int index, E element) {
        rangeCheckForAdd(index);

        ensureCapacityInternal(size + 1);  // Increments modCount!!
       System.arraycopy(elementData, index, elementData, index + 1,
                        size - index);
        elementData[index] = element;
        size++;
    }


public void ensureCapacity(int minCapacity) {
        if (minCapacity > 0)
            ensureCapacityInternal(minCapacity);
    }

    private void ensureCapacityInternal(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
     }

In the moment that collection need to exceed its capacity and copy existed element to new internal array you can have real problems with multi-access, also size is not thread because it is not volatile and it can be cached by some thread and you can have overrides :) and this is answer why it might be not thread safe even if you use only use add operation on non synchronized collection.

You should consider using FILES=new CopyOnWriteArrayList();, orFILES= Collections.synchronizedList(new ArrayList()); where add operation is thread-safe.



来源:https://stackoverflow.com/questions/22528306/do-i-need-a-concurrent-collection-for-adding-elements-to-a-list-by-many-threads

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!