Concurrent sorting in Java

Posted by 萝らか妹 on 2020-01-02 06:07:47

Question


I am currently working on a program to sort strings concurrently. My program takes in a file, reads each line of the file into an array, and splits the array of strings into smaller arrays of strings. The program then starts up one thread for each of the smaller arrays, and quicksorts them. Once every thread has finished sorting its array, the main thread gathers all the results from the thread objects. It is then supposed to merge the smaller, now sorted, arrays into one large, sorted array.

I know for a fact that my quicksort implementation works -- using one thread the program sorts the words. What I need is an algorithm to merge the sorted arrays returned by the threads into one.

Any help is appreciated -- thanks in advance.


Answer 1:


Start from the final merge procedure of mergesort. Read the first value of each of your m arrays (each one is the minimum of its subarray), pick the minimum of those m values (the global minimum), and push it onto the result. Then either remove it from the array it came from or increment that array's index by one. Repeat until all subarrays are empty, or until every index has reached the end of its array.
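A minimal sketch of that merge step, assuming the threads hand back their results as a `String[][]` (the class and method names here are made up for illustration):

```java
import java.util.Arrays;

class KWayMerge {
    // Merges already-sorted arrays into one sorted array by repeatedly
    // taking the smallest unread "head" element among all parts.
    static String[] merge(String[][] parts) {
        int total = 0;
        for (String[] p : parts) {
            total += p.length;
        }
        String[] result = new String[total];
        int[] idx = new int[parts.length]; // next unread position in each part

        for (int out = 0; out < total; out++) {
            int best = -1; // index of the part holding the current global minimum
            for (int i = 0; i < parts.length; i++) {
                if (idx[i] == parts[i].length) {
                    continue; // this part is exhausted
                }
                if (best < 0 || parts[i][idx[i]].compareTo(parts[best][idx[best]]) < 0) {
                    best = i;
                }
            }
            result[out] = parts[best][idx[best]++];
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] parts = {{"apple", "pear"}, {"banana", "zebra"}, {"cherry"}};
        System.out.println(Arrays.toString(merge(parts)));
        // prints [apple, banana, cherry, pear, zebra]
    }
}
```

Scanning all m heads for every output element costs O(n·m) comparisons. With many subarrays, keeping the heads in a `java.util.PriorityQueue` would bring that down to O(n log m).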

NOTE: This may reduce memory usage if you have a really large dataset (it is actually used to handle such situations), but it may perform worse than plain Quicksort because of the split cost (which becomes linear if you copy the subarrays) and the multithreading overhead. Consider that in-place Mergesort is more space-efficient when applied to large arrays. Consider also that whoever wrote the Quicksort you are using probably spent time optimizing its calls and branch execution.

This is basic theoretical CS, but note that you cannot lower the computational complexity class simply by using parallelism; at best you get a speedup that is linear in the number of threads. Finally, Quicksort already hits the lower bound on average complexity for comparison-based sorting algorithms: if you are trying to outperform Quicksort's O(n log n), I have bad news for you.




Answer 2:


I think using a merge sort is pretty standard.

I suggest using as many threads as you have CPUs to start with.
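As a rough sketch of that sizing advice, the following splits the input into one chunk per available processor and sorts the chunks on a fixed thread pool. `ChunkSorter` and `sortChunks` are hypothetical names, and `Arrays.sort` stands in for the asker's own quicksort, which is not shown in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkSorter {
    // Splits words into roughly one chunk per CPU and sorts each chunk on its own pool thread.
    static List<String[]> sortChunks(String[] words) {
        int threads = Runtime.getRuntime().availableProcessors();
        int chunkSize = Math.max(1, (words.length + threads - 1) / threads);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String[]>> futures = new ArrayList<>();
            for (int from = 0; from < words.length; from += chunkSize) {
                int to = Math.min(words.length, from + chunkSize);
                String[] chunk = Arrays.copyOfRange(words, from, to);
                // Arrays.sort is a placeholder for the asker's quicksort.
                futures.add(pool.submit(() -> {
                    Arrays.sort(chunk);
                    return chunk;
                }));
            }
            List<String[]> sorted = new ArrayList<>();
            for (Future<String[]> f : futures) {
                sorted.add(f.get()); // blocks until that chunk is done
            }
            return sorted;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The returned chunks are each sorted but still need to be merged into a single array.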

You might find that reading the file takes a high percentage of the time, so something that sorts the strings as you read them might be faster.

E.g. a radix sort with TreeSets might be faster, as the data will already be sorted by the time you have finished reading the file.
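One possible reading of that suggestion, sketched under assumptions: bucket each line by its first character while reading, and keep every bucket in a `TreeSet` so it stays sorted. `BucketedReader` is a hypothetical name. Note that a `TreeSet` silently drops duplicate lines, so this only fits input whose lines are unique (or where duplicates do not matter).

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

class BucketedReader {
    // Reads all lines and returns them in String.compareTo order.
    static List<String> readSorted(BufferedReader in) {
        // Key: first character of the line (-1 for the empty line, which sorts first).
        TreeMap<Integer, TreeSet<String>> buckets = new TreeMap<>();
        in.lines().forEach(line -> {
            int key = line.isEmpty() ? -1 : line.charAt(0);
            buckets.computeIfAbsent(key, k -> new TreeSet<>()).add(line);
        });
        // Buckets come out in key order and each bucket is internally sorted,
        // so concatenating them yields the fully sorted sequence.
        List<String> result = new ArrayList<>();
        for (TreeSet<String> bucket : buckets.values()) {
            result.addAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new StringReader("pear\napple\nbanana\navocado\n"));
        System.out.println(readSorted(in));
        // prints [apple, avocado, banana, pear]
    }
}
```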




Answer 3:


You can use the merge procedure here. The algorithm is quite simple; see Merge sort on Wikipedia. You can use a simple two-way merge when merging two arrays at a time, or a multiway merge when several arrays are merged simultaneously.
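For the two-way variant, a minimal sketch might merge the sorted arrays in pairs, round after round, until only one remains. `PairwiseMerge`, `mergeTwo`, and `mergeAll` are illustrative names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class PairwiseMerge {
    // Classic two-way merge of two sorted arrays.
    static String[] mergeTwo(String[] a, String[] b) {
        String[] out = new String[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            // <= keeps equal elements from a first
            out[k++] = a[i].compareTo(b[j]) <= 0 ? a[i++] : b[j++];
        }
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    // Merges neighbouring pairs repeatedly until a single array is left.
    static String[] mergeAll(List<String[]> parts) {
        if (parts.isEmpty()) {
            return new String[0];
        }
        List<String[]> current = new ArrayList<>(parts);
        while (current.size() > 1) {
            List<String[]> next = new ArrayList<>();
            for (int i = 0; i + 1 < current.size(); i += 2) {
                next.add(mergeTwo(current.get(i), current.get(i + 1)));
            }
            if (current.size() % 2 == 1) {
                next.add(current.get(current.size() - 1)); // odd one out moves on unmerged
            }
            current = next;
        }
        return current.get(0);
    }
}
```

Each round halves the number of arrays, so m arrays need about log2(m) rounds, and the merges within a round are independent of one another.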

Also, check this work: Parallelized QuickSort and RadixSort with Optimal Speedup.

Finally, there is also 3-way string quicksort, which can be parallelized.




Answer 4:


As mentioned in the other posts, the last step in your algorithm is the merge step of a mergesort.

However, quicksort itself is a recursive algorithm and allows for a natural introduction of concurrency, which makes your "merge step" unnecessary; see e.g. http://ricardozuasti.com/2012/java-concurrency-examples-forkjoin-framework/

After the pivot element is in its final position, you call quicksort on the two partitions. This can be done concurrently. Since this is recursive, it will spawn further threads.
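A minimal sketch of that idea using the JDK's fork/join framework (`RecursiveAction`, `ForkJoinPool`). This is not the code from the linked article. The class name, the `THRESHOLD` value, and the sequential `Arrays.sort` fallback for small ranges are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ParallelQuicksort extends RecursiveAction {
    // Assumed cutoff: ranges smaller than this are sorted sequentially.
    private static final int THRESHOLD = 1_000;

    private final String[] a;
    private final int lo; // inclusive
    private final int hi; // inclusive

    ParallelQuicksort(String[] a, int lo, int hi) {
        this.a = a;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo < THRESHOLD) {
            Arrays.sort(a, lo, hi + 1); // also covers empty and single-element ranges
            return;
        }
        int p = partition(a, lo, hi);
        // The pivot is now in its final position; sort both sides concurrently.
        invokeAll(new ParallelQuicksort(a, lo, p - 1), new ParallelQuicksort(a, p + 1, hi));
    }

    // Lomuto partition using the middle element as pivot.
    private static int partition(String[] a, int lo, int hi) {
        swap(a, lo + (hi - lo) / 2, hi);
        String pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j].compareTo(pivot) < 0) {
                swap(a, i++, j);
            }
        }
        swap(a, i, hi);
        return i;
    }

    private static void swap(String[] a, int i, int j) {
        String t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Sorts the array in place on the common fork/join pool.
    static void sort(String[] a) {
        if (a.length > 1) {
            ForkJoinPool.commonPool().invoke(new ParallelQuicksort(a, 0, a.length - 1));
        }
    }
}
```

Because both partitions live in the same array, nothing needs to be merged afterwards. One caveat of this simple partition scheme: input with very many equal strings produces lopsided partitions and deep recursion, which is where a 3-way partition would help.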



Source: https://stackoverflow.com/questions/16398338/concurrent-sorting-in-java
