Is there a good radix sort implementation for floats in C#?

Anonymous (unverified), submitted 2019-12-03 02:49:01

Question:

I have a data structure with a field of type float. A collection of these structures needs to be sorted by the value of that float. Is there a radix sort implementation for this?

If there isn't, is there a fast way to access the exponent, the sign and the mantissa? If you sort the floats first on the mantissa, then on the exponent, and finally on the sign, you sort floats in O(n).
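As a minimal sketch of how the three fields could be read, assuming the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits). The class `FloatBits` and its method names are hypothetical and only serve the illustration; the bit reinterpretation uses the same `BitConverter` calls as the answer below.

```csharp
using System;

// Hypothetical helper class, not part of any library.
public static class FloatBits
{
    // Reinterpret the 32 bits of a float as an int (no numeric conversion).
    public static int ToBits(float value)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(value), 0);
    }

    // Sign bit (bit 31): 1 for negative values, including -0.0f, otherwise 0.
    public static int Sign(float value)
    {
        return (ToBits(value) >> 31) & 0x1;
    }

    // Biased 8-bit exponent (bits 23..30). For normal numbers the
    // true exponent is this value minus 127.
    public static int BiasedExponent(float value)
    {
        return (ToBits(value) >> 23) & 0xFF;
    }

    // 23-bit mantissa field (bits 0..22), without the implicit leading 1.
    public static int Mantissa(float value)
    {
        return ToBits(value) & 0x7FFFFF;
    }
}
```

For example, -6.5f is -1.625 * 2^2, so `Sign` gives 1, `BiasedExponent` gives 129 (2 + 127) and `Mantissa` gives 0x500000 (0.625 * 2^23).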

Answer 1:

Update:

I was quite interested in this topic, so I sat down and implemented it (using this very fast and memory-conservative implementation). I also read this one (thanks celion) and found out that you don't even have to split the floats into mantissa and exponent to sort them. You just take the bits one-to-one and perform an int sort. The only thing to take care of is the negative values, which have to be placed in reverse order in front of the positive ones at the end of the algorithm (I did that in the same step as the last iteration of the algorithm to save some CPU time).

So here's my float radix sort:

public static float[] RadixSort(this float[] array)
{
    // temporary array and the array of converted floats to ints
    int[] t = new int[array.Length];
    int[] a = new int[array.Length];
    for (int i = 0; i < array.Length; i++)
        a[i] = BitConverter.ToInt32(BitConverter.GetBytes(array[i]), 0);

    // set the group length to 1, 2, 4, 8 or 16
    // and see which one is quicker
    int groupLength = 4;
    int bitLength = 32;

    // counting and prefix arrays
    // (dimension is 2^r, the number of possible values of a r-bit number)
    int[] count = new int[1 << groupLength];
    int[] pref = new int[1 << groupLength];
    int groups = bitLength / groupLength;
    int mask = (1 << groupLength) - 1;
    int negatives = 0, positives = 0;

    for (int c = 0, shift = 0; c < groups; c++, shift += groupLength)
    {
        // reset count array
        for (int j = 0; j < count.Length; j++)
            count[j] = 0;

        // counting elements of the c-th group
        for (int i = 0; i < a.Length; i++)
        {
            count[(a[i] >> shift) & mask]++;

            // additionally count all negative
            // values in first round
            if (c == 0 && a[i] < 0)
                negatives++;
        }
        if (c == 0) positives = a.Length - negatives;

        // calculating prefixes
        pref[0] = 0;
        for (int i = 1; i < count.Length; i++)
            pref[i] = pref[i - 1] + count[i - 1];

        // from a[] to t[] elements ordered by c-th group
        for (int i = 0; i < a.Length; i++)
        {
            // Get the right index to sort the number in
            int index = pref[(a[i] >> shift) & mask]++;

            if (c == groups - 1)
            {
                // We're in the last (most significant) group, if the
                // number is negative, order them inversely in front
                // of the array, pushing positive ones back.
                if (a[i] < 0)
                    index = positives - (index - negatives) - 1;
                else
                    index += negatives;
            }
            t[index] = a[i];
        }

        // a[]=t[] and start again until the last group
        t.CopyTo(a, 0);
    }

    // Convert back the ints to the float array
    float[] ret = new float[a.Length];
    for (int i = 0; i < a.Length; i++)
        ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

    return ret;
}

It is slightly slower than an int radix sort because of the array copying at the beginning and end of the function, where the floats are copied bitwise to ints and back. The whole function is nevertheless still O(n), and in any case much faster than sorting three times in a row as you proposed. I don't see much room for optimization anymore, but if anyone does, feel free to tell me.

To sort descending change this line at the very end:

ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0); 

to this:

ret[a.Length - i - 1] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0); 

Measuring:

I set up a short test containing all the special cases of floats (NaN, +/-Inf, Min/Max value, 0) plus random numbers. It produces exactly the same order as Linq or Array.Sort:

NaN -> -Inf -> Min -> Negative Nums -> 0 -> Positive Nums -> Max -> +Inf 

So I ran a test with a huge array of 10M numbers:

float[] test = new float[10000000];
Random rnd = new Random();
for (int i = 0; i < test.Length; i++)
{
    // build a random float from four random bytes
    byte[] buffer = new byte[4];
    rnd.NextBytes(buffer);
    float rndfloat = BitConverter.ToSingle(buffer, 0);
    switch (i)
    {
        // the first six slots hold the special cases
        case 0: { test[i] = float.MaxValue; break; }
        case 1: { test[i] = float.MinValue; break; }
        case 2: { test[i] = float.NaN; break; }
        case 3: { test[i] = float.NegativeInfinity; break; }
        case 4: { test[i] = float.PositiveInfinity; break; }
        case 5: { test[i] = 0f; break; }
        default: { test[i] = rndfloat; break; }
    }
}

And timed the different sorting algorithms:

Stopwatch sw = new Stopwatch();
sw.Start();
float[] sorted1 = test.RadixSort();
sw.Stop();
Console.WriteLine(string.Format("RadixSort: {0}", sw.Elapsed));

sw.Reset();
sw.Start();
float[] sorted2 = test.OrderBy(x => x).ToArray();
sw.Stop();
Console.WriteLine(string.Format("Linq OrderBy: {0}", sw.Elapsed));

sw.Reset();
sw.Start();
// Array.Sort sorts in place, so it has to run last
Array.Sort(test);
float[] sorted3 = test;
sw.Stop();
Console.WriteLine(string.Format("Array.Sort: {0}", sw.Elapsed));

And the output was (update: now ran with release build, not debug):

RadixSort: 00:00:03.9902332
Linq OrderBy: 00:00:17.4983272
Array.Sort: 00:00:03.1536785

That is more than four times as fast as Linq, which is not bad. It is still not as fast as Array.Sort, but not that much worse either. What really surprised me was this: I expected it to be slightly slower than Linq on very small arrays, but then I ran a test with just 20 elements:

RadixSort: 00:00:00.0012944
Linq OrderBy: 00:00:00.0072271
Array.Sort: 00:00:00.0002979

and even this time my radix sort is quicker than Linq, but way slower than Array.Sort. :)

Update 2:

I made some more measurements and found out some interesting things: a longer group length means fewer iterations and more memory usage. If you use a group length of 16 bits (only 2 iterations), you have a huge memory overhead when sorting small arrays, but you can beat Array.Sort on arrays larger than about 100k elements, even if not by very much. Both axes of the chart are logarithmic:

comparison chart http://daubmeier.de/philip/stackoverflow/radixsort_vs_arraysort.png
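The trade-off between group length, number of passes and table size can be worked out directly from the two expressions in the sort above (`bitLength / groupLength` and `1 << groupLength`). The sketch below only does that arithmetic; the class `GroupLengthTradeoff` and its methods are hypothetical names, and it measures nothing.

```csharp
using System;

// Hypothetical helper class that only does the arithmetic.
public static class GroupLengthTradeoff
{
    // Number of counting passes over the data for 32-bit keys
    // when each pass handles groupLength bits.
    public static int Passes(int groupLength)
    {
        return 32 / groupLength;
    }

    // Number of entries in each of the count[] and pref[] tables: 2^r.
    public static int Buckets(int groupLength)
    {
        return 1 << groupLength;
    }

    // Print one line per group length that divides 32 evenly.
    public static void PrintTable()
    {
        foreach (int r in new[] { 1, 2, 4, 8, 16 })
        {
            Console.WriteLine("r = {0}: {1} passes, {2} buckets per table",
                r, Passes(r), Buckets(r));
        }
    }
}
```

With 4-bit groups that is 8 passes over 16-entry tables; with 16-bit groups it is 2 passes over 65536-entry tables, which have to be reset and prefix-summed on every pass no matter how small the input array is.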



Answer 2:

I think your best bet, if the values aren't too close together and the precision requirement is reasonable, is to use the actual float digits before and after the decimal point to do the sorting.

For example, you can just use the first 4 decimals (be they 0 or not) to do the sorting.



Answer 3:

There's a nice explanation of how to perform radix sort on floats here: http://www.codercorner.com/RadixSortRevisited.htm

If all your values are positive, you can get away with using the binary representation; the link explains how to handle negative values.
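One common way to deal with negative values, shown here as a sketch and not necessarily the method used in the linked article, is to transform each float's bits into an unsigned key whose integer order matches the float order: flip all bits of negative values and only the sign bit of non-negative ones. The class `FloatKey` and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper class, not part of any library.
public static class FloatKey
{
    // Map a float to a uint whose unsigned order matches the float's order.
    public static uint ToSortableKey(float value)
    {
        uint bits = BitConverter.ToUInt32(BitConverter.GetBytes(value), 0);
        // Negative: flip every bit, which reverses their order and clears the top bit.
        // Non-negative: set the top bit so they sort above all negatives.
        uint mask = (bits & 0x80000000u) != 0 ? 0xFFFFFFFFu : 0x80000000u;
        return bits ^ mask;
    }

    // Inverse of ToSortableKey: recover the original float from its key.
    public static float FromSortableKey(uint key)
    {
        // Keys with the top bit set came from non-negative floats.
        uint mask = (key & 0x80000000u) != 0 ? 0x80000000u : 0xFFFFFFFFu;
        return BitConverter.ToSingle(BitConverter.GetBytes(key ^ mask), 0);
    }
}
```

With keys like these, an unmodified unsigned-integer radix sort yields ascending float order, at the cost of one extra transformation before and after the sort.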



Answer 4:

You can use an unsafe block to memcpy or alias a float * to a uint * to extract the bits.
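As a sketch of the same aliasing idea without an unsafe block, `MemoryMarshal.Cast` can present a `float[]` as a span of `uint` over the same memory. This is a swapped-in alternative to the pointer cast described above, and it assumes a runtime where `Span<T>` and `System.Runtime.InteropServices.MemoryMarshal` are available (for example .NET Core 2.1 or later). The class `FloatAlias` is a hypothetical name.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper class, not part of any library.
public static class FloatAlias
{
    // View the array's memory as uint values. No copy is made, so
    // writes through the returned span change the floats as well.
    public static Span<uint> AsBits(float[] values)
    {
        return MemoryMarshal.Cast<float, uint>(values.AsSpan());
    }
}
```

Because no data is copied, this avoids the per-element `BitConverter.GetBytes` allocations used in the first answer's conversion loops.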


