Sorting mixed numbers and strings

不打扰是莪最后的温柔 提交于 2019-11-27 14:03:44
Nathan Baulch

Perhaps you could go with a more generic approach and use a natural sorting algorithm such as the C# implementation here.

Two ways come to mind, not sure which is more performant. Implement a custom IComparer:

class MyComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int xVal, yVal;
        var xIsVal = int.TryParse( x, out xVal );
        var yIsVal = int.TryParse( y, out yVal );

        if (xIsVal && yIsVal)   // both are numbers...
            return xVal.CompareTo(yVal);
        if (!xIsVal && !yIsVal) // both are strings...
            return x.CompareTo(y);
        if (xIsVal)             // x is a number, sort first
            return -1;
        return 1;               // x is a string, sort last
    }
}

var input = new[] {"a", "1", "10", "b", "2", "c"};
var e = input.OrderBy( s => s, new MyComparer() );

Or, split the sequence into numbers and non-numbers, then sort each subgroup, finally join the sorted results; something like:

var input = new[] {"a", "1", "10", "b", "2", "c"};

var result = input.Where( s => s.All( x => char.IsDigit( x ) ) )
                  .OrderBy( r => { int z; int.TryParse( r, out z ); return z; } )
                  .Union( input.Where( m => m.Any( x => !char.IsDigit( x ) ) )
                               .OrderBy( q => q ) );

Use the other overload of OrderBy that takes an IComparer parameter.

You can then implement your own IComparer that uses int.TryParse to tell if it's a number or not.

I'd say you could split up the values using a RegularExpression (assuming everything is an int) and then rejoin them together.

//create two lists to start
string[] data = //whatever...
List<int> numbers = new List<int>();
List<string> words = new List<string>();

//check each value
foreach (string item in data) {
    if (Regex.IsMatch("^\d+$", item)) {
        numbers.Add(int.Parse(item));
    }
    else {
        words.Add(item);
    }
}

Then with your two lists you can sort each of them and then merge them back together in whatever format you want.

Skizz

You could just use function provided by the Win32 API:

[DllImport ("shlwapi.dll", CharSet=CharSet.Unicode, ExactSpelling=true)]
static extern int StrCmpLogicalW (String x, String y);

and call it from an IComparer as others have shown.

public static int? TryParse(string s)
{
    int i;
    return int.TryParse(s, out i) ? (int?)i : null;
}

// in your method
IEnumerable<string> input = new string[] {"a", "1","2", "b", "10"};
var list = input.Select(s => new { IntVal = TryParse(s), String =s}).ToList();
list.Sort((s1, s2) => {
    if(s1.IntVal == null && s2.IntVal == null)
    {
        return s1.String.CompareTo(s2.String);
    }
    if(s1.IntVal == null)
    {
        return 1;
    }
    if(s2.IntVal == null)
    {
        return -1;
    }
    return s1.IntVal.Value.CompareTo(s2.IntVal.Value);
});
input = list.Select(s => s.String);

foreach(var x in input)
{
    Console.WriteLine(x);
}

It still does the conversion, but only once/item.

You could use a custom comparer - the ordering statement would then be:

var result = input.OrderBy(s => s, new MyComparer());

where MyComparer is defined like this:

public class MyComparer : Comparer<string>
{
    public override int Compare(string x, string y)
    {

        int xNumber;
        int yNumber;
        var xIsNumber = int.TryParse(x, out xNumber);
        var yIsNumber = int.TryParse(y, out yNumber);

        if (xIsNumber && yIsNumber)
        {
            return xNumber.CompareTo(yNumber);
        }
        if (xIsNumber)
        {
            return -1;
        }
        if (yIsNumber)
        {
            return 1;
        }
        return x.CompareTo(y);
    }
}

Although this may seem a bit verbose, it encapsulates the sorting logic into a proper type. You can then, if you wish, easily subject the Comparer to automated testing (unit testing). It is also reusable.

(It may be possible to make the algorithm a bit clearer, but this was the best I could quickly throw together.)

You could also "cheat" in some sense. Based on your description of the problem, You know any String of length 2 will be a number. So just sort all the Strings of length 1. And then sort all the Strings of length 2. And then do a bunch of swapping to re-order your Strings in the correct order. Essentially the process will work as follows: (assuming your data is in an array.)

Step 1: Push all Strings of length 2 to the end of the array. Keeping track of how many you have.

Step 2: In place sort the Strings of length 1 and Strings of length 2.

Step 3: Binary search for 'a' which would be on the boundary of your two halves.

Step 4: Swap your two digit Strings with the letters as necessary.

That said, while this approach will work, does not involve regular expressions, and does not attempt to parse non-int values as an int -- I don't recommend it. You'll be writing significantly more code than other approaches already suggested. It obfuscates the point of what you are trying to do. It doesn't work if you suddenly get two letter Strings or three digit Strings. Etc. I'm just including it to show how you can look at problems differently, and come up with alternative solutions.

Use a Schwartzian Transform to perform O(n) conversions!

private class Normalized : IComparable<Normalized> {
  private readonly string str;
  private readonly int val;

  public Normalized(string s) {
    str = s;

    val = 0;
    foreach (char c in s) {
      val *= 10;

      if (c >= '0' && c <= '9')
        val += c - '0';
      else
        val += 100 + c;
    }
  }

  public String Value { get { return str; } }

  public int CompareTo(Normalized n) { return val.CompareTo(n.val); }
};

private static Normalized In(string s) { return new Normalized(s); }
private static String Out(Normalized n) { return n.Value; }

public static IList<String> MixedSort(List<String> l) {
  var tmp = l.ConvertAll(new Converter<String,Normalized>(In));
  tmp.Sort();
  return tmp.ConvertAll(new Converter<Normalized,String>(Out));
}

I had a similar problem and landed here: sorting strings that have a numeric suffix as in the following example.

Original:

"Test2", "Test1", "Test10", "Test3", "Test20"

Default sort result:

"Test1", "Test10", "Test2", "Test20", "Test3"

Desired sort result:

"Test1", "Test2", "Test3, "Test10", "Test20"

I ended up using a custom Comparer:

public class NaturalComparer : IComparer
{

    public NaturalComparer()
    {
        _regex = new Regex("\\d+$", RegexOptions.IgnoreCase);
    }

    private Regex _regex;

    private string matchEvaluator(System.Text.RegularExpressions.Match m)
    {
        return Convert.ToInt32(m.Value).ToString("D10");
    }

    public int Compare(object x, object y)
    {
        x = _regex.Replace(x.ToString, matchEvaluator);
        y = _regex.Replace(y.ToString, matchEvaluator);

        return x.CompareTo(y);
    }
}   

HTH ;o)

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!