Comparing anagrams using prime numbers

Submitted by 本秂侑毒 on 2019-11-30 14:14:31

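Use multiplication instead of addition. Primes are "multiplicatively unique", but not "additively unique": every positive integer has exactly one prime factorization, whereas many different collections of primes share the same sum.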
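As a rough illustration of the point above, here is a minimal sketch that assigns one prime to each lowercase letter and compares the sum with the product. The class name PrimeSignature, the method names, and the restriction to 'a'..'z' are assumptions made for this example, not part of the original answer. BigInteger is used so that the product cannot overflow.

```csharp
using System;
using System.Numerics;

static class PrimeSignature
{
    // First 26 primes, one per letter 'a'..'z' (this mapping is an assumption of the sketch).
    private static readonly int[] Primes =
    {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
        43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
    };

    // Product of the primes assigned to the letters; unique per multiset of letters.
    public static BigInteger Product(string word)
    {
        BigInteger product = BigInteger.One;
        foreach (char c in word)
            product *= Primes[c - 'a'];
        return product;
    }

    // Sum of the same primes; included only to show that sums collide.
    public static int Sum(string word)
    {
        int sum = 0;
        foreach (char c in word)
            sum += Primes[c - 'a'];
        return sum;
    }

    public static void Main()
    {
        // "cc" -> 5 + 5 = 10 and "bd" -> 3 + 7 = 10, so the sums collide.
        Console.WriteLine(Sum("cc") == Sum("bd"));                 // True
        // The products are 25 and 21, so they differ.
        Console.WriteLine(Product("cc") == Product("bd"));         // False
        // Genuine anagrams share the same product.
        Console.WriteLine(Product("listen") == Product("silent")); // True
    }
}
```

On .NET Framework this needs a reference to the System.Numerics assembly.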

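A slightly clunkier way requires the length of your longest string, max_len (or, for a somewhat smaller hash, the largest count of any single character). Take base = max_len + 1, so that every letter count fits in a single digit; with base = max_len, a count of max_len would spill into the next digit and collide with another string. Given that, your hash could look like

number_of_a*base^51 + number_of_b*base^50 + ... + number_of_Z*base^0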
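The formula above can be sketched as follows. This is only an illustration under stated assumptions: the names PositionalHash, IndexOf and Hash are hypothetical, the alphabet is taken to be 'a'..'z' followed by 'A'..'Z', and the base is max_len + 1 so that a count equal to max_len still fits in one digit. BigInteger is used because 52 digits in that base quickly exceed 64 bits.

```csharp
using System;
using System.Numerics;

static class PositionalHash
{
    // Hypothetical helper: maps 'a'..'z' to 0..25 and 'A'..'Z' to 26..51.
    private static int IndexOf(char c)
    {
        if (c >= 'a' && c <= 'z') return c - 'a';
        if (c >= 'A' && c <= 'Z') return 26 + (c - 'A');
        throw new ArgumentException("Only ASCII letters are supported: " + c);
    }

    // Treats the 52 letter counts as the digits of a number in base (maxLen + 1).
    public static BigInteger Hash(string word, int maxLen)
    {
        if (word.Length > maxLen)
            throw new ArgumentException("word is longer than maxLen");

        int[] counts = new int[52];
        foreach (char c in word)
            counts[IndexOf(c)]++;

        // Horner's rule: the count of 'a' ends up as the most significant digit.
        BigInteger hash = BigInteger.Zero;
        foreach (int count in counts)
            hash = hash * (maxLen + 1) + count;
        return hash;
    }

    public static void Main()
    {
        // Anagrams have identical counts, hence identical hashes.
        Console.WriteLine(Hash("Tea", 8) == Hash("eaT", 8)); // True
        // With base 4, "ZZZ" hashes to 3 and "Y" to 4, so they stay distinct.
        Console.WriteLine(Hash("ZZZ", 3) != Hash("Y", 3));   // True
    }
}
```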

If you prefer to use primes, multiplication works better, as mentioned above.

Of course, you could achieve the same effect by keeping an array of 52 counts (one per letter, a-z and A-Z) and comparing the arrays instead.
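A minimal sketch of the array-based comparison, assuming the 52 values are the counts of 'a'..'z' followed by 'A'..'Z'. The names CountArray, Counts and AreAnagrams are made up for this example.

```csharp
using System;
using System.Linq;

static class CountArray
{
    // Counts each ASCII letter: indices 0..25 for 'a'..'z', 26..51 for 'A'..'Z'.
    public static int[] Counts(string word)
    {
        int[] counts = new int[52];
        foreach (char c in word)
        {
            if (c >= 'a' && c <= 'z') counts[c - 'a']++;
            else if (c >= 'A' && c <= 'Z') counts[26 + (c - 'A')]++;
            else throw new ArgumentException("Only ASCII letters are supported: " + c);
        }
        return counts;
    }

    // Two strings are anagrams exactly when their count arrays are equal.
    public static bool AreAnagrams(string first, string second)
    {
        return first.Length == second.Length
            && Counts(first).SequenceEqual(Counts(second));
    }

    public static void Main()
    {
        Console.WriteLine(AreAnagrams("Listen", "Silent")); // False: 'L' and 'S' are counted separately from 'l' and 's'
        Console.WriteLine(AreAnagrams("listen", "silent")); // True
    }
}
```

Unlike a single number, the array cannot collide, at the cost of comparing up to 52 values per check.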

You are trying to compare two sorted strings for equality by comparing two n-bit numbers for equality. As soon as your strings are long enough that there are more than 2^n possible sorted strings, you are guaranteed to have two different sorted strings that produce the same n-bit number (the pigeonhole principle). By the birthday problem (http://en.wikipedia.org/wiki/Birthday_problem), you are likely to hit collisions well before that point, unless (as with multiplication of primes) some theorem guarantees that two different strings cannot produce the same number. Even then, the guarantee only holds as long as the number is not truncated to fit into n bits.

In some cases you might save time by using this idea as a quick first check for equality, so that you only need to compare sorted strings if their numbers match.
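The "quick first check" idea might look like the sketch below. The cheap hash here is simply the wrapping sum of the character codes, which is an arbitrary choice for illustration; any order-independent number would do. The names QuickCheck, CheapHash and AreAnagrams are hypothetical.

```csharp
using System;

static class QuickCheck
{
    // Order-independent but not collision-free: a wrapping sum of character codes.
    public static uint CheapHash(string word)
    {
        uint hash = 0;
        unchecked
        {
            foreach (char c in word)
                hash += (uint)c;
        }
        return hash;
    }

    private static string Sorted(string word)
    {
        char[] chars = word.ToCharArray();
        Array.Sort(chars);
        return new string(chars);
    }

    public static bool AreAnagrams(string first, string second)
    {
        if (first.Length != second.Length) return false;
        // Different hashes prove the strings are not anagrams, so most pairs stop here.
        if (CheapHash(first) != CheapHash(second)) return false;
        // Equal hashes prove nothing; confirm by comparing the sorted strings.
        return Sorted(first) == Sorted(second);
    }

    public static void Main()
    {
        // "ad" and "bc" both sum to 197, so only the sorted comparison tells them apart.
        Console.WriteLine(CheapHash("ad") == CheapHash("bc")); // True
        Console.WriteLine(AreAnagrams("ad", "bc"));            // False
        Console.WriteLine(AreAnagrams("listen", "silent"));    // True
    }
}
```

The saving comes from skipping the sort for pairs whose hashes differ; pairs that collide still get the exact comparison, so the result stays correct.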

Don't use prime numbers: their useful properties relate to division, not to sums. The idea itself is good, though. You could use a bit set, but you would hit another problem, duplicate letters (the same problem as summing primes, e.g. 2+2+2 = 3+3). So use an integer multiset instead: an array 1...26 holding the frequency of each letter.
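To make the duplicate-letter problem concrete, here is a small sketch comparing a bit set with a frequency array. It assumes lowercase 'a'..'z' only, and the names BitSetDemo, LetterMask and LetterCounts are invented for the example.

```csharp
using System;
using System.Linq;

static class BitSetDemo
{
    // One bit per letter: records whether a letter occurs, not how often.
    public static uint LetterMask(string word)
    {
        uint mask = 0;
        foreach (char c in word)
            mask |= 1u << (c - 'a');
        return mask;
    }

    // One counter per letter: records how often each letter occurs.
    public static int[] LetterCounts(string word)
    {
        int[] counts = new int[26];
        foreach (char c in word)
            counts[c - 'a']++;
        return counts;
    }

    public static void Main()
    {
        // Both words contain exactly the letters {a, b}, so the masks match...
        Console.WriteLine(LetterMask("aab") == LetterMask("abb"));                 // True
        // ...although they are not anagrams, which the counts reveal.
        Console.WriteLine(LetterCounts("aab").SequenceEqual(LetterCounts("abb"))); // False
    }
}
```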

Here is an implementation in C# using the prime-number approach:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Anag
{
    class Program
    {
        // Despite its name, this table holds 72 odd primes, not the first 100 primes.
        private static int[] primes100 = new int[]
                                            {
                                                3, 7, 11, 17, 23, 29, 37,
                                                47, 59, 71, 89, 107, 131,
                                                163, 197, 239, 293, 353,
                                                431, 521, 631, 761, 919,
                                                1103, 1327, 1597, 1931,
                                                2333, 2801, 3371, 4049,
                                                4861, 5839, 7013, 8419,
                                                10103, 12143, 14591, 17519,
                                                21023, 25229, 30293, 36353,
                                                43627, 52361, 62851, 75431,
                                                90523, 108631, 130363,
                                                156437, 187751, 225307,
                                                270371, 324449, 389357,
                                                467237, 560689, 672827,
                                                807403, 968897, 1162687,
                                                1395263, 1674319, 2009191,
                                                2411033, 2893249, 3471899,
                                                4166287, 4999559, 5999471,
                                                7199369
                                            };

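        // Returns _n distinct primes: taken from the table when it is large enough,
        // otherwise generated by trial division starting at 2.
        private static int[] getNPrimes(int _n)
        {
            int[] _primes;

            // The table has fewer than 100 entries, so compare against its real length.
            if (_n <= primes100.Length)
                _primes = primes100.Take(_n).ToArray();
            else
            {
                _primes = new int[_n];

                int number = 0;
                int i = 2;

                while (number < _n)
                {
                    var isPrime = true;
                    for (int j = 2; j * j <= i; j++)
                    {
                        if (i % j == 0)
                        {
                            isPrime = false;
                            break; // one divisor is enough
                        }
                    }
                    if (isPrime)
                    {
                        _primes[number] = i;
                        number++;
                    }
                    i++;
                }
            }

            return _primes;
        }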

        // Returns true if some substring of haystack is an anagram of needle.
        private static bool anaStrStr(string needle, string haystack)
        {
            bool _output = false;

            var needleDistinct = needle.Distinct().ToArray();

            int[] arrayOfPrimes = getNPrimes(needleDistinct.Length);

            // Assign one prime to each distinct character of the needle.
            Dictionary<char, int> primeByChar = new Dictionary<char, int>();
            for (int i = 0; i < needleDistinct.Length; i++)
                primeByChar.Add(needleDistinct[i], arrayOfPrimes[i]);

            // BigInteger (System.Numerics assembly) avoids the silent overflow of int,
            // which could make two different products compare equal.
            System.Numerics.BigInteger needlePrimeSignature = 1;
            foreach (var c in needle)
                needlePrimeSignature *= primeByChar[c];

            // Slide a window of needle.Length over the haystack.
            for (int j = 0; j <= (haystack.Length - needle.Length); j++)
            {
                System.Numerics.BigInteger result = 1;
                for (int k = j; k < needle.Length + j; k++)
                {
                    int prime;
                    if (!primeByChar.TryGetValue(haystack[k], out prime))
                    {
                        // A character that is not in the needle rules this window out.
                        result = 0;
                        break;
                    }
                    result *= prime;
                }

                _output = (result == needlePrimeSignature);
                if (_output)
                    break;
            }

            return _output;
        }


        static void Main(string[] args)
        {
            Console.WriteLine("Enter needle");
            var _needle = Console.ReadLine() ?? "";
            Console.WriteLine("Enter haystack");
            var _haystack = Console.ReadLine() ?? "";

            Console.WriteLine(anaStrStr(_needle, _haystack));
            Console.ReadLine();

        }
    }
}