O(nlogn) Algorithm - Find three evenly spaced ones within binary string


刺人心 2020-11-28 00:07

I had this question on an Algorithms test yesterday, and I can't figure out the answer. It is driving me absolutely crazy, because it was worth about 40 points. I figure

30 answers

轻奢々 (original poster)

2020-11-28 00:42

Revision: 2009-10-17 23:00

I've run this on large inputs (strings of about 20 million digits) and I now believe this algorithm is not O(n log n). Notwithstanding that, it's a cool enough implementation and contains a number of optimizations that make it run really fast. It evaluates all the arrangements of binary strings of 24 or fewer digits in under 25 seconds.

I've updated the code to include the 0 <= L < M < U <= X-1 observation from earlier today.

Original

This is, in concept, similar to another question I answered. That code also looked at three values in a series and determined if a triplet satisfied a condition. Here is C# code adapted from that:

using System;
using System.Collections.Generic;

namespace StackOverflow1560523
{
    class Program
    {
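        // Start positions in a pool, just below and just above the candidate.
        public struct Pair<T>
        {
            public T Low, High;
        }
        // Walks outward from the candidate through one parity pool until the
        // distances to the lower and upper elements match, or the pool ends.
        static bool FindCandidate(int candidate, 
            List<int> arr, 
            List<int> pool, 
            Pair<int> pair, 
            ref int iterations)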
        {
            int lower = pair.Low, upper = pair.High;
            while ((lower >= 0) && (upper < pool.Count))
            {
                int lowRange = candidate - arr[pool[lower]];
                int highRange = arr[pool[upper]] - candidate;
                iterations++;
                if (lowRange < highRange)
                    lower -= 1;
                else if (lowRange > highRange)
                    upper += 1;
                else
                    return true;
            }
            return false;
        }
        // Returns the positions of the 1s in s, in increasing order.
        static List<int> BuildOnesArray(string s)
        {
            List<int> arr = new List<int>();
            for (int i = 0; i < s.Length; i++)
                if (s[i] == '1')
                    arr.Add(i);
            return arr;
        }
        // Splits arr into even-valued and odd-valued pools (stored as indexes
        // into arr) and records, for every element, where an outward search
        // in each pool should start.
        static void BuildIndexes(List<int> arr, 
            ref List<int> even, ref List<int> odd, 
            ref List<Pair<int>> evenIndex, ref List<Pair<int>> oddIndex)
        {
            for (int i = 0; i < arr.Count; i++)
            {
                bool isEven = (arr[i] & 1) == 0;
                if (isEven)
                {
                    evenIndex.Add(new Pair<int> {Low=even.Count-1, High=even.Count+1});
                    oddIndex.Add(new Pair<int> {Low=odd.Count-1, High=odd.Count});
                    even.Add(i);
                }
                else
                {
                    oddIndex.Add(new Pair<int> {Low=odd.Count-1, High=odd.Count+1});
                    evenIndex.Add(new Pair<int> {Low=even.Count-1, High=even.Count});
                    odd.Add(i);
                }
            }
        }

        static int FindSpacedOnes(string s)
        {
            // List of indexes of 1s in the string
            List<int> arr = BuildOnesArray(s);
            //if (s.Length < 3)
            //    return 0;

            //  Indexes into arr of the odd-valued and even-valued entries
            List<int> odd = new List<int>(), even = new List<int>();

            //  evenIndex[i] brackets arr[i] within the even pool
            //  oddIndex[i] brackets arr[i] within the odd pool
            List<Pair<int>> evenIndex = new List<Pair<int>>(), 
                oddIndex = new List<Pair<int>>(); 
            BuildIndexes(arr, 
                ref even, ref odd, 
                ref evenIndex, ref oddIndex);

            int iterations = 0;
            for (int i = 1; i < arr.Count-1; i++)
            {
                int target = arr[i];
                bool found = FindCandidate(target, arr, odd, oddIndex[i], ref iterations) || 
                    FindCandidate(target, arr, even, evenIndex[i], ref iterations);
                if (found)
                    return iterations;
            }
            return iterations;
        }
        // Yields every n-digit binary string whose leading digit is 1.
        static IEnumerable<string> PowerSet(int n)
        {
            for (long i = (1L << (n-1)); i < (1L << n); i++)
            {
                yield return Convert.ToString(i, 2).PadLeft(n, '0');
            }
        }
        static void Main(string[] args)
        {
            for (int i = 5; i < 64; i++)
            {
                int c = 0;
                string hardest_string = "";
                foreach (string s in PowerSet(i))
                {
                    int cost = FindSpacedOnes(s);
                    if (cost > c)
                    {
                        hardest_string = s;
                        c = cost;
                        Console.Write("{0} {1} {2}\r", i, c, hardest_string);
                    }
                }
                Console.WriteLine("{0} {1} {2}", i, c, hardest_string);
            }
        }
    }
}

The principal differences are:

Exhaustive search of solutions
This code generates a power set of data to find the hardest input to solve for this algorithm.
All solutions versus hardest to solve
The code for the previous question generated all the solutions using a Python generator. This code just displays the hardest input for each pattern length.
Scoring algorithm
This code compares the distance from the middle element to its left-hand and right-hand candidates. The Python code tested whether a sum was above or below 0.
Convergence on a candidate
The current code works from the middle towards the edge to find a candidate. The code in the previous problem worked from the edges towards the middle. This last change gives a large performance improvement.
Use of even and odd pools
Based on the observations at the end of this write-up, the code searches pairs of even numbers or pairs of odd numbers to find L and U, keeping M fixed. This reduces the number of searches by pre-computing information. Accordingly, the code uses two levels of indirection in the main loop of FindCandidate and requires two calls to FindCandidate for each middle element: once for even numbers and once for odd ones.

The general idea is to work on indexes, not the raw representation of the data. Calculating an array where the 1's appear allows the algorithm to run in time proportional to the number of 1's in the data rather than in time proportional to the length of the data. This is a standard transformation: create a data structure that allows faster operation while keeping the problem equivalent.

The results are out of date: removed.

Edit: 2009-10-16 18:48

On yx's data, which is given some credence in the other responses as representative of hard data to calculate on, I get these results... I removed these. They are out of date.

I would point out that this data is not the hardest for my algorithm, so I think the assumption that yx's fractals are the hardest to solve is mistaken. The worst case for a particular algorithm, I expect, will depend upon the algorithm itself and will not likely be consistent across different algorithms.

Edit: 2009-10-17 13:30

Further observations on this.

First, convert the string of 0's and 1's into an array A holding the index of each 1. Say the length of A is X. Then the goal is to find

0 <= L < M < U <= X-1

such that

A[M] - A[L] = A[U] - A[M]

or, equivalently,

2*A[M] = A[L] + A[U]
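
As a reference point for checking faster implementations, the condition above can be tested directly. This is only a minimal sketch and not the algorithm from the listing earlier: the names BruteForceCheck and HasEvenlySpacedOnes are hypothetical, and the check takes O(X^2) time over the X positions of the 1's.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the answer's code: a direct O(X^2) check
// of 2*A[M] = A[L] + A[U] over the positions of the 1s.
public static class BruteForceCheck
{
    public static bool HasEvenlySpacedOnes(string s)
    {
        // a holds the positions of the 1s, in increasing order.
        var a = new List<int>();
        for (int i = 0; i < s.Length; i++)
            if (s[i] == '1')
                a.Add(i);

        var positions = new HashSet<int>(a);

        // Fix L and M; a matching U must sit at position 2*A[M] - A[L],
        // which is always greater than A[M] because A[M] > A[L].
        for (int l = 0; l < a.Count; l++)
            for (int m = l + 1; m < a.Count; m++)
                if (positions.Contains(2 * a[m] - a[l]))
                    return true;
        return false;
    }
}
```

For example, "1010100" has 1's at positions 0, 2 and 4, so the check succeeds, while "110100" (positions 0, 1 and 3) has no evenly spaced triple.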

Since A[L] and A[U] sum to an even number, they can't be (even, odd) or (odd, even). The search for a match could be improved by splitting A[] into odd and even pools and searching for matches on A[M] in the pools of odd and even candidates in turn.

However, this is more of a performance optimization than an algorithmic improvement, I think. The number of comparisons should drop, but the order of the algorithm should be the same.
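
To make the constant-factor nature of this saving concrete, the sketch below counts how many (L, U) pairs a naive search would consider with and without the same-parity restriction. It is an illustration only: the names ParityPools and CountPairs are hypothetical and do not appear in the listing above.

```csharp
using System;

// Hypothetical illustration: compares the number of candidate (L, U) pairs
// before and after restricting the search to pairs of equal parity.
public static class ParityPools
{
    public static (long allPairs, long sameParityPairs) CountPairs(string s)
    {
        long even = 0, odd = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] != '1') continue;
            if ((i & 1) == 0) even++; else odd++;
        }

        long total = even + odd;
        // Every unordered pair of 1s.
        long all = total * (total - 1) / 2;
        // Only pairs drawn from the same pool can sum to an even number.
        long same = even * (even - 1) / 2 + odd * (odd - 1) / 2;
        return (all, same);
    }
}
```

For a string of eight 1's, CountPairs returns (28, 12): fewer pairs to examine, but both counts still grow quadratically in the number of 1's.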

Edit 2009-10-18 00:45

Yet another optimization occurs to me, in the same vein as separating the candidates into even and odd. Since the three indexes have to add to a multiple of 3 (a, a+x, and a+2x sum to 3a+3x, which is 0 mod 3 regardless of a and x), you can separate L, M, and U by their mod 3 values.

In fact, you could combine this with the even/odd observation and separate them by their mod 6 values, and so on. This would provide a further performance optimization but not an algorithmic speedup.
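
The residue combinations that remain possible can be enumerated mechanically. The sketch below is an assumption-laden illustration rather than part of the original code: ResidueTriples and Compatible are hypothetical names, and the test applied is the equation 2*A[M] = A[L] + A[U] reduced modulo k.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: lists the residue classes (mod k) that the three
// positions may fall into if 2*A[M] = A[L] + A[U] is to hold.
public static class ResidueTriples
{
    public static List<(int L, int M, int U)> Compatible(int k)
    {
        var result = new List<(int L, int M, int U)>();
        for (int m = 0; m < k; m++)
            for (int l = 0; l < k; l++)
                for (int u = 0; u < k; u++)
                    // Keep the combination only if 2m = l + u (mod k).
                    if ((2 * m - l - u) % k == 0)
                        result.Add((l, m, u));
        return result;
    }
}
```

Each choice of the L and M residues fixes exactly one U residue, so Compatible(3) keeps 9 of the 27 combinations and Compatible(6) keeps 36 of 216. The pools get smaller, but the order of the search is unchanged.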
