Calculating the complexity of algorithm to print all valid (i.e., properly opened and closed) combinations of n-pairs of parentheses

我的梦境 提交于 2021-02-11 07:50:48

问题


I would like your opinion on the time and space complexity of this algorithm I implemented (in Python) to calculate the complexity of algorithm to print all valid (i.e., properly opened and closed) combinations of n-pairs of parentheses (see all valid combinations of n-pair of parenthesis)

def find_par_n(n):
    s = set(["()"])
    for i in range(2, n + 1):
        set_to_add = set()
        for str_ in s:
            set_temp = set()
            ana = set()
            for j in range(len(str_) + 1):
                str_c = str_[0:j] + '(' + str_[j:]
                if str_c in ana:
                    continue
                ana.add(str_c)
                for k in range(j + 1, len(str_) + 1):
                    str_a = str_c
                    str_a = str_a[0:k] + ')' + str_a[k:]
                    set_temp.add(str_a)
            set_to_add.update(set_temp)
        s = set_to_add
    return s

Most likely the algorithm can be improved in terms of time and space. Please share your thoughts.


回答1:


Let's try with a simpler version instead.

Some theorems:

  1. Properly paired string are of even length. Don't ask me to prove this one, please.
  2. Given a properly paired string of length 2k, it is possible to construct all strings of length 2(k + 1) by inserting pairs of parenthesis '()' at every possible location within the string. There are 2k + 1 such positions.
  3. To generate all n pairs, we need to recursive into the insertion step n times (and get strings of length 2n.

Note that this is not enough to generate all unique properly paired strings, because e.g. inserting () into () yields the same string twice (()()). However as an upper bound it should suffice.

def add_parens(s, nparens):    
    if not s:    
       add_parens('()', nparens)    
    max_length = 2 * nparens    
       if len(s) == max_length:    
       print(s)    
    else:    
        for i in range(0, len(s)):    
            add_parens(s[0:i] + '()' + s[i:], nparens)    

n = 5      
add_parens('', n)  

Time complexity:

  1. There is 1 insertion point for the empty string.
  2. There are 3 insertion points for (). ...

So all in all:

T(n) = 1 * 3 * ... * 2n + 1 ~ O(n!)

Space complexity for the recursive version is O(n(2n + 1)), however I'm pretty sure it is possible to bring this down to linear.




回答2:


For optimal results, avoid sets. If you generate each possibility exactly once, you can just put them into a vector.

Here's a very simple recursion which is slightly slower than @bcdan's answer:

def rget(n):
  if n == 0:
    return ['']
  else:
    return [fst + '(' + snd + ')' for i in range(n)
                                  for fst in rget(i)
                                  for snd in rget(n-i-1)]

If you wanted a generator rather than a complete list of results, a variant of the above is probably ideal, but for the case of a complete list, it's quite a bit faster to compute the results of each j from 0 to n, using the previously computed lists instead of a recursive call:

def iget(n):
  res = [['']]
  for j in range(1, n+1):
    res.append([fst + '(' + snd + ')' for i in range(j)
                                      for fst in rget(i)
                                      for snd in rget(j-i-1)])
  return res[n]

This turns out to be an order of magnitude faster (using Python 3.3.2).

Why it works

You can decompose any balanced string into two balanced substrings. This is not a unique decomposition, but it can be made unique by choosing the shortest possible non-empty balanced suffix. The shortest nonempty balanced suffix has the property that the starting ( matches the closing ); if that were not the case, it could be decomposed into two shorter non-empty balanced sequences. So the recursive consists of finding all sequences of the form Fst(Snd) where the sum of the sizes of Fst and Snd is n-1.

Time complexity:

The number of balanced sequences with n pairs of parentheses is the nth Catalan number Cn, which is O(4nn2/3). The above code generates each sequence in O(n) (because the string concatenation takes O(n) time) for a total complexity of O(4nn5/3)




回答3:


Recursive version seems to be very simple and it spends time only on valid branches:

def gen(sofar, l, r, n):
    """
    sofar: string generated so far
    l: used left parentheses
    r: used right parentheses
    n: total required pairs
    """
    result = []
    if l == n and r == n:
        # solution found
        result.append(sofar)
    else:
        if l > r:
            # can close
            result.extend(gen(sofar + ")", l, r + 1, n))
        if l < n:
            # can open
            result.extend(gen(sofar + "(", l + 1, r, n))
    return result

def generate(n):
    return gen("", 0, 0, n)

print generate(4)



回答4:


The number of possible combinations is the Catalan Number. So the complexity is at least the one specified by that number. https://en.wikipedia.org/wiki/Catalan_number




回答5:


I got curious and wrote my own version. Here are some specs (this is all in Python 2.7):

n=8

Mine: 21 ms

Yours: 89 ms

n=10

Mine: 294 ms

Yours: 1564 ms

def get(n):
    
    def find(current):
        out = []
        last_index = 0
        for j in range(current.count('(')):
            last_index = current.index('(',last_index) + 1
            out.append(current[:last_index] + '()' + current[last_index:])
        return out
        
    seed = '()'
    current = '()'
    temp = set(['()'])
    for i in range(n):
        new = set()
        for thing in temp:
            new.update(find(thing))
        temp = new

    return [a[1:-1] for a in temp]

I think the biggest difference is that mine only has 3 for loops, and yours has 4. This happens at the str.count function. Using this built-in probably contributes significantly to speed. Also, after each stage I identified duplicates, and this cuts down exponentially on time.

Actually, the biggest difference I see is that I only insert after parenthesis, and then strip the outer two when it is finished. It creates way fewer duplicates. So ((())()) is created, and reduced to its correct form of (())() after removing the enclosing ().



来源:https://stackoverflow.com/questions/31479647/calculating-the-complexity-of-algorithm-to-print-all-valid-i-e-properly-opene

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!