All ways to partition a string

半腔热情 提交于 2021-01-17 11:11:20

问题


I'm trying to find a efficient algorithm to get all ways to partition a string

eg for a given string 'abcd' =>
'a' 'bcd'
'a' 'b' 'cd'
'a' 'b' 'c' 'd'
'ab' 'cd'
'ab' 'c' 'd'
'abc' 'd'
'a', 'bc', 'd

any language would be appreciated

Thanks in advance !


回答1:


Problem analysis

Between each pair of adjacent characters, you can decide whether to cut. For a string of size n, there are n-1 positions where you can cut or not, i.e. there are two possibilities. Therefore there are 2^(n-1) partitions for each string of size n.

Due to the size of the output alone (2^(n-1) partitions, each consisting of n characters from the string + seperators), an algorithm to solve this can have exponential runtime at best (2^(n-1) * nO(2^n)).


Solution

I chose to represent a partition as integer. Each bit in cutpoints determines whether to cut between characters i and i+1. To iterate through all possible partitions, we just need to go trough all integers between 0 and 2^(n-1) - 1.

Example: For a string of length 4, we go through all integers between 0 and 2^3 - 1 or 0 and 7 or in binary: 000 and 111.

# (python 2 or 3)
def all_partitions(string):
    for cutpoints in range(1 << (len(string)-1)):
        result = []
        lastcut = 0
        for i in range(len(string)-1):
            if (1<<i) & cutpoints != 0:
                result.append(string[lastcut:(i+1)])
                lastcut = i+1
        result.append(string[lastcut:])
        yield result

for partition in all_partitions("abcd"):
    print(partition)

Memory usage:

I think my solution uses O(n) memory with Python 3. Only one partition is generated at a time, it's printed and not referenced anymore. This changes of course, if you keep all results, e.g. by storing them in a list.

In Python 2 replace range with xrange, otherwise all possible cutpoints will be stored in a list, therefore needing an exponential amount of memory.


JavaScript solution

// ES6 generator
function* all_partitions(string) {
    for (var cutpoints = 0; cutpoints < (1 << (string.length - 1)); cutpoints++) {
        var result = [];
        var lastcut = 0;
        for (var i = 0; i < string.length - 1; i++) {
            if (((1 << i) & cutpoints) !== 0) {
                result.push(string.slice(lastcut, i + 1));
                lastcut = i + 1;
            }
        }
        result.push(string.slice(lastcut));
        yield result;
    }
}

for (var partition of all_partitions("abcd")) {
    console.log(partition);
}

Tested with NodeJS v4.4.3 (disclaimer: I have not used NodeJS before).




回答2:


This is a solution which minimizes developer time by taking advantage of a built-in iterator. It should be reasonably quick for problem sizes for which the answer itself is not infeasibly large.

There is a one-to-one correspondence between partitions of a string and subsets of potential cutpoints. If the length of the string is n then there are n-1 places where you could cut the string. A straightforward way would be to iterate through such subsets, and for each such subset, slice the string in that way. Here is a Python approach which uses the standard modules itertools:

import itertools

def multiSlice(s,cutpoints):
    k = len(cutpoints)
    if k == 0:
        return [s]
    else:
        multislices = [s[:cutpoints[0]]]
        multislices.extend(s[cutpoints[i]:cutpoints[i+1]] for i in range(k-1))
        multislices.append(s[cutpoints[k-1]:])
        return multislices

def allPartitions(s):
    n = len(s)
    cuts = list(range(1,n))
    for k in range(n):
        for cutpoints in itertools.combinations(cuts,k):
            yield multiSlice(s,cutpoints)

For example:

>>> parts = allPartitions('World')
>>> for p in parts: print(p)

['World']
['W', 'orld']
['Wo', 'rld']
['Wor', 'ld']
['Worl', 'd']
['W', 'o', 'rld']
['W', 'or', 'ld']
['W', 'orl', 'd']
['Wo', 'r', 'ld']
['Wo', 'rl', 'd']
['Wor', 'l', 'd']
['W', 'o', 'r', 'ld']
['W', 'o', 'rl', 'd']
['W', 'or', 'l', 'd']
['Wo', 'r', 'l', 'd']
['W', 'o', 'r', 'l', 'd']

Note that this approach produces generates ['World'] as a partition of 'World'. This corresponds to slicing with an empty set of cut points. I regard that as a feature rather than a bug since the standard mathematical definition of partition allows for a partition of a set into one piece. If this in undesirable for your purposes, the fix is easy enough -- just iterate over the nonempty subsets of the cut points. In terms of the above code, this fix amounts to adding two characters to allPartitions: replace

for k in range(n):

by

for k in range(1,n):



回答3:


GeeksforGeeks has provided a well-explained solution to this problem:

For string abcd there will be 2^(n-1) i.e. 8 partitions.

(a)(b)(c)(d)
(a)(b)(cd)
(a)(bc)(d)
(a)(bcd)
(ab)(c)(d)
(ab)(cd)
(abc)(d)
(abcd)

The crux of the solution lies in the recursion to print all the permutations.
maintain two parameters – index of the next character to be processed and the output string so far. We start from index of next character to be processed, append substring formed by unprocessed string to the output string and recurse on remaining string until we process the whole string.

// Java program to find all combinations of Non-
// overlapping substrings formed from given
// string

class GFG 
{
    // find all combinations of non-overlapping
    // substrings formed by input string str
    static void findCombinations(String str, int index,
                                 String out)
    {
        if (index == str.length())
            System.out.println(out);

        for (int i = index; i < str.length(); i++)

            // append substring formed by str[index,
            // i] to output string
            findCombinations(str, i + 1, out +
                "(" + str.substring(index, i+1) + ")" );
    }

    // driver program
    public static void main (String[] args) 
    {
        // input string
        String str = "abcd";
        findCombinations(str, 0, "");
    }
}

Time Complexity is O(2^n)

Here's the link to the article: http://www.geeksforgeeks.org/print-ways-break-string-bracket-form/




回答4:


Something along the lines of the following (untested and likely buggy VB.NET sample)

Function FindAllGroups(s As String) As List(Of List(Of String))
    Dim ret As New List(Of List(Of String))
    Dim l As New List(Of String)
    l.Add(s) 'the whole string unbroken
    ret.Add(l) 'first option we return is the whole unbroken string by itself
    If s.Length > 1 Then
        Dim tmp = FindAllGroups(s.Substring(1)) 'find all the groups for the rest of the string after the first character
        For Each l2 in tmp
            l = l2.ToList 'Copy it
            l.Insert(s.SubString(0,1),0)'insert the first character from this string by itself before this combination for the rest of the string
            ret.Add(l)
        Next
        For Each l2 in tmp
            l = l2.ToList 'Copy it
            l(0)= s.SubString(0,1) & l(0) 'insert the first character from this string as part of the first element in the list
            ret.Add(l)
        Next
   End If
   Return ret
End Function

This basically works by saying that we can take 'abcd' and split it into

'a', 1st option for 'bcd' split
'a', 2nd option for 'bcd' split
...
+
1st option for 'bcd' split with the first element prepended with 'a'
2nd option for 'bcd' split with the first element prepended with 'a'
...

then to calculate 'bcd', we just repeat the process as above, only with

'b', 1st option for 'cd' split
'b', 2nd option for 'cd' split
...
+
1st option for 'cd' split with the first element prepended with 'b'
2nd option for 'cd' split with the first element prepended with 'b'
...

etc. repeated recursively.

However, this code isn't particularly efficient at runtime. One thing that you could do to speed it up significantly would be to add a Dictionary(Of String, List(Of List(Of String)) outside the function which you can store a cache of the results in and if the item exists in there, you return from there, if not, calculate it and add it. Lists also might not be the most efficient, and the ToList function might not be the quickest way of cloning. However, I've simplified it to make it easier to understand and also to save me time working it out!




回答5:


I just wanted to post a simple recursive solution to this problem for anyone stumbling on this question. Probably not the best way, but this was way simpler for me to understand and implement. If I am wrong, please correct me.

def party(s:str, P:list, res:list) -> None :
    """Recursively generates all partitions of a given string"""
    res.append(P+[s])
    for i in range(1,len(s)):
        party(s[i:],P+[s[:i]],res)

res = []
party("abcd",[],res)
print(res)
"""
[['abcd'], ['a', 'bcd'], ['a', 'b', 'cd'], ['a', 'b', 'c', 'd'], 
['a', 'bc', 'd'], ['ab', 'cd'], ['ab', 'c', 'd'], ['abc', 'd']]
"""

It works as follows: Given a string or a substring of it, we can split after each of its character creating two halves. Say: "abc" can be partitioned into ["a","bc"], ["ab","c"]

We save the first part in a intermediate partition P and recursively call party on the other half.

Because both halves together form a complete partition we save it to res. Example:

initially: s = "abc" is a valid partition, save it to res.

recr call: s = "bc", P = ["a"] , so P +[s]= ["a","bc"] is also valid, save it to res.

Proceed with splitting "bc". P = ["a","b"], s="c" so P + [s] is also valid. And so on..

recr call 3: s = "c", P = ["ab"], so P + [s] =["ab","c"] is also valid, save it to res

Working:

tests = ["abc","abcd","a"]
for t in tests:
    res = []
    party(t,[],res)
    print(f'{t} -> {res} \n')

"""Output
abc -> [['abc'], ['a', 'bc'], ['a', 'b', 'c'], ['ab', 'c']] 

abcd -> [['abcd'], ['a', 'bcd'], ['a', 'b', 'cd'], ['a', 'b', 'c', 'd'], 
['a', 'bc', 'd'], ['ab', 'cd'], ['ab', 'c', 'd'], ['abc', 'd']] 

a -> [['a']] 
"""


来源:https://stackoverflow.com/questions/37023774/all-ways-to-partition-a-string

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!