Number of subsets of {1,2,3,…,N} containing at least 3 consecutive elements

Question

Suppose we have a set like {1,2,3}. Then there is only one way to choose 3 consecutive numbers: the set {1,2,3} itself.

For the set {1,2,3,4} we have 3 ways: 123 234 1234

(technically these are unordered sets of numbers, but writing them consecutively helps)

f(5) ; {1,2,3,4,5} -> 8 ways: 123 1234 1235 12345 234 2345 345 1345
f(6) ; {1,2,3,4,5,6} -> 20 ways: ...
f(7) ; {1,2,3,4,5,6,7} -> 47 ways: ...

So for a given N, I can get the answer by brute force: enumerate every subset and count those that contain 3 or more consecutive numbers.

What I am trying to find is a pattern, or a technique, that gives the number of such subsets for a given N.

The problem generalizes further: count the subsets that contain m consecutive numbers within a set of size N.
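The brute-force count described above could be sketched as follows. This is only an illustration: the names `has_run` and `brute_force` are made up here, and the parameter `m` is the run length from the generalized problem (3 in the examples).

```python
from itertools import combinations

def has_run(subset, m):
    """True if the sorted tuple `subset` contains m consecutive integers."""
    run = 0
    prev = None
    for x in subset:
        # Extend the current run if x follows prev, otherwise start a new run.
        run = run + 1 if prev is not None and x == prev + 1 else 1
        if run >= m:
            return True
        prev = x
    return False

def brute_force(n, m=3):
    """Count subsets of {1..n} that contain at least m consecutive numbers."""
    items = range(1, n + 1)
    return sum(
        has_run(subset, m)
        for size in range(m, n + 1)          # smaller subsets cannot qualify
        for subset in combinations(items, size)
    )

print([brute_force(n) for n in range(3, 8)])
```

It examines all subsets, so it is only practical for small N.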

Answer 1:

There is a bijection between this problem and "the number of N-digit binary numbers with at least three consecutive 1s somewhere" (the bijection being that digit i is 1 if element i is included in the subset, and 0 if it is excluded).
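Seen as binary strings, the count can be built up one digit at a time. The sketch below is one possible way to do that, not something taken from the linked results: it tracks how many prefixes end in a run of j ones without having reached m yet, plus how many prefixes already contain a run of m ones. The function name is hypothetical.

```python
def count_strings_with_run(n, m=3):
    """Count n-digit binary strings containing at least m consecutive 1s."""
    # run[j]: prefixes with no run of m ones so far whose trailing run of 1s has length j
    run = [1] + [0] * (m - 1)
    done = 0  # prefixes that already contain a run of m ones
    for _ in range(n):
        new_run = [0] * m
        new_run[0] = sum(run)            # appending 0 resets the trailing run
        for j in range(m - 1):
            new_run[j + 1] = run[j]      # appending 1 lengthens the trailing run
        # A finished prefix stays finished for either digit; a trailing run of
        # m - 1 ones becomes a run of m when a 1 is appended.
        done = 2 * done + run[m - 1]
        run = new_run
    return done

print([count_strings_with_run(n) for n in range(3, 8)])
```

This runs in O(N * m) time instead of enumerating all 2^N strings.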

This is a known problem, which should be enough information to search for a result. If you search for "number of n-digit binary strings with m consecutive 1s", the second hit is "Finding all n digit binary numbers with r adjacent digits as 1".

Alternatively, you can look the sequence up at http://oeis.org/search?q=0%2C0%2C1%2C3%2C8%2C20%2C47 (based on the brute-forcing you did for the first few terms). This gives an explicit formula of 2^n - tribonacci(n+3); see here for an explicit formula for tribonacci numbers. The entry also gives a recurrence relation. The analogy given is "probability (out of 2^n) of getting at least 1 run of 3 heads within n flips of a fair coin".

I can only assume that the answer to the general problem is 2^n - F_m(n+m), where F_m(k) is the k-th m-step Fibonacci number (edit: that does seem to be the case).
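A minimal sketch of that formula, under one assumption about indexing: the m-step Fibonacci sequence is taken to start with m - 1 zeros followed by a 1 (so for m = 3 it is the tribonacci sequence 0, 0, 1, 1, 2, 4, 7, ...). The function names are made up for illustration.

```python
def m_step_fibonacci(m, k):
    """k-th m-step Fibonacci number, with F(0..m-2) = 0 and F(m-1) = 1 (assumed indexing)."""
    seq = [0] * (m - 1) + [1]
    while len(seq) <= k:
        seq.append(sum(seq[-m:]))   # each term is the sum of the previous m terms
    return seq[k]

def count_with_run(n, m=3):
    """Subsets of {1..n} containing at least m consecutive numbers: 2^n - F_m(n+m)."""
    return 2 ** n - m_step_fibonacci(m, n + m)

print([count_with_run(n) for n in range(3, 8)])
```

For m = 3 this reproduces the terms quoted in the question for N = 3 to 7.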

Answer 2:

This sounds like homework to me, so I'll just get you started. One approach is to think of the Lowest and Highest members of the run, L and H. If the set size is N and your minimum run length is M, then for each possible position P of L, you can work out how many positions of H there are....

Answer 3:

With a bit of Python code, we can investigate this:

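y = set()  # collects every contiguous run found, stored as a tuple

def cons(li, num):
    """Add every contiguous slice of li with at least num elements to y."""
    if len(li) < num:
        return
    # li itself is a run of length >= num
    y.add(tuple(li))
    if len(li) > num:
        # shorter runs come from dropping the first or the last element
        cons(li[1:], num)
        cons(li[:-1], num)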

This solution will be quite slow (it's exponential in complexity, actually), but try it out for a few small list sizes and I think you should be able to pick up the pattern.
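If the recursion is too slow, the same contiguous slices can be listed with two loops over the start and end positions. This is a sketch with a hypothetical name, `contiguous_runs`, and it returns the set instead of filling a global.

```python
def contiguous_runs(li, num):
    """Return every contiguous slice of li that has at least num elements."""
    runs = set()
    for start in range(len(li)):
        # end is exclusive, so start + num is the shortest allowed slice
        for end in range(start + num, len(li) + 1):
            runs.add(tuple(li[start:end]))
    return runs

for n in range(3, 8):
    print(n, len(contiguous_runs(list(range(1, n + 1)), 3)))
```

Note that this lists contiguous runs only; a subset such as 1235 from the question is not a contiguous slice, so it does not appear here.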

Answer 4:

Not sure if you mean consecutive or not. If not, then for {1, 2, 3, 4} there are 4 possibilities: {1, 2, 3} {2, 3, 4} {1, 3, 4} {1, 2, 3, 4}

I think you can calculate the solution with N!/3! where N! = N*(N-1)*(N-2)*...*1.

Answer 5:

Quick answer:

Sequences(n) = (n-1)*(n-2) / 2

Long answer:

You can do this by induction. First, I'm going to re-state the problem, because your problem statement isn't clear enough.

Rule 1: For all sets of consecutive numbers 1..n where n is 2 or more
Rule 2: Count the subsets S(n) of consecutive numbers m..m+q where q is 2 or more

S(n=3)

By inspection we find only one - 123

S(n=4)

By inspection we find 3: 123, 234 and 1234

Note that S(4) contains S(3), plus two new ones... both include the new digit 4... hmm.

S(n=5)

By inspection we find ... S(n=4) as well as 345 2345 and 12345. That's 3+3=6 total.

I think there's a pattern forming here. Let's define a new function T.

Rule 3: S(n) = S(n-1) + T(n) ... for some T.

We know that every new sequence in S(n) contains the digit n, and you should have spotted by now that S(n) also contains all sequences of length 3 to n that end in the digit n. We know they cannot be in S(n-1), so they must be in T(n).

Rule 4: T(n) contains all sequences ending in n that are of length 3 to n.

How many sequences are in S(n)?

Let's look back at S(3) S(4) and S(5), and incorporate T(n):

S(3) = S(3)
S(4) = S(3) + T(4)
S(5) = S(4) + T(5) = S(3) + T(4) + T(5)

Let's generalise:

S(n) = S(3) + T(4) + T(5) + ... + T(n), that is, S(3) plus the sum of T(f) for all f from 4 to n.

So how many are in a given T?

Look back at Rule 4 - how many sequences does it describe?

For T(4) it describes all sequences 3 and longer ending in 4. (that's 1234 234 = 2)

For T(5) it describes all sequences 3 and longer ending in 5. (that's 12345 2345 345 = 3)

n  T(n)  Examples
4  2     1234 234
5  3     12345 2345 345
6  4     123456 23456 3456 456

Looks awfully like T(n) is simply n - 2.

S(6) = T(6) + T(5) + T(4) + S(3)
  10 = 4    + 3    + 2    + 1

And
S(7) = 15 = 5 + 4 + 3 + 2 + 1
S(8) = 21 = 6 + 5 + 4 + 3 + 2 + 1

Turning this into a formula

What's 2 * S(8)?

42 = 6 + 5 + 4 + 3 + 2 + 1 + 1 + 2 + 3 + 4 + 5 + 6

Add each pair of biggest and smallest numbers:

42 = 7 + 7 + 7 + 7 + 7 + 7

42 = 7 * 6

But that's 2 * S(8), so

S(8) = 42/2 = 21 = 7 * 6 / 2

This generalizes:

S(n) = (n-1)*(n-2) / 2
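The pairing argument can also be written as a single sum. This restates the steps above, assuming T(f) = f - 2 as read off the table and S(3) = 1:

```latex
S(n) = S(3) + \sum_{f=4}^{n} T(f)
     = 1 + \sum_{f=4}^{n} (f - 2)
     = \sum_{k=1}^{n-2} k
     = \frac{(n-1)(n-2)}{2}
```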

Let's check this works:

S(3) = 2*1/2 = 1
S(4) = 3*2/2 = 3
S(5) = 4*3/2 = 6
S(6) = 5*4/2 = 10

I'm satisfied.
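For a wider check than four values, the recurrence, the closed form, and a direct count of (lowest, highest) pairs can be compared in a few lines. This is a sketch; the function names are invented here, and it counts contiguous runs as defined in Rules 1 to 4.

```python
def runs_by_recurrence(n):
    """S(n) from S(3) = 1 and S(k) = S(k-1) + T(k), with T(k) = k - 2."""
    if n < 3:
        return 0
    s = 1
    for k in range(4, n + 1):
        s += k - 2
    return s

def runs_closed_form(n):
    """S(n) = (n-1)*(n-2) / 2."""
    return (n - 1) * (n - 2) // 2

def runs_by_enumeration(n):
    """Count runs low..high inside 1..n that have at least 3 members."""
    return sum(1 for low in range(1, n + 1) for high in range(low + 2, n + 1))

for n in range(3, 9):
    print(n, runs_by_recurrence(n), runs_closed_form(n), runs_by_enumeration(n))
```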

Source: https://stackoverflow.com/questions/12311918/number-of-subsets-of-1-2-3-n-containing-at-least-3-consecutive-elements

Tags

math

combinatorics