Maximum and minimum number of edges in a suffix tree

天涯浪子 提交于 2019-12-24 03:51:15

问题


What are the maximum and minimum number of edges in a suffix tree? I know the maximum is 2m-1, but I don't understand why that is so.


回答1:


First, about the maximum number of edges:

It's quite easy to understand if you think of the edges as coming in two flavours: Edges that lead to a leaf node, and edges that lead to an inner node. In the following, I'll assume that the string the suffix tree is constructed for is N characters in length.

  1. About the edges leading to leaves. There must be exactly one leaf for every suffix, and every leaf has only one inbound edge (and no outbound edges). So there must be N edges leading to leaves.

  2. About the edges leading to inner nodes. Like with leaf nodes, inner nodes, too, have only one inbound edge each. Therefore, to determine how many edges leading to inner nodes there can possibly be, it suffices to determine how many inner nodes there can be. So, what is the maximum possible number of inner nodes?

    For this it is important to see that inner nodes are only inserted into the suffix tree at branching points, i.e. the number of outbound edges of an inner node is always at least 2 (if there was only one outbound edge, the inner node wouldn't have been constructed in the first place). But every outbound edge must eventually lead to a leaf node (possibly after going through further inner nodes). In other words, every inner node inserted into the tree increases the total number of leaf nodes by at least 1. Furthermore, even a tree with no inner nodes whatsoever must have at least one edge going out of the root node (unless the tree is empty). Hence, in a non-empty tree, the total number of leaves L must be

    L >= I + 1
    

    where I is the number of inner nodes. Conversely, the number of inner nodes is

    I <= L - 1 = N - 1
    

    Returning to the original question, the number of edges leading to inner nodes is, as we said, the same as the number of inner nodes, hence it is also bound by N - 1.

We conclude that the total number of edges is the number of edges leading to leaves (N) plus the number of edges leading to inner nodes (<=N-1), hence its maximum is bound by

N + (N-1) = 2N - 1

quod erat demonstrandum.


Regarding the minimum number of edges: This follows the same principle, i.e. we calculate the number of edges leading to leaves and the number of edges leading to internal nodes, and then add them together.

The number of nodes leading to leaves is always N, i.e. N is the maximum as well as the minimum possible.

The number of nodes leading to internal nodes, however, may be zero. E.g. when the input string has no repeated elements, such as abcdef, there are exactly N edges, each from the root straight to a leaf. No branching points, no internal nodes. Hence the minimum number of edges leading to internal nodes is 0.

In conclusion, the minimum number of edges is N + 0 = N.



来源:https://stackoverflow.com/questions/12865639/maximum-and-minimum-number-of-edges-in-a-suffix-tree

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!