Find whether a tree is a subtree of another

一个人的身影 2020-12-28 11:57

There are two binary trees T1 and T2 which store character data, duplicates allowed.
How can I find whether T2 is a subtree of T1?
T1 has millions of nodes and

10 Answers
  •  萌比男神i
    2020-12-28 12:29

    I assume that your trees are immutable, so you never modify an existing subtree (you don't do set-car! in Scheme parlance); you only construct new trees from leaves or from existing trees.

    In that case I would advise keeping a hash code in every node (or subtree). In C parlance, declare the trees as

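    // headers used by the snippets in this answer
    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    // The fields are not const-qualified so that the constructors can
    // initialize them after malloc; treat a tree as read-only once built.
    struct tree_st {
      unsigned hash;
      bool isleaf;
      union {
        const char *leafstring; // when isleaf is true
        struct { // when isleaf is false
          const struct tree_st *left;
          const struct tree_st *right;
        };
      };
    };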
    

    Then compute the hash at construction time. When comparing nodes for equality, first compare their hashes; most of the time the hash codes will differ and you won't need to compare the contents.

    Here is a possible leaf construction function:

    struct tree_st *make_leaf (const char *string) {
       assert (string != NULL);
       struct tree_st *t = malloc(sizeof(struct tree_st));
       if (!t) { perror("malloc"); exit(EXIT_FAILURE); }
       t->hash = hash_of_string(string);
       t->isleaf = true;
       // the string is not copied, so it must outlive the leaf
       t->leafstring = string;
       return t;
    }
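
make_leaf relies on a hash_of_string function that the answer does not show. A minimal sketch, assuming a 32-bit unsigned and using the FNV-1a string hash (chosen here only for illustration; any reasonable string hash would do):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 32-bit FNV-1a over the bytes of a NUL-terminated
   string. Assumes unsigned is 32 bits wide, so the arithmetic wraps
   modulo 2^32. */
unsigned hash_of_string(const char *s) {
    assert(s != NULL);
    unsigned h = 2166136261u;        /* FNV offset basis */
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;      /* mix in one byte */
        h *= 16777619u;              /* multiply by the FNV prime */
    }
    return h;
}
```

Equal strings always produce equal hashes, which is the only property the equality test later in this answer depends on; the quality of the hash only affects how often the fast path is taken.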
    

    The function to get the hash code of a (possibly NULL) tree is

    unsigned tree_hash(const struct tree_st *t) {
      return (t==NULL)?0:t->hash;
    }
    

    The function to construct a node from two subtrees sleft and sright is

    struct tree_st *make_node (const struct tree_st *sleft,
                               const struct tree_st *sright) {
       struct tree_st *t = malloc(sizeof(struct tree_st));
       if (!t) { perror("malloc"); exit(EXIT_FAILURE); }
       /// some hashing composition, e.g.
       unsigned h = (tree_hash(sleft)*313) ^ (tree_hash(sright)*617);
       t->hash = h;
       t->isleaf = false; // malloc does not zero memory, so set this explicitly
       t->left = sleft;
       t->right = sright;
       return t;
    }
    

    The comparison function (for two trees tx and ty) takes advantage of the fact that if the hash codes differ, the trees differ:

    bool equal_tree (const struct tree_st *tx, const struct tree_st *ty) {
      if (tx == ty) return true;                        // same object (or both NULL)
      if (tree_hash(tx) != tree_hash(ty)) return false; // different hashes: different trees
      if (!tx || !ty) return false;
      if (tx->isleaf != ty->isleaf) return false;
      if (tx->isleaf) return strcmp(tx->leafstring, ty->leafstring) == 0;
      return equal_tree(tx->left, ty->left)
             && equal_tree(tx->right, ty->right);
    }

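    When the two trees differ, the tree_hash(tx) != tree_hash(ty) test will, with a reasonable hash, almost always detect it immediately, so you rarely have to recurse. The full recursive comparison only runs when the hashes match, that is, when the trees are equal or in the rare case of a collision.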

    Read about hash consing.

    Once you have such an efficient hash-based equal_tree function you could use the techniques mentioned in other answers (or in handbooks).
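
As an illustration of how equal_tree might be plugged into a subtree search, here is a minimal sketch. The is_subtree function is hypothetical (it is not part of the answer above), the string hash is an assumed FNV-1a, and the convention that an empty T2 is a subtree of any tree is a choice made for this example. The definitions from the answer are repeated, with non-const fields, so that the block compiles on its own as C11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct tree_st {
    unsigned hash;
    bool isleaf;
    union {
        const char *leafstring;            /* valid when isleaf is true */
        struct {                           /* valid when isleaf is false */
            const struct tree_st *left;
            const struct tree_st *right;
        };
    };
};

/* Assumed string hash (FNV-1a); the answer leaves hash_of_string unspecified. */
unsigned hash_of_string(const char *s) {
    unsigned h = 2166136261u;
    for (; *s != '\0'; ++s) { h ^= (unsigned char)*s; h *= 16777619u; }
    return h;
}

unsigned tree_hash(const struct tree_st *t) {
    return (t == NULL) ? 0 : t->hash;
}

struct tree_st *make_leaf(const char *string) {
    assert(string != NULL);
    struct tree_st *t = malloc(sizeof *t);
    if (!t) { perror("malloc"); exit(EXIT_FAILURE); }
    t->hash = hash_of_string(string);
    t->isleaf = true;
    t->leafstring = string;
    return t;
}

struct tree_st *make_node(const struct tree_st *sleft,
                          const struct tree_st *sright) {
    struct tree_st *t = malloc(sizeof *t);
    if (!t) { perror("malloc"); exit(EXIT_FAILURE); }
    t->hash = (tree_hash(sleft) * 313) ^ (tree_hash(sright) * 617);
    t->isleaf = false;
    t->left = sleft;
    t->right = sright;
    return t;
}

bool equal_tree(const struct tree_st *tx, const struct tree_st *ty) {
    if (tx == ty) return true;
    if (tree_hash(tx) != tree_hash(ty)) return false;
    if (!tx || !ty) return false;
    if (tx->isleaf != ty->isleaf) return false;
    if (tx->isleaf) return strcmp(tx->leafstring, ty->leafstring) == 0;
    return equal_tree(tx->left, ty->left) && equal_tree(tx->right, ty->right);
}

/* Hypothetical helper: true if some node of t1 roots a tree equal to t2.
   Visits every node of t1 once; at most nodes the hash test in equal_tree
   rejects the candidate without any recursion. */
bool is_subtree(const struct tree_st *t1, const struct tree_st *t2) {
    if (t2 == NULL) return true;           /* convention: empty tree matches */
    if (t1 == NULL) return false;
    if (equal_tree(t1, t2)) return true;
    if (t1->isleaf) return false;
    return is_subtree(t1->left, t2) || is_subtree(t1->right, t2);
}
```

The search is recursive, so for a very deep T1 an explicit stack may be preferable to avoid exhausting the call stack.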
