How to reuse code for Binary Search Tree, Red-Black Tree, and AVL Tree?

Question

I'm implementing a BST (Binary Search Tree), an RBT (Red-Black Tree), and an AVLT (AVL Tree). I wrote the BST as follows:

use std::cell::RefCell;
use std::rc::Rc;
use std::cmp::max;

type RcRefBaseNode<T> = Rc<RefCell<BaseNode<T>>>;
type BaseNodeLink<T> = Option<RcRefBaseNode<T>>;

struct BaseNode<T: Ord> {
    pub key: T,
    left: BaseNodeLink<T>,
    right: BaseNodeLink<T>,
}

struct BaseTree<T: Ord> {root: BaseNodeLink<T>}
impl<T: Ord> BaseTree<T> {
    fn new(data: T) -> Self {
        Self{
            root: Some(Rc::new(RefCell::new(BaseNode{
                key: data,
                left: None,
                right: None
            })))
        }
    }

    fn is_empty(&self) -> bool {
        // The tree is empty exactly when there is no root node.
        self.root.is_none()
    }

    fn height(&self) -> usize {
        match &self.root {
            None => 0,
            Some(node) => node.borrow().height(),
        }
    }

    fn print_in_order(&self) {
        unimplemented!()
    }

    fn count_leaves(&self) -> usize {
        unimplemented!()
    }
}

impl <T: Ord> BaseNode<T> {
    fn height(&self) -> usize {
        let left_height: usize;
        match &self.left {
            None => left_height = 0,
            Some(node) => left_height = node.borrow().height(),
        };

        let right_height: usize;
        match &self.right {
            None => right_height = 0,
            Some(node) => right_height = node.borrow().height(),
        };
        max(left_height, right_height) + 1
    }

    fn print_in_order(&self) {
        unimplemented!()
    }

    fn count_leaves(&self) -> usize {
        unimplemented!()
    }
    // other functions for querying
}

There will be more methods for querying information about the tree, and I realized that the query functions for the RBT and AVLT will be the same as those of the BST. The structs for the RBT and AVLT are as follows:

// ============== Red-Black Tree ============== //
type RcRefRBTNode<T> = Rc<RefCell<RedBlackTreeNode<T>>>;
type RBNodeLink<T> = Option<RcRefRBTNode<T>>;

#[derive(Clone, Debug, PartialEq)]
enum NodeColor {
    Red,
    Black,
}

struct RedBlackTreeNode<T> {
    pub key: T,
    pub color: NodeColor,
    parent: RBNodeLink<T>,
    left: RBNodeLink<T>,
    right: RBNodeLink<T>,
}

struct RedBlackTree<T: Ord> {root: RBNodeLink<T>}
// same implementation for query functions like is_empty, height, count_leaves, etc.

and

// ============== AVL Tree ============== //
type RcRefAVLTNode<T> = Rc<RefCell<AVLTreeNode<T>>>;
type AVLNodeLink<T> = Option<RcRefAVLTNode<T>>;

struct AVLTreeNode<T> {
    pub key: T,
    pub height: usize,
    parent: AVLNodeLink<T>,
    left: AVLNodeLink<T>,
    right: AVLNodeLink<T>,
}

struct AVLTree<T: Ord> {root: AVLNodeLink<T>}
// same implementation for query functions like is_empty, height, count_leaves, etc.

I understand that they will need different code for insert and delete, but how can I reuse the code for the query functions?

Answer 1:

You can extract the notion of a queryable node into a trait, then write height using that, then implement that trait for each of your node types.

For example, the height function only uses the left and right fields. Traits can't access fields, only methods, so we need some accessor methods. Your other query functions will probably require adding more methods to this trait.

trait QueryableTreeNode {
    fn left(&self) -> &Option<Rc<RefCell<Self>>>;
    fn right(&self) -> &Option<Rc<RefCell<Self>>>;
}

Now the height function can be rewritten using this trait:

fn height<QTN: QueryableTreeNode>(qn:&QTN) -> usize {
    let left_height = match &qn.left() {
        None => 0,
        Some(node) => height(&*node.borrow()),
    };

    let right_height = match &qn.right() {
        None => 0,
        Some(node) => height(&*node.borrow()),
    };
    max(left_height, right_height) + 1
}

Next we make our node types implement the trait:

impl<T: Ord> QueryableTreeNode for BaseNode<T> {
    fn left(&self) -> &BaseNodeLink<T> { &self.left }
    fn right(&self) -> &BaseNodeLink<T> { &self.right }
}

impl<T: Ord> QueryableTreeNode for AVLTreeNode<T> {
    fn left(&self) -> &AVLNodeLink<T> { &self.left }
    fn right(&self) -> &AVLNodeLink<T> { &self.right }
}

Finally, we implement the function in the tree types by handing off to our new function on the nodes:

impl<T: Ord> BaseTree<T> {
    // The `T: Ord` bound is required because `BaseTree` itself declares it.
    fn height(&self) -> usize {
        match &self.root {
            None => 0,
            Some(node) => height(&*node.borrow()),
        }
    }
}

You can find a compiling version of this in the playground.
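As a fuller illustration of the same idea, here is a minimal sketch in which two node types share both `height` and `count_leaves` through one trait. `PlainNode`, `AvlNode`, the `Link` alias, and the `plain`/`avl` constructor helpers are hypothetical stand-ins, not names from the question. The free functions also take a link rather than a node, which is a small deviation from the code above that removes the special case for an empty tree.

```rust
use std::cell::RefCell;
use std::cmp::max;
use std::rc::Rc;

// Hypothetical alias: an optional, shared, mutable child pointer.
type Link<N> = Option<Rc<RefCell<N>>>;

/// Accessors that every queryable node type must provide.
trait QueryableTreeNode: Sized {
    fn left(&self) -> &Link<Self>;
    fn right(&self) -> &Link<Self>;
}

/// Height of an optional subtree; an empty subtree has height 0.
fn height<N: QueryableTreeNode>(link: &Link<N>) -> usize {
    match link {
        None => 0,
        Some(node) => {
            let node = node.borrow();
            1 + max(height(node.left()), height(node.right()))
        }
    }
}

/// Number of nodes that have no children.
fn count_leaves<N: QueryableTreeNode>(link: &Link<N>) -> usize {
    match link {
        None => 0,
        Some(node) => {
            let node = node.borrow();
            if node.left().is_none() && node.right().is_none() {
                1
            } else {
                count_leaves(node.left()) + count_leaves(node.right())
            }
        }
    }
}

// Stand-in for the question's BaseNode.
#[allow(dead_code)]
struct PlainNode { key: i32, left: Link<PlainNode>, right: Link<PlainNode> }

// Stand-in for the question's AVLTreeNode (it carries extra data).
#[allow(dead_code)]
struct AvlNode { key: i32, height: usize, left: Link<AvlNode>, right: Link<AvlNode> }

impl QueryableTreeNode for PlainNode {
    fn left(&self) -> &Link<Self> { &self.left }
    fn right(&self) -> &Link<Self> { &self.right }
}

impl QueryableTreeNode for AvlNode {
    fn left(&self) -> &Link<Self> { &self.left }
    fn right(&self) -> &Link<Self> { &self.right }
}

// Hypothetical helpers that build a subtree by hand (no ordering checks).
fn plain(key: i32, left: Link<PlainNode>, right: Link<PlainNode>) -> Link<PlainNode> {
    Some(Rc::new(RefCell::new(PlainNode { key, left, right })))
}

fn avl(key: i32, left: Link<AvlNode>, right: Link<AvlNode>) -> Link<AvlNode> {
    let h = 1 + max(height(&left), height(&right));
    Some(Rc::new(RefCell::new(AvlNode { key, height: h, left, right })))
}
```

With this in place, `height(&root)` and `count_leaves(&root)` work for either node type, and supporting the red-black node should only take one more `impl` of the two accessors.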

Answer 2:

I'll try to answer a more general question here about code reuse in Rust. If your primary goal is to write the least amount of code possible, then Rust might not be the tool you want.

That said, there are three major methods you should consider: monomorphization, virtualization, and enumeration.

Per the Rust book:

Monomorphization is the process of turning generic code into specific code by filling in the concrete types that are used when compiled.

You'll be doing a little copy/paste with monomorphization. Then you have virtualization. In Rust this usually means using trait objects.
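To make the contrast between the first two approaches concrete, here is a rough sketch. The `TreeQuery` trait, the `BstTree` and `AvlTree` structs (backed by a plain `Vec` purely for brevity), and the three functions are all hypothetical names chosen for illustration. The generic function is compiled once per concrete type, while the `dyn` versions are compiled once and dispatch through a vtable at runtime.

```rust
/// Hypothetical read-only interface shared by every tree type.
trait TreeQuery {
    fn len(&self) -> usize;
    fn is_empty(&self) -> bool {
        self.len() == 0
    }
}

// Stand-in tree types; a real tree would store nodes, not a Vec.
struct BstTree { keys: Vec<i32> }
struct AvlTree { keys: Vec<i32> }

impl TreeQuery for BstTree {
    fn len(&self) -> usize { self.keys.len() }
}

impl TreeQuery for AvlTree {
    fn len(&self) -> usize { self.keys.len() }
}

// Monomorphization: the compiler emits one copy of this function
// for each concrete `T` it is called with (static dispatch).
fn describe_static<T: TreeQuery>(tree: &T) -> String {
    format!("{} keys", tree.len())
}

// Virtualization: a single copy exists, and `len` is looked up
// through the trait object's vtable at runtime (dynamic dispatch).
fn describe_dyn(tree: &dyn TreeQuery) -> String {
    format!("{} keys", tree.len())
}

// Trait objects also allow different tree types in one collection.
fn total_len(trees: &[Box<dyn TreeQuery>]) -> usize {
    trees.iter().map(|t| t.len()).sum()
}
```

The source text is the same either way; what differs is whether the choice of tree type is fixed at compile time or deferred to runtime.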

Finally, we have enumeration. I like enumeration for certain tasks. Using an enum you can specify some top-level type like Tree and then enumerate the kinds of trees you wish to build.

Here's an example:

struct AVLTree;
struct BSTTree;

enum Tree {
    AVL(AVLTree),
    BST(BSTTree),
}

fn traverse(tree: Tree) {
    match tree {
        Tree::AVL(avl) => traverse_avl(avl),
        Tree::BST(bst) => traverse_bst(bst),
    }
}

fn traverse_avl(tree: AVLTree) {}
fn traverse_bst(tree: BSTTree) {}

Going off your comments and concerns with having to copy/paste code I can tell that the above solution isn't exactly what you are looking for.

It might be possible to do what you're trying to do using virtualization, but it'll cost you. Here is a resource that describes a plethora of methods of code reuse in Rust (along with a list of pros/cons).

Now I'll follow up on the more general topic of code reuse. Writing code is about making choices and making tradeoffs. Reusing code is about reusing resources: time, energy, logic, etc.

While "don't repeat yourself" (DRY) is generally a good rule to follow, there are times when it's warranted and times when, well... you get the idea. Reusing code and coding in a DRY way can also be a sign of code quality.

That's it. DRY code can be a sign of code quality, that's it. Note that I didn't say good or bad quality. Just quality. Sometimes reuse can be a good thing, sometimes it's a bad thing. Sometimes it results in maintainable code. Sometimes it results in code that is hard to maintain.

Sometimes the code that's easiest to maintain is code that can be easily copied, pasted, and/or deleted.

Answer 3:

One common pattern in generic programming, when you have several types that are partially shared, is to put the shared part into a generic type and the unique parts in its type parameters (which may be required to satisfy certain traits).

You have three structures: BaseTree, AVLTree and RedBlackTree. They differ in how the tree is balanced. "Balancing" is a behavior, so let's make a trait that contains the code that balances a particular kind of tree:

trait Balance {
    type Data; // extra data that each node must store to balance the tree
    fn insert_balanced<T>(&self, root: &mut Link<T, Self>, value: T);
}

(In this example I'll just consider insert_balanced; in your real code you would probably want to add remove_balanced, plus perhaps other balanceable actions, like retain, union, intersection, split, etc. Whatever is relevant to the API you want to provide a shared interface over.)

Here's what our Node and BaseTree types look like:

struct Node<T, B: ?Sized + Balance> {
    value: T,
    balance_data: B::Data,
    left: Link<T, B>,
    right: Link<T, B>,
}

type Link<T, B> = Option<Box<Node<T, B>>>;

struct BaseTree<T, B: Balance> {
    root: Link<T, B>,
    // Putting the balancer as an actual field of the struct allows you to defer
    // the choice of balancer to runtime, like in `BaseTree<T, Box<dyn Balance>>`
    balancer: B,
}

Now you can write algorithms that are common to all binary search trees on BaseTree<_, _>, and for the algorithms that require special logic, defer to the balancer:

impl<T, B: Balance> BaseTree<T, B> {
    fn search(&self, _value: &T) -> Option<&T> {
        // Here write code to traverse the binary search tree ignoring the balancing part
        todo!()
    }

    fn insert(&mut self, value: T) {
        // Here defer to the implementation of `insert_balanced` for `B`
        self.balancer.insert_balanced(&mut self.root, value)
    }
}

Implementing a specific kind of self-balancing tree becomes a matter of creating a type that implements Balance and parameterizing BaseTree with it. The easiest one is a plain binary search tree, which contains no balancing code:

struct NoBalancer {}

impl Balance for NoBalancer {
    type Data = (); // unbalanced trees require no extra data
    fn insert_balanced<T>(&self, _root: &mut Link<T, Self>, _value: T) {
        todo!() // code to just insert with no balancing at all
    }
}

type BinarySearchTree<T> = BaseTree<T, NoBalancer>;

(Playground with stubs for all three kinds of trees.)

This resembles a statically dispatched version of the strategy pattern. A type that implements Balance is a balancing strategy. In this case the pattern is used not primarily to allow runtime dispatch on the strategy (although you could do that), but to separate code that balances and code that does not care about balancing so that the latter can be reused.
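To show how the pieces might fit together once the `todo!()` stubs are filled in, here is one possible sketch for the unbalanced case. It assumes a `T: Ord` bound on `insert_balanced` and on the `BaseTree` impl (the snippets above leave `T` unconstrained), adds a hypothetical `new` constructor, and chooses to ignore duplicate values; none of these choices come from the answer itself.

```rust
use std::cmp::Ordering;

trait Balance {
    type Data; // extra per-node data needed for balancing
    // Assumption: `T: Ord` is added so that values can be compared.
    fn insert_balanced<T: Ord>(&self, root: &mut Link<T, Self>, value: T);
}

#[allow(dead_code)]
struct Node<T, B: ?Sized + Balance> {
    value: T,
    balance_data: B::Data,
    left: Link<T, B>,
    right: Link<T, B>,
}

type Link<T, B> = Option<Box<Node<T, B>>>;

struct BaseTree<T, B: Balance> {
    root: Link<T, B>,
    balancer: B,
}

impl<T: Ord, B: Balance> BaseTree<T, B> {
    // Hypothetical constructor for an empty tree.
    fn new(balancer: B) -> Self {
        BaseTree { root: None, balancer }
    }

    // Shared by every kind of tree: plain BST lookup, no balancing logic.
    fn search(&self, value: &T) -> Option<&T> {
        let mut current = &self.root;
        while let Some(node) = current {
            match value.cmp(&node.value) {
                Ordering::Less => current = &node.left,
                Ordering::Greater => current = &node.right,
                Ordering::Equal => return Some(&node.value),
            }
        }
        None
    }

    // Tree-specific: defer to the balancing strategy.
    fn insert(&mut self, value: T) {
        self.balancer.insert_balanced(&mut self.root, value)
    }
}

struct NoBalancer;

impl Balance for NoBalancer {
    type Data = (); // unbalanced trees require no extra data

    fn insert_balanced<T: Ord>(&self, root: &mut Link<T, Self>, value: T) {
        // Walk down to the empty link where the value belongs.
        let mut current = root;
        while let Some(node) = current {
            current = match value.cmp(&node.value) {
                Ordering::Less => &mut node.left,
                Ordering::Greater => &mut node.right,
                Ordering::Equal => return, // assumption: duplicates are ignored
            };
        }
        *current = Some(Box::new(Node {
            value,
            balance_data: (),
            left: None,
            right: None,
        }));
    }
}

type BinarySearchTree<T> = BaseTree<T, NoBalancer>;
```

An AVL or red-black balancer would follow the same shape: set `Data` to a height or a color, and have `insert_balanced` perform the rotations, while `search` stays untouched.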

Source: https://stackoverflow.com/questions/64743600/how-to-reuse-codes-for-binary-search-tree-red-black-tree-and-avl-tree

Tags

rust