Safe non-trivial data dependencies/custom references?

Question

One of the central features of Rust is the compile-time enforced safety of references, which is achieved through ownership mechanics and explicit lifetimes. Is it possible to implement 'custom' references that would benefit from the same?

Consider the following example. We have an object that represents a graph. Assume that we can traverse the graph by referencing its edges, but that these references are implemented as custom indices rather than pointers to some memory. Such an index could simply be an offset into an array (or three), but it could also be a struct that combines some flags etc.

Besides traversing the graph, we can also modify it, which means that references to its internal state (edges) get invalidated. Ideally, we would want the compiler to catch any of these invalid references. Can we do this in Rust? E.g.:

// get a reference to an edge
let edge = graph.get_random_edge();
// the next statement yields the ownership of the edge reference
// back to the graph, which can invalidate it
edge.split();
edge.next(); // this will be a compile-time error as the edge is gone!

// another example
let edge1 = graph.get_random_edge();
let edge2 = graph.get_random_edge();
// this will be a compile-time error because the potentially invalid
// edge2 reference is still owned by the code and has not been
// yielded to the graph
edge1.split();

P.S. Sorry for the non-informative title, I was not sure how to phrase it...

Answer 1:

Yes

It is perfectly possible to leverage ownership and borrow-checking to build your own safety checks, and this is actually a very exciting area of exploration that is open to us.

I'd like to start with existing cool things:

Session Types are about encoding state machines in the type system:
- The "state" is encoded as a type
- The "transition" is encoded as a method consuming one value and producing another of a possibly different type
- As a result: (1) transitions are checked at compile time and (2) it's impossible to use an old state

There are tricks to use borrowing to forge a guaranteed valid index for a particular collection (related to branding):
- The index borrows the collection, guaranteeing the collection cannot be modified
- The index is forged with an invariant lifetime, which ties it to this instance of the collection, and no other
- As a result: the index can be used only with this collection, and without bounds checking
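The state-machine idea from the first list can be sketched in a few lines. This is only a typestate-style illustration, not a full session-types library; the `Closed` and `Open` types and the `demo_session` function are hypothetical names invented for the example.

```rust
// Hypothetical connection with two states, each encoded as its own type.
struct Closed;
struct Open {
    bytes_sent: usize,
}

impl Closed {
    // Transition: consumes the `Closed` state and produces an `Open` one.
    fn open(self) -> Open {
        Open { bytes_sent: 0 }
    }
}

impl Open {
    // Transition that stays in the same state but still consumes the old value.
    fn send(self, payload: &[u8]) -> Open {
        Open { bytes_sent: self.bytes_sent + payload.len() }
    }

    // Transition back to `Closed`, handing out the final byte count.
    fn close(self) -> (Closed, usize) {
        (Closed, self.bytes_sent)
    }
}

fn demo_session() -> usize {
    let conn = Closed;
    let conn = conn.open();
    let conn = conn.send(b"hello");
    let (_closed, total) = conn.close();
    // conn.send(b"again"); // error[E0382]: use of moved value: `conn`
    total
}
```

Because every transition takes `self` by value, the old state is gone after the call and `send` simply does not exist on `Closed`.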

Let's get to your examples:

// get a reference to an edge
let edge = graph.get_random_edge();
// the next statement yields the ownership of the edge reference
// back to the graph, which can invalidate it
edge.split();
edge.next(); // this will be a compile-time error as the edge is gone!

This is actually trivial.

In Rust you can define a method to take ownership of its receiver:

impl Edge {
   fn split(self) { ... }
         // ^~~~ Look, no "&"
}

Once the value is consumed, it cannot be used any longer, and therefore the call to next is invalid.
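A minimal sketch of that consumption, assuming a bare index-based `Edge` (the `index` field, the return type of `split` and the `demo_split` function are made up for illustration):

```rust
// Hypothetical edge handle that is just an index.
struct Edge {
    index: usize,
}

impl Edge {
    // Borrows the edge, so the caller keeps it.
    fn next(&self) -> Edge {
        Edge { index: self.index + 1 }
    }

    // Takes the edge by value: the caller's handle is gone afterwards.
    fn split(self) -> (Edge, Edge) {
        (Edge { index: self.index }, Edge { index: self.index + 1 })
    }
}

fn demo_split() -> (usize, usize) {
    let edge = Edge { index: 3 };
    let _neighbour = edge.next(); // fine, `edge` is only borrowed
    let (first, second) = edge.split(); // `edge` is moved here
    // edge.next(); // error[E0382]: borrow of moved value: `edge`
    (first.index, second.index)
}
```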

I suppose that you would want Edge to keep a reference to the graph, to prevent the graph from being modified whilst you have an outstanding edge:

struct Edge<'a> {
    graph: &'a Graph,  // nobody modifies the graph while I live!
}

will do the trick.
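As a rough sketch of how that shared borrow freezes the graph, assuming a toy `Graph` that stores edges as pairs of node ids (the `edges` field, `edge`, `add_edge`, `endpoints` and `demo_shared_borrow` are all hypothetical names):

```rust
// Toy graph: each edge is a (from, to) pair of node ids.
struct Graph {
    edges: Vec<(usize, usize)>,
}

// The edge keeps a shared borrow of the graph for as long as it lives.
struct Edge<'a> {
    graph: &'a Graph,
    index: usize,
}

impl Graph {
    fn edge(&self, index: usize) -> Edge<'_> {
        Edge { graph: self, index }
    }

    fn add_edge(&mut self, from: usize, to: usize) {
        self.edges.push((from, to));
    }
}

impl<'a> Edge<'a> {
    fn endpoints(&self) -> (usize, usize) {
        self.graph.edges[self.index]
    }
}

fn demo_shared_borrow() -> ((usize, usize), usize) {
    let mut graph = Graph { edges: vec![(0, 1)] };
    let edge = graph.edge(0);
    // graph.add_edge(1, 2); // error[E0502] here, because `edge` is used below
    let ends = edge.endpoints();
    // `edge` is not used past this point, so the graph can be modified again.
    graph.add_edge(1, 2);
    (ends, graph.edges.len())
}
```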

Moving on:

// another example
let edge1 = graph.get_random_edge();
let edge2 = graph.get_random_edge();
// this will be a compile-time error because the potentially invalid
// edge2 reference is still owned by the code and has not been
// yielded to the graph
edge1.split();

This is not possible, as is.

To enforce order, the values must be linked together, and here edge1 and edge2 are not.

A simple solution is to require that edge1 act as a mandatory proxy for the graph:

struct Edge<'a> {
    graph: &'a mut Graph,  // MY PRECIOUS!
                           // You'll only get that graph over my dead body!
}

Then, we implement a getter, to get access to the graph temporarily:

impl<'a> Edge<'a> {
    // Reborrow the graph for as long as the edge itself is mutably borrowed.
    fn get_graph<'me>(&'me mut self) -> &'me mut Graph {
        &mut *self.graph
    }
}

Then use that result (named graph2 for convenience) to obtain edge2.

This creates a chain of obligations:

Nobody can touch graph until edge1 dies
Nobody can touch edge1 until graph2 dies
Nobody can touch graph2 until edge2 dies

which enforces that the objects are released in the correct order.
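The chain above can be sketched end to end. This is an illustration under assumptions, not the one true design: the `node_count` and `edges` fields, the `edge` constructor, the body of `split` and the `demo_chain` function are all invented for the example.

```rust
// Toy graph: nodes are numbered 0..node_count, edges are (from, to) pairs.
struct Graph {
    node_count: usize,
    edges: Vec<(usize, usize)>,
}

// The edge holds the only usable handle to the graph while it lives.
struct Edge<'a> {
    graph: &'a mut Graph,
    index: usize,
}

impl Graph {
    fn edge(&mut self, index: usize) -> Edge<'_> {
        Edge { graph: self, index }
    }
}

impl<'a> Edge<'a> {
    // Temporary access to the graph, tied to a mutable borrow of this edge.
    fn get_graph<'me>(&'me mut self) -> &'me mut Graph {
        &mut *self.graph
    }

    // Consumes the edge and replaces (a, b) with (a, m) and (m, b).
    fn split(self) -> usize {
        let Edge { graph, index } = self;
        let (from, to) = graph.edges[index];
        let middle = graph.node_count;
        graph.node_count += 1;
        graph.edges[index] = (from, middle);
        graph.edges.push((middle, to));
        graph.edges.len()
    }
}

fn demo_chain() -> usize {
    let mut graph = Graph { node_count: 2, edges: vec![(0, 1)] };
    let mut edge1 = graph.edge(0);
    {
        let graph2 = edge1.get_graph();
        let edge2 = graph2.edge(0);
        // edge1.split(); // rejected here: `edge1` is still borrowed by `edge2`
        let _ = edge2.index;
    }
    // edge2 and graph2 are gone, so edge1 may now be consumed.
    edge1.split()
}
```

Note that the borrow checker reasons about the last use of `edge2`, not its lexical scope, so the commented-out call is only rejected because `edge2` is still used after it.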

At compile time.

\o/

Safety Note: An important event early in Rust's history was the Leakpocalypse (thread::scoped being found to be unsound), which led Gankro (who wrote and shepherded std::collections) to write Pre-Pooping Your Pants With Rust, which I encourage you to read. The short of it is that you should NEVER rely on a destructor being executed for safety, because there's no guarantee it will be (the object could be leaked, and then the thread unwound by a panic). Pre-Pooping Your Pants is the strategy Gankro proposed to work around that: put the element in a valid and safe (if semantically wrong) state, do your stuff, and restore the real semantics on destruction. It is what the Drain iterator uses.
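To make the warning concrete, here is a small sketch showing that safe code can skip a destructor with std::mem::forget. The `Guard` type and the `destructor_ran` helper are hypothetical and exist only for this demonstration.

```rust
use std::cell::Cell;
use std::mem;

// Hypothetical guard that records in a flag whether its destructor ran.
struct Guard<'a> {
    dropped: &'a Cell<bool>,
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns true if the guard's destructor was executed.
fn destructor_ran(leak: bool) -> bool {
    let dropped = Cell::new(false);
    let guard = Guard { dropped: &dropped };
    if leak {
        // `mem::forget` is a safe function: the value is leaked, `drop` never runs.
        mem::forget(guard);
    } else {
        drop(guard);
    }
    dropped.get()
}
```

Any invariant that only holds once `Guard::drop` has run can therefore be broken without a single line of unsafe code.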

Answer 2:

You can add lifetimes to your Edge struct, and borrow the Graph in the get_random_edge method:

struct Graph;

impl Graph {
    fn get_random_edge<'a>(&'a self) -> Edge<'a> {
        Edge(self)
    }
    fn get_random_edge_mut<'a>(&'a mut self) -> MutEdge<'a> {
        MutEdge(self)
    }
}

struct MutEdge<'a>(&'a mut Graph);

impl<'a> MutEdge<'a> {
    fn split(self) {}
    fn next(&'a mut self) -> MutEdge<'a> {
        MutEdge(self.0)
    }
}

struct Edge<'a>(&'a Graph);

impl<'a> Edge<'a> {
    fn split(self) {}
    fn next(&'a self) -> Edge<'a> {
        Edge(self.0)
    }
}

This will give the following errors:

37 |         edge.split();
   |         ---- value moved here
38 |         edge.next(); // this will be a compile-time error as the edge is gone!
   |         ^^^^ value used here after move

And

error[E0499]: cannot borrow `graph` as mutable more than once at a time
  --> <anon>:43:17
   |
42 |     let edge1 = graph.get_random_edge_mut();
   |                 ----- first mutable borrow occurs here
43 |     let edge2 = graph.get_random_edge_mut();
   |                 ^^^^^ second mutable borrow occurs here

If you don't want to store a reference to the Graph in the edge, but just the index, you can simply replace the &'a mut Graph with PhantomData<&'a mut Graph>, which is zero-sized and so takes up no memory, but makes the borrow checker treat the edge as if it held the reference.
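A minimal sketch of the PhantomData variant, assuming the edge is nothing more than an index (the `index` and `_graph` fields, `get_edge_mut` and `demo_phantom` are hypothetical names, not part of the code above):

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Graph {
    edges: Vec<(usize, usize)>,
}

// Stores only the index, but still carries the `'a` mutable borrow of the graph.
struct MutEdge<'a> {
    index: usize,
    _graph: PhantomData<&'a mut Graph>,
}

impl Graph {
    fn get_edge_mut(&mut self, index: usize) -> MutEdge<'_> {
        assert!(index < self.edges.len());
        MutEdge { index, _graph: PhantomData }
    }
}

fn demo_phantom() -> (usize, bool) {
    let mut graph = Graph { edges: vec![(0, 1), (1, 2)] };
    let edge = graph.get_edge_mut(1);
    // let other = graph.get_edge_mut(0); // error[E0499], since `edge` is used below
    let same_size = size_of::<MutEdge<'static>>() == size_of::<usize>();
    (edge.index, same_size)
}
```

One consequence of this design is that the edge can no longer reach the graph through its own field, so any operation that needs the graph's data has to obtain it some other way.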

Source: https://stackoverflow.com/questions/41852283/safe-non-trivial-data-dependencies-custom-references

Tags

rust

type-safety

borrow-checker