How to use (unsafe) aliasing?

Question

Rust has strict aliasing rules. But can I work around them if "I know what I'm doing"?

I'm trying to port to Rust a C function that performs a complicated operation by reading from an input buffer and writing to a destination buffer, but it has a clever optimization that allows the input and output buffer to be the same:

foo(src, dst); // result is written to dst
foo(buf, buf); // legal in C, does the operation in-place

For the sake of the question let's say it's something like:

void inplace(char *src, char *dst, int len) {
   for(int i=0; i < len-1; i++) {
      dst[i] = src[i+1] * 2; // algorithm works even if src == dst
   }
}

In the safe subset of Rust, I'd need two nearly copy-pasted versions of the function: fn(&mut) and fn(&, &mut).

Is there a way to cheat Rust and get both a mutable and an immutable reference to the same buffer?

Answer 1:

Your main function will have to be implemented using unsafe code in order to use raw pointers. Raw pointers allow you to bypass Rust's aliasing rules. You can then have two functions that act as safe façades for this unsafe implementation.

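// Safety: `src` must be valid for reads and `dst` for writes of `len` bytes;
// the two ranges may overlap or be the same buffer.
unsafe fn foo(src: *const u8, dst: *mut u8, len: usize) {
    // saturating_sub keeps `len - 1` from underflowing when len == 0.
    for i in 0..len.saturating_sub(1) {
        // wrapping_mul avoids an overflow panic in debug builds.
        *dst.add(i) = (*src.add(i + 1)).wrapping_mul(2);
    }
}

fn foo_inplace(buf: &mut [u8]) {
    // Derive both pointers from one as_mut_ptr() call; a pointer from
    // as_ptr() could be invalidated by the later mutable reborrow.
    let (len, ptr) = (buf.len(), buf.as_mut_ptr());
    unsafe { foo(ptr, ptr, len) }
}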

fn foo_separate(src: &[u8], dst: &mut [u8]) {
    assert!(src.len() == dst.len());
    unsafe { foo(src.as_ptr(), dst.as_mut_ptr(), src.len()) }
}

fn main() {
    let src = &[0, 1, 2, 3, 4, 5];
    let dst = &mut [0, 0, 0, 0, 0, 0];

    let buf = &mut [11, 22, 33, 44, 55, 66];

    foo_separate(src, dst);
    foo_inplace(buf);

    println!("src: {:?}", src);
    println!("dst: {:?}", dst);
    println!("buf: {:?}", buf);
}

as_ptr(), as_mut_ptr() and len() are methods on slices.

Answer 2:

No, you cannot do so in safe Rust. You can use unsafe code to work around the aliasing limitations if you wish to, but...

Quoting the question: "but it has a clever optimization that allows the input and output buffer to be the same"

What you call an optimization, I call a pessimization.

When the two buffers are guaranteed not to be the same, the optimizer can vectorize your code. This means 4x or 8x fewer comparisons in the loop, which greatly speeds up execution for larger inputs.

In the absence of aliasing information, however, the optimizer must pessimistically assume that the inputs could be aliased, and therefore cannot perform such an optimization. Worse, not knowing how they are aliased, it does not even know whether &dst[i] == &src[i-1], &dst[i] == &src[i], or &dst[i] == &src[i+1], so pre-fetching is out, and so on.

In safe Rust, however, this information is available. It does force you to write two routines (one for a single input, one for two inputs) but both can be optimized accordingly.
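The two safe routines described above might look like the following minimal sketch. It assumes the toy operation from the question (dst[i] = src[i + 1] * 2); the function names are illustrative, and wrapping multiplication is an assumption made here so that large byte values do not panic in debug builds.

```rust
/// In-place variant: buf[i] = buf[i + 1] * 2 for every index except the last.
/// Only one exclusive borrow exists, so there is nothing to alias.
fn foo_inplace(buf: &mut [u8]) {
    // saturating_sub keeps the range empty for an empty slice.
    for i in 0..buf.len().saturating_sub(1) {
        buf[i] = buf[i + 1].wrapping_mul(2);
    }
}

/// Two-buffer variant: the borrow checker guarantees that `src` and `dst`
/// do not overlap, which is the information the optimizer can exploit.
fn foo_separate(src: &[u8], dst: &mut [u8]) {
    assert_eq!(src.len(), dst.len());
    // Pair dst[i] with src[i + 1]. zip stops at the shorter side, so the
    // last element of dst is left untouched, as in the C loop.
    for (d, s) in dst.iter_mut().zip(src.iter().skip(1)) {
        *d = s.wrapping_mul(2);
    }
}
```

Calling foo_separate(&src, &mut dst) with two arrays, or foo_inplace(&mut buf) with one, covers both cases from the question. The loop bodies are duplicated, which is the cost the question objects to.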

Answer 3:

Rust does not allow you to parameterize over mutability, no.

In theory, you could write some unsafe code that aliases pointers, but you'd have to use raw pointers directly.

&mut implies that the pointer is not aliased, and the optimizer will treat it as such. Using one raw pointer and one &mut pointer can still cause problems.
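One safe alternative to raw pointers, not mentioned in the answers here, is to view the buffer as a slice of Cell<u8>. Shared references to cells are allowed to alias, so a single function body can serve both the in-place and the two-buffer case. The sketch below assumes the standard library's Cell::from_mut and as_slice_of_cells; the names foo_cells, foo_inplace and foo_separate are hypothetical.

```rust
use std::cell::Cell;

/// dst[i] = src[i + 1] * 2 (wrapping) for every index except the last.
/// `src` and `dst` may be the very same slice: both are shared references,
/// and all mutation goes through Cell::set.
fn foo_cells(src: &[Cell<u8>], dst: &[Cell<u8>]) {
    assert_eq!(src.len(), dst.len());
    for i in 0..dst.len().saturating_sub(1) {
        dst[i].set(src[i + 1].get().wrapping_mul(2));
    }
}

/// In-place facade: turn the exclusive borrow into a shared slice of cells
/// and pass it as both source and destination.
fn foo_inplace(buf: &mut [u8]) {
    let cells = Cell::from_mut(buf).as_slice_of_cells();
    foo_cells(cells, cells);
}

/// Two-buffer facade. Note that `src` must also be borrowed mutably here,
/// because a cell view can only be created from an exclusive borrow.
fn foo_separate(src: &mut [u8], dst: &mut [u8]) {
    foo_cells(
        Cell::from_mut(src).as_slice_of_cells(),
        Cell::from_mut(dst).as_slice_of_cells(),
    );
}
```

The trade-off is that the source buffer needs a mutable borrow even though it is only read, and the compiler can no longer assume the two slices are distinct, so it likely faces the same optimization limits as an aliasing raw-pointer version.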

Answer 4:

You can use a macro to achieve this in safe code. It'll work for all arguments that have a len function and support indexing. This is basically duck-typing.

// The first argument is the destination, the second is the source.
macro_rules! inplace(
    ($dst:ident, $src:ident) => (for i in 0..$dst.len().saturating_sub(1) {
        $dst[i] = $src[i + 1] * 2;
    })
);

fn main() {
    let mut arr = [1, 2, 3, 4, 5];
    inplace!(arr, arr);
    println!("{:?}", arr);
}

outputs

[4, 6, 8, 10, 5]
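Because the macro is expanded before borrow checking, the same body can also be instantiated inside ordinary functions, once per calling convention. The following is a sketch of that idea; the macro is restated with destination-first argument names, and the wrapper names double_shift_separate and double_shift_inplace are hypothetical.

```rust
// Destination first, source second. Works for anything with len() and
// indexing, including the case where both names refer to the same buffer.
macro_rules! inplace {
    ($dst:ident, $src:ident) => {
        for i in 0..$dst.len().saturating_sub(1) {
            $dst[i] = $src[i + 1] * 2;
        }
    };
}

/// Two-buffer version: expands to `dst[i] = src[i + 1] * 2`.
fn double_shift_separate(src: &[i32], dst: &mut [i32]) {
    assert_eq!(src.len(), dst.len());
    inplace!(dst, src);
}

/// In-place version: expands to `buf[i] = buf[i + 1] * 2`, which only
/// needs the single exclusive borrow.
fn double_shift_inplace(buf: &mut [i32]) {
    inplace!(buf, buf);
}
```

The loop is written once, and each expansion is checked as plain safe code, so no aliasing references are ever created.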

Source: https://stackoverflow.com/questions/30406852/how-to-use-unsafe-aliasing

Tags

rust

strict-aliasing

borrow-checker