Can a vector be moved and modified without an extra allocation?

孤街醉人 提交于 2021-02-08 15:10:53

问题


Consider the following code:

let u: Vec<u8> = (64..74).collect();
let v: Vec<u8> = u.iter().map(|i| i + 1).collect();

u was not moved, therefore v was inevitably newly allocated.

But if I do the following:

let w: Vec<u8> = u.into_iter().map(|i| i + 1).collect();

u was moved and w is the name of its transformation. Here is some pseudo-code representing what I mean:

mark u as "moved"
for i = 0..10:
    u[i] += 1
w = u

There is (in my opinion) no need for a new allocation, since we map a type to itself. This wouldn't be the case for this code:

let t: Vec<u8> = (64..74).collect();
let s: String = t.into_iter().map(|i| i as char).collect();

To summarize my question

Is there an allocation of a new Vec when we convert a Vec into an iterator and then map this iterator to an iterator on elements of the same type and then collect the result into a Vec?

And if there is indeed an allocation, why?

I tried to --emit=mir, but I wasn't able to find the answer. I'm using rustc 1.20 nightly (if that matters).

If you want to play with code: Try it online!


回答1:


Let's see the source of the implementation of into_iter() for Vec<T>:

fn into_iter(mut self) -> IntoIter<T> {
    unsafe {
        let begin = self.as_mut_ptr();
        assume(!begin.is_null());
        let end = if mem::size_of::<T>() == 0 {
            arith_offset(begin as *const i8, self.len() as isize) as *const T
        } else {
            begin.offset(self.len() as isize) as *const T
        };
        let cap = self.buf.cap();
        mem::forget(self);
        IntoIter {
            buf: Shared::new(begin),
            cap: cap,
            ptr: begin,
            end: end,
        }
    }
}

Creating the IntoIter iterator incurs several extra allocations, but not for the elements of the vector; instead, the vector's underlying memory details are registered. How about the code behind map()?

fn map<B, F>(self, f: F) -> Map<Self, F> where
    Self: Sized, F: FnMut(Self::Item) -> B,
{
    Map{iter: self, f: f}
}

No extra vectors allocated here either. The last piece of the puzzle is collect():

fn collect<B: FromIterator<Self::Item>>(self) -> B where Self: Sized {
    FromIterator::from_iter(self)
}

No answers here; what about the implementation of from_iter() for Vec<T>?

impl<T> FromIterator<T> for Vec<T> {
    #[inline]
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Vec<T> {
        <Self as SpecExtend<T, I::IntoIter>>::from_iter(iter.into_iter())
    }
}

This is beginning to look like magic, but perhaps the related SpecExtend code will reveal what we're looking for:

impl<T, I> SpecExtend<T, I> for Vec<T>
    where I: Iterator<Item=T>,
{
    default fn from_iter(mut iterator: I) -> Self {
        // Unroll the first iteration, as the vector is going to be
        // expanded on this iteration in every case when the iterable is not
        // empty, but the loop in extend_desugared() is not going to see the
        // vector being full in the few subsequent loop iterations.
        // So we get better branch prediction.
        let mut vector = match iterator.next() {
            None => return Vec::new(),
            Some(element) => {
                let (lower, _) = iterator.size_hint();
                let mut vector = Vec::with_capacity(lower.saturating_add(1));
                unsafe {
                    ptr::write(vector.get_unchecked_mut(0), element);
                    vector.set_len(1);
                }
                vector
            }
        };
        <Vec<T> as SpecExtend<T, I>>::spec_extend(&mut vector, iterator);
        vector
    }

    default fn spec_extend(&mut self, iter: I) {
        self.extend_desugared(iter)
    }
}

In this code we can finally see the Vec::new and Vec::with_capacity methods called which allocate fresh space for the resulting vector.

TL;DR: no, it is not possible to move and modify a vector without an extra allocation.




回答2:


Is there an allocation of a new Vec when we convert a Vec into an iterator and then map this iterator to an iterator on elements of the same type and then collect the result into a Vec?

Yes, the necessary optimization to avoid that allocation is simply too high-level to perform for any of the compiler tiers since they would have to reason about heap allocations and the pointer magic going on in the unsafe code blocks that underpin the Vec implementation. Doubly so in the general case where drop-handlers, panics inside the map and similar things come into play.

It might be possible to optimize at the library level by specializing
collect() -> Vec<T> for specific types such as std::iter::Map<std::vec::IntoIter<T>, _>, but those optimizations would have to be applied on a case-by-case basis where it is known to be safe.

if I understand correctly, your code is approximately the Rust translation of my pseudo-code, without the original "functional style".

For the functional style there is a plan to add for_each to iterators, so if you take an iter_mut something equivalent could be achieved with far fewer heroics on the library or compiler side.

And it would actually reflect the intent of doing something in place more directly instead of just doing some optimization magically which then could surprisingly break under slight modification.



来源:https://stackoverflow.com/questions/45753923/can-a-vector-be-moved-and-modified-without-an-extra-allocation

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!