Replace all zeros in vector by previous non-zero value

Matlab/Octave algorithm example:

 input vector: [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ]
output vector: [ 1 1 2 2 7 7 7 7 5 5 5 5 9 ]

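The algorithm is simple: go through the vector and replace every zero with the most recent non-zero value before it.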

Explanation

You need to create an index vector:

idx = 1:numel(in);  %// = 1 2 3 4 5 ...

And a logical mask, masking all your non-zero values:

mask = logical(in);

This gives you the grid points idx(mask) and the grid data in(mask) for the interpolation. The query points idx(~mask) are the indices of the zero entries. The values in(~mask) are then "calculated" by previous-neighbor interpolation (interp1 with the 'previous' method, as in the benchmark code), which simply looks up the value at the closest preceding grid point. That is exactly what you want. Unfortunately, the functions involved carry a large overhead because they handle every conceivable case, which is why this approach is still slower than Luis Mendo's answer, even though no arithmetic is involved.
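Put together, the steps above can be sketched as follows. This is a minimal example that reuses the names idx, mask and in from the explanation; the separate output variable out is introduced here only for illustration. It assumes the first element is non-zero, since a leading zero has no previous grid point to take a value from.

```matlab
% Minimal sketch: fill zeros with the previous non-zero value via interp1.
% Assumption: in(1) is non-zero, so every zero has a preceding grid point.
in   = [1 0 2 0 7 7 7 0 5 0 0 0 9];
idx  = 1:numel(in);      % positions of all elements
mask = logical(in);      % true where the value is non-zero

out = in;                % keep the input untouched (illustrative choice)
out(~mask) = interp1(idx(mask), in(mask), idx(~mask), 'previous');

disp(out)
```

The 'previous' method needs a release of interp1 that supports it; on older releases the cumsum-based answer is the safer choice.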

Furthermore, one can reduce the overhead of interp1 a little by using griddedInterpolant directly:

F = griddedInterpolant(idx(mask),in(mask),'previous');
in(~mask) = F(idx(~mask));

But the effect is small.

in =   %// = out

     1     1     2     2     7     7     7     7     5     5     5     5     9

Benchmark (seconds, summed over the 10 timeit runs in the code below; lower is better)

0.699347403200000 %// thewaywewalk
1.329058123200000 %// GameOfThrows
0.408333643200000 %// LuisMendo
1.585014923200000 %// Dan

Code

function [t] = bench()
    in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);

    % functions to compare
    fcns = {
        @() thewaywewalk(in);
        @() GameOfThrows(in);
        @() LuisMendo(in);
        @() Dan(in);
    }; 

    % timeit
    t = zeros(4,1);
    for ii = 1:10;
        t = t + cellfun(@timeit, fcns);
    end
    format long
end

function in = thewaywewalk(in) 
    mask = logical(in);
    idx = 1:numel(in);
    in(~mask) = interp1(idx(mask),in(mask),idx(~mask),'previous');
end
function out = GameOfThrows(a) 
    pada = [a,888];
    b = pada(find(pada >0));
    bb = b(:,1:end-1);
    c = find (pada==0);
    d = find(pada>0);
    length = d(2:end) - (d(1:end-1));
    t = accumarray(cumsum([1,length])',1);
    out = bb(cumsum(t(1:end-1)));
end
function out = LuisMendo(in) 
    t = cumsum(in~=0);
    u = nonzeros(in);
    out = u(t).';
end
function out = Dan(V) 
    d = double(diff([0,V])>0);
    d(find(d(2:end))+1) = find(diff([0,~V])==-1) - find(diff([0,~V])==1);
    out = V(cumsum(~~V+d)-1);
end

野性不改

2020-12-09 16:40
The following simple approach does what you want, and is probably very fast:
```
in = [1 0 2 0 7 7 7 0 5 0 0 0 9];
t = cumsum(in~=0);
u = nonzeros(in);
out = u(t).';
```
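To see why this works, it can help to look at the intermediate values. The sketch below is an annotated walk-through of the same three lines, with the values worked out for the example vector. It assumes the first element is non-zero; otherwise t would start at 0, which is not a valid index.

```matlab
% Annotated walk-through of the cumsum approach (same variable names as above).
in = [1 0 2 0 7 7 7 0 5 0 0 0 9];

% t counts how many non-zero entries have been seen so far, so every zero
% shares the count of the non-zero entry that precedes it.
t = cumsum(in ~= 0);     % [1 1 2 2 3 4 5 5 6 6 6 6 7]

% u holds the non-zero values in order, as a column vector.
u = nonzeros(in);        % [1; 2; 7; 7; 7; 5; 9]

% Looking up u at position t gives, for each element, the last non-zero value.
out = u(t).';            % transpose back to a row vector

disp(out)
```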
生来不讨喜

2020-12-09 16:46
New in MATLAB R2016b: fillmissing does exactly what the question describes, once the zeros are marked as missing (NaN):
```
in = [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ];
in(in==0) = NaN;
out = fillmissing(in,'previous');
```
(This function came to light through a duplicate question.)

离开以前

2020-12-09 16:58

I think it is possible. Let's start with the basics: you want to capture where the numbers are greater than 0:

 a = [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ];  %// Load in vector
 pada = [a,888];  %// Pad a with an arbitrary positive number so that a trailing run of zeros is handled
 b = pada(pada > 0);  %// Values that are greater than 0 (including the pad)
 bb = b(1:end-1);     %// Values that are greater than 0, pad removed
 c = find(pada == 0); %// Indices where values are 0 (not used below)
 d = find(pada > 0);  %// Indices where values are greater than 0
 len = d(2:end) - d(1:end-1);  %// Number of repeats needed for each value (the value itself plus its trailing zeros); named len to avoid shadowing the built-in length
 %//R = cell2mat(arrayfun(@(x,nx) repmat(x,1,nx), bb, len, 'uniformoutput', 0)); %// Repeat the values

 %// ---------- EDIT ----------
 %// accumarray and cumsum method, although not as nice as Dan's one-liner
 t = accumarray(cumsum([1,len])',1);
 R = bb(cumsum(t(1:end-1)));

NOTE: I originally used arrayfun, but you can use accumarray as well. I think this demonstrates that it is possible to do this in a vectorized way.

R =

 1     1     2     2     7     7     7     7     5     5     5     5     9

TESTS:

a = [ 1 0 2 0 7 7 7 0 5 0 0 0 9 0 0 0 ]

R =

 1     1     2     2     7     7     7     7     5     5     5     5     9     9     9     9

PERFORMANCE:

a = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1,10000); %// Vector of 130,000 doubles
Arrayfun Method : Elapsed time is 6.840973 seconds.
AccumArray Method : Elapsed time is 2.097432 seconds.
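The run-length idea in this answer (measure the distance between consecutive non-zero entries, then repeat each non-zero value that many times) can also be sketched with repelem. This is an alternative illustration, not the answer's own code: the names pos and runLen are made up here, it tests for ~= 0 rather than > 0, it assumes the first element is non-zero, and it needs a release that provides repelem.

```matlab
% Sketch of the run-length idea using repelem instead of arrayfun/accumarray.
% Assumptions: a(1) is non-zero; pos and runLen are illustrative names.
a = [1 0 2 0 7 7 7 0 5 0 0 0 9 0 0 0];

pos    = find(a ~= 0);                    % positions of the non-zero values
runLen = diff([pos, numel(a) + 1]);       % each value plus its trailing zeros
R      = repelem(a(pos), runLen);         % repeat every value runLen times

disp(R)
```

Appending numel(a) + 1 to the positions plays the same role as the 888 padding above: it closes the last run, so a trailing block of zeros is filled as well.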

攒了一身酷

2020-12-09 16:58

Vector operations generally assume independence of the individual items. If you have a dependence on an earlier item, then looping is the best way to do it.

Some extra background on MATLAB: vectorized operations are typically faster not because they are vector operations as such, but because they run the loop in native C++ code instead of through the interpreter.
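For reference, the plain loop this answer has in mind could look like the sketch below. It is only an illustration: the variable names are chosen here, and a leading zero is simply left as zero because there is no earlier value to copy.

```matlab
% Minimal loop sketch: carry the previous value forward over zeros.
in  = [1 0 2 0 7 7 7 0 5 0 0 0 9];
out = in;                        % start from a copy of the input

for k = 2:numel(out)
    if out(k) == 0
        out(k) = out(k-1);       % out(k-1) is already filled at this point
    end
end

disp(out)
```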
