Question
My dataset has missing values in the first row of every unique ID. I want to replace this first row with the values in the second row.
My intuition tells me the solution involves using _N_ and a BY statement.
My data set:
ID Var1
1 .
1 12
1 23
1 2
2 .
2 266
2 23
2 2
3 .
3 6
Result I am after:
ID Var1
1 12
1 12
1 23
1 2
2 266
2 266
2 23
2 2
3 6
3 6
Answer 1:
Use two SET statements. The second SET provides 'lead' processing (as opposed to 'lag'): it reads the same data set as the first one, but offset by one row (firstobs=2).
data have;
input ID var1;
datalines;
1 .
1 12
1 23
1 2
2 .
2 266
2 23
2 2
3 .
3 6
run;
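data want;
  set have;
  by id;
  /* Lead row: the same data set, offset by one row. This SET is conditional   */
  /* because HAVE(FIRSTOBS=2) has one fewer row; read unconditionally, it would */
  /* end the step before the last row of HAVE is written.                       */
  if not lead_eof then
    set have(firstobs=2 keep=id var1 rename=(id=lead1_id var1=lead1_var1)) end=lead_eof;
  /* Copy the next row's value only when it belongs to the same ID */
  if first.id and id=lead1_id then var1=lead1_var1;
  drop lead1_id lead1_var1;
run;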
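To see how the one-row offset pairs each observation with its successor, here is a minimal sketch that prints the two side by side. It assumes the HAVE data set created above; the names LEAD_DEMO and LEAD_EOF are made up for illustration and are not part of the answer.

```sas
/* Sketch only: list every row of HAVE beside the row that follows it. */
/* LEAD_DEMO and LEAD_EOF are illustrative names, not from the answer. */
data lead_demo;
  set have;
  /* HAVE(FIRSTOBS=2) has one fewer row, so stop reading it at its end */
  if not lead_eof then
    set have(firstobs=2 keep=id var1
             rename=(id=lead1_id var1=lead1_var1)) end=lead_eof;
  else call missing(lead1_id, lead1_var1);  /* last row has no successor */
run;

proc print data=lead_demo noobs;
run;
```

In the listing, each first row of an ID should sit next to the value that replaces it, which is why the comparison id=lead1_id is enough to keep the fill from crossing into the next ID.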
Answer 2:
Try this out. Step 1: assign a row number within each ID. Step 2: reverse the row order within each ID, then use the LAG function to replace the missing first value with the second value of that ID.
data have;
input ID Var1;
cards;
1 .
1 12
1 23
1 2
2 .
2 266
2 23
2 2
3 .
3 6
;
run;
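/* Step 1: number the rows within each ID */
data have1;
  set have;
  by id;
  if first.id then sno = 0;
  sno + 1;
run;
/* Step 2: reverse the order within each ID so the second row precedes the first */
proc sort data=have1 out=have2;
  by id descending sno;
run;
/* LAG now returns the value of the row that follows in the original order */
data have3;
  set have2;
  var2 = lag(var1);
  if var1 = . then var1 = var2;
run;
/* Restore the original order and drop the helper variables */
proc sort data=have3 out=want(drop=sno var2);
  by id sno;
run;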
Let me know in case of any queries.
Answer 3:
Something like this:
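data want(rename=(next_var1=var1));
  set have;
  by id notsorted;
  /* Look ahead only when the ID has a second row; this also keeps POINT= */
  /* from running past the end of the data set on the last observation.   */
  if first.id and not last.id then do;
    point = _N_ + 1;
    /* Direct access: read var1 from the next row */
    set have(keep=var1 rename=(var1=next_var1)) point=point;
  end;
  else next_var1 = var1;
  drop var1;
run;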
Source: https://stackoverflow.com/questions/49849529/replacing-first-row-with-values-in-second-row