How can I get a sequence number within each group?

允我心安 submitted on 2020-01-06 07:06:17

Question


The following is a sample of my data set:

stnd_y person_id recu_day date 

2002   100       20020929 02-09-29
2002   100       20020930 02-09-30
2002   100       20021002 02-10-02
2002   101       20020927 02-09-27
2002   101       20020928 02-09-28
2002   102       20021001 02-10-01
2002   103       20021003 02-10-03
2002   104       20021108 02-11-08
2002   104       20021112 02-11-12

I want to turn it into the following:

stnd_y person_id recu_day date      Admission

2002   100       20020929 02-09-29  1
2002   100       20020930 02-09-30  2
2002   100       20021002 02-10-02  3
2002   101       20020927 02-09-27  1
2002   101       20020928 02-09-28  2
2002   102       20021001 02-10-01  1
2002   103       20021003 02-10-03  1
2002   104       20021108 02-11-08  1
2002   104       20021112 02-11-12  2

In other words, I want to create a variable that numbers each person's admissions in order, based on recu_day and date (both variables hold the date of hospitalization).

I tried the following in SAS:

proc sort data=old out=new;
by person_id recu_day;
data new1;
set new;
retain admission 0;
by person_id recu_day;
if recu_day^=lag(recu_day) and(or) person_id^=lag(person_id) then 
admission+1;
run;

I also tried:

data new1;
set new ;
by person_id recu_day;
retain adm 0;
if first.person_id and(or) first.recu_day then admission=admission+1;
run;

Neither attempt works. How can I solve this?


Answer 1:


You're pretty close with the 2nd attempt, but your main problem is that you don't reset admission each time person_id changes.

It's also not necessary to use first.recu_day, as this is 1 for every record in your sample data. first.person_id is sufficient: you reset the counter to 0 when person_id changes from the previous row, then increment it by 1 on every row.

Including recu_day in the by statement is still useful, however, as this will force an error if the data isn't sorted by person_id and recu_day.

data have;
input stnd_y person_id recu_day date :yymmdd8.; 
format date yymmdd8.;
datalines;
2002   100       20020929 02-09-29
2002   100       20020930 02-09-30
2002   100       20021002 02-10-02
2002   101       20020927 02-09-27
2002   101       20020928 02-09-28
2002   102       20021001 02-10-01
2002   103       20021003 02-10-03
2002   104       20021108 02-11-08
2002   104       20021112 02-11-12
;
run;

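data want;
set have;
by person_id recu_day;
/* restart the counter at the first row of each person */
if first.person_id then admission=0;
/* sum statement: adds 1 on every row and retains the value across rows */
admission+1;
run;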
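
The remark about first.recu_day holds for the sample data, where every person_id/recu_day pair is unique. As a rough sketch only, assuming the real data could contain several rows for the same person and day that should count as one admission (an assumption, not something the question states), the counter could be incremented on first.recu_day instead. The data set names have_sorted and want_distinct are made up for illustration.

```sas
/* BY-group processing needs sorted input; have_sorted is a hypothetical name */
proc sort data=have out=have_sorted;
  by person_id recu_day;
run;

data want_distinct;  /* hypothetical output name */
  set have_sorted;
  by person_id recu_day;
  /* restart the counter for each person */
  if first.person_id then admission = 0;
  /* assumption: repeated rows for the same day belong to one admission,
     so count only the first row of each person_id/recu_day group */
  if first.recu_day then admission + 1;
run;
```

On the sample data this gives the same numbering as the data step above, because no person has two rows with the same recu_day.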


Source: https://stackoverflow.com/questions/46076468/how-can-i-get-the-identification-number-with-each-groups
