SAS - grouping pairs

面向向阳花 2020-12-04 03:41

I have two variables, ID1 and ID2. They are both the same kind of identifier. When they appear in the same row of data, it means they are in the same group. I want to make a

3 answers

心在旅途 (original poster)

2020-12-04 04:12

As one commenter mentioned, a hash object does seem to be a viable approach. In the following code, 'id' and 'group' are maintained in the hash table, and a new 'group' is added only when no 'id' match is found for the entire row. Please note that 'do over' is an undocumented feature; it can easily be replaced with a little more coding.

data have;
    input ID1   ID2;
    cards;
1     4
1     5
2     5
2     6
3     7
4     1
5     1
5     2
6     2
7     3
;

data _null_;
    if _n_=1 then
        do;
            declare hash h(ordered: 'a');
            h.definekey('id');
            h.definedata('id','group');
            h.definedone();
            call missing(id,group);
        end;

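    set have end=last;
    array ids id1 id2;

    /* rc is reset to missing on every iteration, so after this loop it holds
       the number of ids in the current row that are already in the hash table;
       a successful find() also loads that id's stored 'group' */
    do over ids;
        rc=sum(rc,h.find(key:ids)=0);

        /*you can choose to 'leave' the loop here when first h.find(key:ids)=0 is met, for the sake of better efficiency*/
    end;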

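    /* no id in this row is known yet: start a new group.
       A separate counter is used because a successful find() overwrites 'group',
       so 'group+1' could hand out a number that is already taken. */
    if not rc > 0 then
        do;
            ngroups+1;
            group=ngroups;
        end;

    /* store (or update) both ids of the row under the current group;
       note that a row linking two existing groups does not merge them */
    do over ids;
        id=ids;
        h.replace();
    end;

    /* after the last row, write the hash table to data set 'want' */
    if last then rc=h.output(dataset:'want');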
run;
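
The answer notes that 'do over' is undocumented and can be replaced with a little more coding. Below is a minimal sketch of the same data step with an explicitly subscripted array and the early 'leave' mentioned in the comment. The names 'i', 'found', 'ngroups' and the output data set 'want_explicit' are assumptions introduced here for illustration, not part of the original answer. Because the loop stops at the first match, a row whose two ids already sit in different groups takes the group of the first id found, whereas the code above ends up with the group of the last one.

```sas
data _null_;
    if _n_ = 1 then do;
        declare hash h(ordered: 'a');
        h.definekey('id');
        h.definedata('id', 'group');
        h.definedone();
        call missing(id, group);
    end;

    set have end=last;
    array ids[2] id1 id2;          /* explicitly subscripted array */

    /* look for any id of this row that is already in the hash table */
    found = 0;
    do i = 1 to dim(ids);
        if h.find(key: ids[i]) = 0 then do;
            found = 1;             /* find() has loaded the stored group */
            leave;                 /* stop at the first match */
        end;
    end;

    /* no match in the whole row: open a new group (ngroups is hypothetical) */
    if not found then do;
        ngroups + 1;
        group = ngroups;
    end;

    /* store both ids of the row under the current group */
    do i = 1 to dim(ids);
        id = ids[i];
        h.replace();
    end;

    if last then rc = h.output(dataset: 'want_explicit');
run;
```

With the sample 'have' data above, this should place ids 1, 2, 4, 5 and 6 in group 1 and ids 3 and 7 in group 2.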
