SAS - grouping pairs

面向向阳花 2020-12-04 03:41

I have two variables, ID1 and ID2, that hold the same kind of identifier. When two identifiers appear in the same row of data, they belong to the same group. I want to make a

3 Answers
  •  心在旅途
    2020-12-04 04:12

    As one commenter mentioned, a hash object is a viable approach. In the following code, the hash table maps each 'id' to its 'group', and a new 'group' is created only when no 'id' in the row is already in the table. Please note that 'do over' is an undocumented feature; it can easily be replaced with an explicitly indexed array loop at the cost of a little more code.

    data have;
        input ID1   ID2;
        cards;
    1     4
    1     5
    2     5
    2     6
    3     7
    4     1
    5     1
    5     2
    6     2
    7     3
    ;
    
    data _null_;
        if _n_=1 then
            do;
                declare hash h(ordered: 'a');
                h.definekey('id');
                h.definedata('id','group');
                h.definedone();
                call missing(id,group);
            end;
    
        set have end=last;
        array ids id1 id2;
        do over ids;
            rc=sum(rc,h.find(key:ids)=0);
    
            /*you can choose to 'leave' the loop here when first h.find(key:ids)=0 is met, for the sake of better efficiency*/
        end;
    
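        /* No id on this row is in the table yet: start a new group.
           A separate retained counter is used because a successful
           find() overwrites group with the stored value. */
        if not rc > 0 then
            do;
                ngroups+1;
                group=ngroups;
            end;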
    
        do over ids;
            id=ids;
            h.replace();
        end;
        if last then rc=h.output(dataset:'want');
    run;
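
    The remark about replacing 'do over' can be sketched as follows. This is only one possible version, assuming the 'have' dataset created above. The names 'i', 'found' and 'ngroups' are illustrative choices, not part of any SAS API. It uses a documented, explicitly indexed array and keeps the group counter in its own retained variable.

    ```sas
    data _null_;
        if _n_ = 1 then do;
            /* Hash table keyed by id, storing the group assigned to that id. */
            declare hash h(ordered: 'a');
            h.definekey('id');
            h.definedata('id', 'group');
            h.definedone();
            call missing(id, group);
        end;

        set have end=last;
        array ids{2} id1 id2;   /* explicit array instead of 'do over' */

        /* A successful find() loads the stored group into GROUP. */
        found = 0;
        do i = 1 to dim(ids);
            if h.find(key: ids{i}) = 0 then found = 1;
        end;

        /* Neither id has been seen before: open a new group.
           ngroups is a hypothetical counter name; the sum statement retains it. */
        if not found then do;
            ngroups + 1;
            group = ngroups;
        end;

        /* Record (or update) the group for every id on this row. */
        do i = 1 to dim(ids);
            id = ids{i};
            rc = h.replace();
        end;

        if last then rc = h.output(dataset: 'want');
    run;
    ```

    Like the 'do over' version, this sketch processes rows in a single pass, so it does not merge two groups that already exist when a later row links them.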
    
