How to remove overlap in numeric ranges (AWK)

无人久伴 submitted on 2019-12-06 07:04:33

OK, since the OP confirmed that the B 501 540 line is a typo, here is my answer :)

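awk -v OFS="\t" '
# A lines: remember start/end by line number; l is the last A line
/^A/{s[NR]=$2;e[NR]=$3;l=NR}
/^B/{
        for(i=1;i<=l;i++){
                if(s[i]==$2){           # B shares the start of an A range
                        s[i]=$3+1       # move the A start past the B end
                        break
                }else if(e[i]==$3){     # B shares the end of an A range
                        e[i]=$2-1       # pull the A end before the B start
                        break
                }
        }
        s[NR] = $2; e[NR]=$3            # B ranges are kept unchanged
}
# rows 1..l are A, the rest are B (assumes all A lines come first)
END{for(i=1;i<=NR;i++)print ((i<=l)?"A":"B"),s[i],e[i]}
        ' file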

Test with your file (with the typo fixed):

kent$  awk -v OFS="\t" '/^A/{s[NR]=$2;e[NR]=$3;l=NR}
/^B/{ 
        for(i=1;i<=l;i++){
                if(s[i]==$2){
                        s[i]=$3+1
                        break
                }else if(e[i]==$3){
                        e[i]=$2-1
                        break
                }
        }
        s[NR] = $2; e[NR]=$3
}
END{for(i=1;i<=NR;i++)print ((i<=l)?"A":"B"),s[i],e[i]}
        ' file
    A       33      100
    A       101     160
    A       200     300
    A       541     1100
    A       1200    1249
    A       1301    1318
    A       1810    1919
    B       0       32
    B       500     540
    B       1250    1300
    B       1319    1340
    B       1920    2000
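
To reproduce the run above without a file on disk, here is a minimal sketch that feeds the sample data to the same logic through a pipe. The input is an assumption: it is taken from the first three columns of the 6-column file shown further down, and the names input and trim_ranges are hypothetical helpers, not part of the original answer. Like the script above, it only trims an A range when a B range shares its start or its end, and it expects all A lines to come before the B lines.

```shell
# Assumed sample input (first three columns of the 6-column example file).
input='A 0 100
A 101 160
A 200 300
A 500 1100
A 1200 1300
A 1301 1340
A 1810 2000
B 0 32
B 500 540
B 1250 1300
B 1319 1340
B 1920 2000'

# Hypothetical wrapper: reads ranges on stdin, prints trimmed ranges.
trim_ranges() {
  awk -v OFS="\t" '
  /^A/ { s[NR] = $2; e[NR] = $3; l = NR }      # store A ranges, l = last A line
  /^B/ {
    for (i = 1; i <= l; i++) {
      if (s[i] == $2)      { s[i] = $3 + 1; break }   # shared start: shrink from the left
      else if (e[i] == $3) { e[i] = $2 - 1; break }   # shared end: shrink from the right
    }
    s[NR] = $2; e[NR] = $3                     # B ranges stay as they are
  }
  END {
    for (i = 1; i <= NR; i++) {
      label = (i <= l) ? "A" : "B"
      print label, s[i], e[i]
    }
  }'
}

printf '%s\n' "$input" | trim_ranges
```

Reading from stdin instead of a named file is only a convenience for testing; the awk program itself is unchanged in behavior.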

EDIT for 6 columns:

Quick and dirty; please check the example below:

file:

kent$  cat file
A   0       100 1 2 3
A   101     160 4 5 6
A   200     300 7 8 9
A   500     1100 10 11 12
A   1200    1300 13 14 15
A   1301    1340 16 17 18
A   1810    2000 19 20 21
B   0       32  22 23 24
B   500     540 22 23 24
B   1250    1300 22 23 24
B   1319    1340 22 23 24
B   1920    2000 22 23 24

awk:

kent$  awk -v OFS="\t" '{s[NR]=$2;e[NR]=$3}
/^A/{l=NR}
/^B/{ 
        for(i=1;i<=l;i++){
                if(s[i]==$2){
                        s[i]=$3+1
                        break
                }else if(e[i]==$3){
                        e[i]=$2-1
                        break
                }
        }
}
{r[NR]=$4OFS$5OFS$6}
END{for(i=1;i<=NR;i++)print ((i<=l)?"A":"B"),s[i],e[i],r[i]} ' file
A       33      100     1       2       3
A       101     160     4       5       6
A       200     300     7       8       9
A       541     1100    10      11      12
A       1200    1249    13      14      15
A       1301    1318    16      17      18
A       1810    1919    19      20      21
B       0       32      22      23      24
B       500     540     22      23      24
B       1250    1300    22      23      24
B       1319    1340    22      23      24
B       1920    2000    22      23      24
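
The script above hardcodes columns 4 to 6. If the number of trailing columns varies, one possible variation (a sketch, not part of the original answer) is to collect everything from field 4 up to NF in a loop. The function name trim_ranges_keep_cols is hypothetical, and the same assumptions apply: A lines come first, and a B range must share a start or an end with an A range to trim it.

```shell
# Hypothetical wrapper: reads ranges on stdin, keeps any number of extra columns.
trim_ranges_keep_cols() {
  awk -v OFS="\t" '
  {
    s[NR] = $2; e[NR] = $3
    rest = ""
    for (f = 4; f <= NF; f++) rest = rest OFS $f   # each extra field, prefixed by OFS
    r[NR] = rest
  }
  /^A/ { l = NR }                                  # l = last A line
  /^B/ {
    for (i = 1; i <= l; i++) {
      if (s[i] == $2)      { s[i] = $3 + 1; break }
      else if (e[i] == $3) { e[i] = $2 - 1; break }
    }
  }
  END {
    for (i = 1; i <= NR; i++) {
      label = (i <= l) ? "A" : "B"
      print label, s[i], e[i] r[i]                 # r[i] already starts with OFS
    }
  }'
}

# Usage with two made-up lines that carry different numbers of extra columns:
printf 'A 0 100 x\nB 0 32 y z\n' | trim_ranges_keep_cols
```

Because r[i] is built with a leading separator, lines with no extra columns print without a trailing tab.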