How to trim a file: remove the columns with the same value

天涯浪人 2021-01-05 09:15

I would like your help on trimming a file by removing the columns with the same value.

# the file I have (tab-delimited, millions of columns)
jack 1 5 9
joh         


        
8 answers
  •  半阙折子戏
    2021-01-05 09:53

    Not fully tested, but this seems to work for the provided test set. Note that it overwrites the original file.

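    #!/bin/bash
    
    # Change 4 below to match the number of columns.
    for i in {2..4}; do
        # Count the distinct values in column $i (cut splits on tabs by default).
        distinct=$(cut -f "$i" input | sort -u | wc -l)
        if [ "$distinct" -eq 1 ]; then
            # Every row holds the same value: overwrite the column with "_".
            # FS/OFS keep the file tab-delimited so later cut calls still work.
            awk -v field="$i" 'BEGIN { FS = OFS = "\t" } { $field = "_"; print }' input > tmp2
            mv tmp2 input
        fi
    done
    
    $ cat input
    jack    1   5   9
    john    3   5   0
    lisa    4   5   7
    
    $ ./cnt.sh 
    
    $ cat input
    jack    1   _   9
    john    3   _   0
    lisa    4   _   7
    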
    

    Using _ to make the output clearer...
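
If the goal is to actually drop those columns rather than mark them, reading the file twice with awk avoids one cut/sort pass per column, which may matter with millions of columns. This is only a minimal sketch and has not been tried on large inputs. The function name `drop_constant_columns` is hypothetical (it is not from this thread), and the sketch assumes a tab-delimited file in which the first row has the full number of columns.

```shell
#!/bin/bash

# drop_constant_columns FILE
# Hypothetical helper: prints FILE (assumed tab-delimited) without the
# columns whose value is identical in every row. FILE is not modified.
drop_constant_columns() {
    # The file is passed twice; NR == FNR is true only during the first pass.
    awk '
        BEGIN { FS = OFS = "\t" }
        NR == FNR {
            if (FNR == 1) {
                # Remember the first row as the reference values.
                ncols = NF
                for (i = 1; i <= NF; i++) first[i] = $i
            } else {
                # Mark a column as varying once any row differs.
                # Appending "" forces a string comparison, so 5 and 5.0 differ.
                for (i = 1; i <= ncols; i++)
                    if (($i "") != (first[i] "")) varies[i] = 1
            }
            next
        }
        {
            # Second pass: print only the columns marked as varying.
            line = ""; sep = ""
            for (i = 1; i <= ncols; i++)
                if (i in varies) { line = line sep $i; sep = OFS }
            print line
        }
    ' "$1" "$1"
}
```

Running `drop_constant_columns input > trimmed` on the three-row test set keeps the name column and the second and fourth columns and leaves `input` untouched. The sketch stores one reference value per column in memory, so very wide files need enough RAM for a single row.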
