How to count number of unique values of a field in a tab-delimited text file?

前端未结

关注

 7  723

滥情空心 2020-12-23 14:21

I have a text file with a large amount of data which is tab delimited. I want to have a look at the data such that I can see the unique values in a column. For example,

7条回答

谎友^ (楼主)

2020-12-23 14:33
Here is a bash script that fully answers the (revised) original question. That is, given any .tsv file, it provides the synopsis for each of the columns in turn. Apart from bash itself, it only uses standard *ix/Mac tools: sed tr wc cut sort uniq.
```
#!/bin/bash
# Syntax: $0 filename   
# The input is assumed to be a .tsv file

FILE="$1"

cols=$(sed -n 1p $FILE | tr -cd '\t' | wc -c)
cols=$((cols + 2 ))
i=0
for ((i=1; i < $cols; i++))
do
  echo Column $i ::
  cut -f $i < "$FILE" | sort | uniq -c
  echo
done
```
0 讨论(0)

查看其它7个回答
发布评论:

提交评论
- 加载中...