How can I find the unique lines and remove all duplicates from a file? My input file is:
1
1
2
3
5
5
7
7
I would like the result to be:
2
3
You can also print out the unique values in "file" with the cat command by piping it through sort and uniq -u:
cat file | sort | uniq -u
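Applied to the sample input above, that pipeline prints only the two lines that never repeat:
2
3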
Use as follows:
sort < filea | uniq > fileb
Note that plain uniq only collapses adjacent duplicate lines, so fileb still keeps one copy of every line; to drop repeated lines entirely, use uniq -u instead.
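A quick comparison on the sample input (the file names filea and fileb are the ones from this answer):
sort < filea | uniq > fileb      # fileb contains: 1 2 3 5 7
sort < filea | uniq -u > fileb   # fileb contains: 2 3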
You can use:
sort data.txt | uniq -u
This sorts the data and then filters out every line that occurs more than once.
I find this easier:
sort -u input_filename > output_filename
-u stands for unique. Be aware that sort -u keeps one copy of each line (it deduplicates), so for the sample input it writes 1 2 3 5 7; it does not drop repeated lines the way uniq -u does.
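As a sanity check, sort -u is just a shorter spelling of sort piped into plain uniq (input_filename assumed to be the sample file):
sort -u input_filename          # prints: 1 2 3 5 7
sort input_filename | uniq      # identical output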
uniq should do fine if your file is, or can be, sorted; if you can't sort the file for some reason, you can use awk:
awk '{a[$0]++} END {for (i in a) if (a[i] < 2) print i}'
This counts how often each line occurs and prints only the lines seen exactly once, though the output order is arbitrary, because for (i in a) does not iterate in input order.
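If you need the surviving lines in their original order, a two-pass awk sketch (reading the same file twice; the file name data.txt is assumed) does it:
awk 'NR==FNR {count[$0]++; next} count[$0] == 1' data.txt data.txt
The first pass (where NR==FNR) tallies every line; the second pass prints a line only when its total count is 1.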
While sort takes O(n log(n)) time, I prefer using
awk '!seen[$0]++'
awk '!seen[$0]++' is an abbreviation for awk '!seen[$0]++ {print}': it prints the line ($0) when seen[$0] is still zero, i.e. the first time that line appears, and increments the counter afterwards.
It takes more space but only O(n) time. Note that this keeps the first occurrence of every line rather than dropping repeated lines, so for the sample input it prints 1 2 3 5 7.
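If you want the O(n), no-sort behaviour but with repeated lines dropped entirely (matching uniq -u), a one-pass sketch that remembers first-appearance order works (the array names count and order are illustrative):
awk '{if (!count[$0]++) order[++n] = $0} END {for (i = 1; i <= n; i++) if (count[order[i]] == 1) print order[i]}'
This buffers every distinct line in memory until the end of input, trading space for a single pass; on the sample input it prints 2 and 3.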