Say I have a file like:
apple
pear
lemon
lemon
pear
orange
lemon
How do I keep only the unique lines, so that each fruit appears just once?
Use the GNU sort utility:
sort -u file.txt
If you're on Windows and using Git, then sort and many more useful utilities are already here: C:\Program Files\Git\usr\bin\
Just add this path to your %PATH% environment variable.
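For comparison, here is a minimal Python sketch of what sort -u does to the sample file: collect the distinct lines, then sort them. The function name sort_unique is made up for illustration, and the sketch assumes plain ASCII input; GNU sort orders by locale collation, which can differ from Python's code-point order for other text.

```python
def sort_unique(lines):
    """Return the distinct lines in sorted order (code-point order, not locale order)."""
    return sorted(set(lines))


# The sample file from the question, one list item per line.
fruits = ["apple", "pear", "lemon", "lemon", "pear", "orange", "lemon"]
print(sort_unique(fruits))  # ['apple', 'lemon', 'orange', 'pear']
```

Note that, like sort -u, this does not keep the original line order.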
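@echo off
setlocal disabledelayedexpansion
set "prev="
rem Read the sorted input one line at a time.
for /f "delims=" %%F in ('sort uniqinput.txt') do (
    set "curr=%%F"
    rem Enable delayed expansion only for the comparison so that exclamation marks in the data survive.
    setlocal enabledelayedexpansion
    if "!prev!" neq "!curr!" echo !curr!
    endlocal
    set "prev=%%F"
)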
What it does: it sorts the input first, then goes through it sequentially and outputs a line only if it differs from the previous one. It could have been even simpler if not for the need to handle special characters (that's what the setlocal/endlocal pairs are for).
It just echoes lines to stdout; if you want to write to a file, run (assuming you named your batch file myUniq.bat): myUniq >>output.txt
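The same sort-then-compare-with-previous idea can be sketched in Python, which may make the loop easier to follow. This is only an illustration of the logic described above, not a replacement for the batch file; the name unique_after_sort is hypothetical.

```python
def unique_after_sort(lines):
    """Sort the lines, then keep a line only when it differs from the previous one."""
    result = []
    prev = None  # nothing seen yet, so the first line is always kept
    for curr in sorted(lines):
        if curr != prev:
            result.append(curr)
        prev = curr
    return result


fruits = ["apple", "pear", "lemon", "lemon", "pear", "orange", "lemon"]
for line in unique_after_sort(fruits):
    print(line)
```

Because the input is sorted first, all copies of a line sit next to each other, so remembering a single previous line is enough.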
I also used PowerShell from the command prompt, in the directory where my text file is located, piping cat through sort and the Get-Unique cmdlet, as described at http://blogs.technet.com/b/heyscriptingguy/archive/2012/01/15/use-powershell-to-choose-unique-objects-from-a-sorted-list.aspx.
It looked like this:
PS C:\Users\username\Documents\VDI> cat .\cde-smb-incxxxxxxxx.txt | sort | Get-Unique > .\cde-smb-incxxxxxxx-sorted.txt
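The sort step in that pipeline matters because Get-Unique only compares each item with the one next to it. A rough Python sketch of that behaviour, using itertools.groupby as a stand-in for Get-Unique (an analogy for illustration, with a made-up function name):

```python
from itertools import groupby


def collapse_adjacent(lines):
    """Keep one line from each run of identical neighbouring lines."""
    return [key for key, _run in groupby(lines)]


fruits = ["apple", "pear", "lemon", "lemon", "pear", "orange", "lemon"]

# Without sorting, only the two neighbouring "lemon" lines are merged.
print(collapse_adjacent(fruits))          # ['apple', 'pear', 'lemon', 'pear', 'orange', 'lemon']

# After sorting, every duplicate is adjacent and gets collapsed.
print(collapse_adjacent(sorted(fruits)))  # ['apple', 'lemon', 'orange', 'pear']
```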
In Windows 10, sort.exe has a hidden flag called /unique that you can use:
C:\Users>sort fruits.txt
apple
lemon
lemon
lemon
orange
pear
pear
C:\Users>sort /unique fruits.txt
apple
lemon
orange
pear
You can use the SORT command to sort the file (on its own it does not remove duplicates), e.g.:
SORT test.txt > Sorted.txt
The SORT command in Windows 10 does have an undocumented switch to remove duplicate lines:
SORT /UNIQ File.txt /O Fileout.TXT
For a more bulletproof option in pure batch, you could use the following.
@echo off
setlocal disableDelayedExpansion
set "file=MyFileName.txt"
set "sorted=%file%.sorted"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
sort "%file%" >"%sorted%"
>"%deduped%" (
  set "prev="
  for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%sorted%") do (
    set "ln=%%A"
    setlocal enableDelayedExpansion
    if /i "!ln!" neq "!prev!" (
      endlocal
      (echo %%A)
      set "prev=%%A"
    ) else endlocal
  )
)
>nul move /y "%deduped%" "%file%"
del "%sorted%"
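For readers who find the batch syntax hard to follow, here is a hedged Python sketch of the same overall flow: sort, drop duplicates while ignoring case (as if /i does), write the result to a temporary file, then move it over the original. The function name dedupe_file_in_place and the UTF-8 encoding are assumptions for illustration. Unlike the for /f loop, this sketch does not skip blank lines.

```python
import os
import tempfile


def dedupe_file_in_place(path):
    """Sort the lines of `path`, drop case-insensitive duplicates, and replace the file."""
    with open(path, encoding="utf-8") as src:
        lines = src.read().splitlines()

    kept = []
    prev = None
    # Sort case-insensitively so lines differing only in case end up adjacent.
    for line in sorted(lines, key=str.casefold):
        if prev is None or line.casefold() != prev.casefold():
            kept.append(line)
        prev = line

    # Write to a temporary file in the same directory, then swap it in,
    # similar to the script's "%deduped%" file followed by move /y.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as dst:
        dst.writelines(line + "\n" for line in kept)
    os.replace(tmp_path, path)
    return kept
```

It could be called as dedupe_file_in_place("MyFileName.txt"); the file is overwritten, so try it on a copy first.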