Is it possible to remove duplicate rows from a text file? If yes, how?
Did come across this issue and had to resolve it myself because the use was particulate to my need. I needed to find duplicate URL's and order of lines was relevant so it needed to be preserved. The lines of text should not contain any double quotes, should not be very long and sorting cannot be used.
Thus I did this:
setlocal enabledelayedexpansion
type nul>unique.txt
for /F "tokens=*" %%i in (list.txt) do (
find "%%i" unique.txt 1>nul
if !errorlevel! NEQ 0 (
echo %%i>>unique.txt
)
)
Auxiliary: if the text does contain double quotes then the FIND needs to use a filtered set variable as described in this post: Escape double quotes in parameter
So instead of:
find "%%i" unique.txt 1>nul
it would be more like:
set test=%%i
set test=!test:"=""!
find "!test!" unique.txt 1>nul
Thus find will look like find """what""" file and %%i will be unchanged.