Using Windows/DOS shell/batch commands, how do I take a file and only keep unique lines?

刺人心 2020-12-06 02:56

Say I have a file like:

apple
pear
lemon
lemon
pear
orange
lemon

How do I make it so that I only keep the unique lines, so I get:

apple
pear
lemon
orange

8 answers
  • 2020-12-06 03:02

    Use the GNU sort utility:

    sort -u file.txt
    

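    If you're on Windows and have Git installed, GNU sort and many other useful utilities are already available in C:\Program Files\Git\usr\bin\

    Just add this path to your %PATH% environment variable. Windows has its own sort.exe in System32, so make sure the Git directory comes first in %PATH% (or call the GNU sort by its full path).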
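Note that sort -u returns the lines in sorted order. If the first-occurrence order of the original file should be preserved, a common awk idiom can be used instead. The following is a minimal sketch, assuming a Bash prompt (for example Git Bash) where awk is available alongside GNU sort; the function name unique_keep_order and the temporary demo file are hypothetical.

```shell
# Hypothetical helper: print each line the first time it appears,
# preserving the original order (sort -u would reorder the lines).
unique_keep_order() {
  # seen[$0]++ is 0 (false) on the first occurrence of a line, so the
  # negated pattern is true exactly once per distinct line.
  awk '!seen[$0]++' "$1"
}

# Demo with the fruit list from the question, written to a temporary file.
demo_file=$(mktemp)
printf '%s\n' apple pear lemon lemon pear orange lemon > "$demo_file"
unique_keep_order "$demo_file"   # prints apple, pear, lemon, orange (one per line)
rm -f "$demo_file"
```

To keep the result, redirect the output to a new file rather than to the input file itself.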

  • 2020-12-06 03:07
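    @echo off
    rem Keep delayed expansion off while the loop variable is expanded so that exclamation marks in the data survive
    setlocal disabledelayedexpansion
    set "prev="
    rem Sort first so that duplicate lines end up next to each other
    for /f "delims=" %%F in ('sort uniqinput.txt') do (
      set "curr=%%F"
      rem Enable delayed expansion only for the comparison and the echo
      setlocal enabledelayedexpansion
      if "!prev!" neq "!curr!" echo !curr!
      endlocal
      set "prev=%%F"
    )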
    

    What it does: it sorts the input first, then walks through it sequentially and outputs a line only if it differs from the previous one. It could have been even simpler if not for the need to handle special characters (that is what the setlocal/endlocal pair is for).
    It just echoes lines to stdout. If you want to write to a file, run (assuming you named your batch file myUniq.bat) myUniq >>output.txt
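For comparison, the same sort-then-compare-with-previous logic can be sketched as a shell function, for instance in Git Bash. This is only an illustration of the technique described above, not something the answer's author provided; the name uniq_sorted is hypothetical.

```shell
# Hypothetical helper mirroring the batch logic: sort the file, then print a
# line only when it differs from the line before it.
uniq_sorted() {
  sort "$1" | {
    prev=
    first=1
    while IFS= read -r curr; do
      # Always print the first line; afterwards print only on a change.
      if [ "$first" -eq 1 ] || [ "$curr" != "$prev" ]; then
        printf '%s\n' "$curr"
      fi
      first=0
      prev=$curr
    done
  }
}
```

Called on the fruit list from the question, uniq_sorted prints apple, lemon, orange, pear. In practice the loop is what `sort file | uniq` already does; it is spelled out here only to show the comparison step.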

  • 2020-12-06 03:11

    I used PowerShell from the command prompt, in the directory where my text file is located, with the cat command, the sort command, and the Get-Unique cmdlet (cat and sort are PowerShell aliases for Get-Content and Sort-Object), as described at http://blogs.technet.com/b/heyscriptingguy/archive/2012/01/15/use-powershell-to-choose-unique-objects-from-a-sorted-list.aspx.

    It looked like this:

    PS C:\Users\username\Documents\VDI> cat .\cde-smb-incxxxxxxxx.txt | sort | Get-Unique > .\cde-smb-incxxxxxxx-sorted.txt
    
  • 2020-12-06 03:12

    In Windows 10, sort.exe has a hidden flag called /unique that you can use:

    C:\Users>sort fruits.txt
    apple
    lemon
    lemon
    lemon
    orange
    pear
    pear
    
    C:\Users>sort /unique fruits.txt
    apple
    lemon
    orange
    pear
    
  • 2020-12-06 03:14

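    You can use the SORT command to put the lines in order, which groups duplicates together (on its own it does not remove them), e.g.: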

    SORT test.txt > Sorted.txt

  • 2020-12-06 03:17

    The SORT command in Windows 10 does have an undocumented switch to remove duplicate lines.

    SORT /UNIQ File.txt /O Fileout.TXT
    

    For a more bulletproof option using a pure batch file, you could use the following:

    @echo off
    setlocal disableDelayedExpansion
    set "file=MyFileName.txt"
    set "sorted=%file%.sorted"
    set "deduped=%file%.deduped"
    ::Define a variable containing a linefeed character
    set LF=^
    
    
    ::The 2 blank lines above are critical, do not remove
    sort "%file%" >"%sorted%"
    >"%deduped%" (
      set "prev="
      for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%sorted%") do (
        set "ln=%%A"
        setlocal enableDelayedExpansion
        if /i "!ln!" neq "!prev!" (
          endlocal
          (echo %%A)
          set "prev=%%A"
        ) else endlocal
      )
    )
    >nul move /y "%deduped%" "%file%"
    del "%sorted%"
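The same overall flow (sort, drop adjacent duplicates while ignoring case, then replace the original file) can be sketched with GNU tools, for example from Git Bash. This is a rough analogue under those assumptions, not an exact equivalent of the batch script; the function name dedupe_in_place is hypothetical.

```shell
# Hypothetical helper: sort a file ignoring case, drop adjacent lines that
# differ only in case, and replace the original file with the result.
dedupe_in_place() {
  file=$1
  deduped=$(mktemp) || return 1
  # sort -f folds case for the comparison; uniq -i ignores case when
  # comparing neighbouring lines.
  if sort -f "$file" | uniq -i > "$deduped"; then
    # Only overwrite the original once the deduplicated copy is complete.
    mv -f "$deduped" "$file"
  else
    rm -f "$deduped"
    return 1
  fi
}
```

Usage would be along the lines of `dedupe_in_place MyFileName.txt`. Writing to a temporary file first means the original is not touched if the pipeline's last stage fails.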
    