File comparison using command prompt

Submitted by 只愿长相守 on 2020-01-17 00:34:08

Question


I want to compare two text files, abc.txt and xyz.txt, from the command prompt and write the records that are unique to one of them to another text file. In the output I get, however, some strings run on to a second line, which breaks my URLs into two separate lines. Is there any way to compare the files and get the output in the same format as the existing text files?

fc abc.txt xyz.txt > unique.txt

abc.txt contains the following data:

newsroom.associatedbank.com/News-Releases/Associated-Bank-opens-new-Minocqua-branch-5e1.aspx
newsroom.associatedbank.com/content/default.aspx?NewsAreaId=2&SubjectId=75
newsroom.associatedbank.com/content/default.aspx?NewsAreaId=2&SubjectId=76
newsroom.associatedbank.com/content/default.aspx?NewsAreaId=2&SubjectId=202
newsroom.associatedbank.com/News-Releases/Associated-Bank-finances-12M-for-retail-and-residential-projects-5dc.aspx
newsroom.associatedbank.com/News-Releases/Associated-Banc-Corp-completes-purchase-of-risk-and-benefits-consulting-firm-Ahmann-Martin-Co-5db.aspx
newsroom.associatedbank.com/News-Releases/Associated-opens-new-Rochester-branch-5da.aspx

xyz.txt contains the following data:

newsroom.associatedbank.com/News-Releases/Associated-Bank-opens-new-Minocqua-branch-5e1.aspx
newsroom.associatedbank.com/content/default.aspx?NewsAreaId=2&SubjectId=75
newsroom.associatedbank.com/content/default.aspx?NewsAreaId=2&SubjectId=76
newsroom.associatedbank.com/content/default.aspx?NewsAreaId=2&SubjectId=202
newsroom.associatedbank.com/News-Releases/Associated-opens-new-Rochester-branch-5da.aspx

Answer 1:


You do not have to download Windows PowerShell 2.0 if you have Windows 7 because it is already installed.

From cmd.exe command line:

powershell Compare-Object -ReferenceObject (Get-Content abc.txt) -DifferenceObject (Get-Content xyz.txt) -IncludeEqual ^| Out-File -FilePath unique.txt -Width 4096

Notes:

  • remove -IncludeEqual to list only the lines that differ (it was added here only for illustration);
  • note that the | pipe is escaped as ^| so that cmd passes it to PowerShell instead of interpreting it itself;
  • change -Width 4096 to the desired output line length (an integer). Any additional characters are truncated, not wrapped. If you omit this parameter, the width is determined by the characteristics of the host. The default for the Windows PowerShell console is 80 characters;
  • a large script repository is available (the link in the original answer was filtered to file manipulation with PowerShell);
  • a Compare-Object cmdlet reference is available as well.

To see the SideIndicator output format, omit ^| Out-File ... as follows. The output on your screen will probably be truncated.

powershell Compare-Object -ReferenceObject (Get-Content abc.txt) -DifferenceObject (Get-Content xyz.txt) -IncludeEqual

Using alias names for cmdlets and omitting optional parts of PowerShell statements, the following command should give the same result:

powershell diff  (type abc.txt)  (gc xyz.txt) -includeequal 



Answer 2:


"But the output which I get for some string are going on to second line which breaks my urls into two separate lines"

fc has a bug when a line contains more than 127 characters.

It has been hotfixed for Windows XP and Windows Vista but not for Windows 7.

fc does not work correctly in Windows 7 (with either the 32-bit or the 64-bit fc.exe) when the files being compared contain any ASCII or Unicode records longer than 127 characters.
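To check whether a given file is likely to run into this limit, the lines longer than 127 characters can be listed first. The following is a minimal sketch in Python, which is not one of the tools used in this thread; the function names lines_over_limit and scan_file are hypothetical, and the 127-character threshold is simply taken from the description above.

```python
def lines_over_limit(lines, limit=127):
    """Return (line_number, length) for every line longer than `limit` characters."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Measure the text only, without the trailing line terminator.
        text = line.rstrip("\r\n")
        if len(text) > limit:
            hits.append((number, len(text)))
    return hits


def scan_file(path, limit=127):
    """Apply lines_over_limit to a text file; the encoding is an assumption."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return lines_over_limit(handle, limit)
```

Calling scan_file("abc.txt") returns a list of line numbers and lengths; an empty list suggests that the file contains no lines beyond the limit described here.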


Source: Where are known errors in fc.exe for Windows 7 published?

I have created two test files xxx.txt and yyy.txt which differ at line nnn, but fc /n reports that they differ at line nnn+1. It appears that fc has split one of the earlier lines into two lines. Examining the files with a hex editor shows no trace of end-of-line characters 0D or 0A at the place where fc is splitting the line. For larger files the reported mismatch locations from fc and the actual lines where the mismatches are occurring get badly out of sync. Is this an already known error in fc, and where is there a published list of such known problems with this program?

...

There are hotfixes for Windows XP and for Windows Vista. I do not see one for Windows 7.

Article ID: 953930 - The Fc.exe command does not work correctly on a Windows XP-based computer when the two files that you are comparing have the TAB or SPACE character around the 128th byte in a string of characters http://support.microsoft.com/kb/953930

Article ID: 953932 - The Fc.exe command does not work correctly in Windows Vista or in Windows Server 2008 when the two files that you are comparing have the TAB or SPACE character around the 128th byte in a character string http://support.microsoft.com/kb/953932




Answer 3:


I'd suggest you try

findstr /i /L /x /v  /g:xyz.txt abc.txt > unique.txt

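which should report any line in abc.txt that isn't present in xyz.txt. The switches are: /i ignores case; /L treats the search strings literally, not as regular expressions; /x matches whole lines only, not parts of a line; /v prints only the lines that do not match; /g:xyz.txt reads the search strings from xyz.txt.

Consequently, any lines in abc.txt that don't appear in xyz.txt will be directed to unique.txt (thanks, JosefZ).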
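As a way to cross-check the result, the filtering that this findstr command is meant to perform can be modelled in a few lines. This is only a rough sketch in Python, not part of the original answer: the function name lines_only_in_first is hypothetical, and str.lower() is assumed to be a close enough stand-in for the case-insensitive matching of /i.

```python
def lines_only_in_first(first_lines, second_lines):
    """Return the lines of first_lines that have no whole-line,
    case-insensitive match anywhere in second_lines (original order kept)."""
    # Normalize the second file once: strip line terminators and ignore case.
    known = {line.rstrip("\r\n").lower() for line in second_lines}
    result = []
    for line in first_lines:
        text = line.rstrip("\r\n")
        if text.lower() not in known:
            result.append(text)
    return result
```

Applied to the sample data from the question, with abc.txt as the first input and xyz.txt as the second, this should leave the two News-Releases URLs that appear only in abc.txt.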



Source: https://stackoverflow.com/questions/28735795/file-comparison-using-command-prompt
