Question
Example
40,000+ lines with GUIDs like this:
GUID: 0981723409871243
Search across all GUIDs for duplicates.
Example:
GUID: 124432408213
GUID: 08917234071423
GUID: 0189742381
GUID: 08917234071423
GUID: 0817423423
GUID: 124432408213
I have TextFX and Compare, but how would I find that there are two copies of 124432408213 and two of 08917234071423?
Out of 40,000 lines with possible duplicates, I can't easily detect them. I need a way to find duplicates.
It would have to be something like GUID: "search text after GUID", next line, then continue the search for each GUID... I could write a custom program to do this, but I'm trying to avoid that. TextFX is pretty powerful; I just don't see a way to do something like this.
I should add a little more info here. Example:
[block1] guid: ???? more info: ??? [/block1]
This is how each block is formatted.
Answer 1:
Use TextFX to sort the input lines, keeping duplicates. Next, do a regular expression search with Bookmark Line set in the Mark tab. The search text should be ^(GUID:\s*\d+\r\n)\1 and then click Mark All. Next use Menu => Search => Bookmark => Remove unmarked lines to remove everything except the duplicates, or use Menu => Search => Bookmark => Copy Bookmarked Lines and paste the lines where wanted. If there are four or more identical lines then the above may finish with one entry for each pair; another TextFX sort, this time removing duplicates, should remove the surplus.
For the [block1] guid: ???? more info: ??? [/block1] case the regular expression is more complicated but ^(\[block1\] guid:\s*\d+ more info:\s*\d+ \[/block1\]\r\n)\1 finds and marks the duplicates in:
[block1] guid: 1234 more info: 5678 [/block1]
[block1] guid: 1235 more info: 5678 [/block1]
[block1] guid: 1235 more info: 5678 [/block1]
[block1] guid: 1236 more info: 5678 [/block1]
[block1] guid: 1236 more info: 5678 [/block1]
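The backreference idea can also be tried outside Notepad++. The following is a minimal Python sketch, not part of the original answer: it applies the same pattern to the sample lines above, assuming the lines are already sorted and use "\n" line endings (the pattern accepts an optional "\r" so Windows line endings also match). The variable names are made up for illustration.

```python
import re

# Sample lines from the answer, already sorted so identical lines are adjacent.
text = (
    "[block1] guid: 1234 more info: 5678 [/block1]\n"
    "[block1] guid: 1235 more info: 5678 [/block1]\n"
    "[block1] guid: 1235 more info: 5678 [/block1]\n"
    "[block1] guid: 1236 more info: 5678 [/block1]\n"
    "[block1] guid: 1236 more info: 5678 [/block1]\n"
)

# Capture one whole line (including its line ending), then require an
# identical copy immediately after it via the backreference \1.
pattern = re.compile(
    r"^(\[block1\] guid:\s*\d+ more info:\s*\d+ \[/block1\]\r?\n)\1",
    re.MULTILINE,
)

# Each match covers one adjacent pair; keep a single copy of the line.
duplicated_lines = [m.group(1).rstrip("\r\n") for m in pattern.finditer(text)]

for line in duplicated_lines:
    print(line)
```

As in Notepad++, a line that occurs four times would be reported once per adjacent pair, so the result may itself need de-duplicating.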
On Linux or similar, a command such as sort inputFileName | uniq -c | grep -v "^\s*1\s" or sort inputFileName | uniq -d should work, depending on exactly which commands and options are available. (Note that sort -c only checks whether the input is already sorted, so it cannot be used on its own to count duplicates.)
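If neither Notepad++ nor the shell tools are at hand, the "custom program" the question mentions can be quite short. The sketch below is one possible approach, not something from the original answers: it assumes each line or block contains the text "guid:" followed by digits (as in the examples), and the function name find_duplicate_guids is hypothetical. No sorting is needed because it counts every GUID.

```python
import re
from collections import Counter


def find_duplicate_guids(lines):
    """Return {guid: count} for every GUID that appears more than once."""
    # Assumption: a GUID is the run of digits after "guid:" (any letter case).
    guid_pattern = re.compile(r"guid:\s*(\d+)", re.IGNORECASE)
    counts = Counter()
    for line in lines:
        match = guid_pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return {guid: n for guid, n in counts.items() if n > 1}


# Sample data taken from the question.
sample = [
    "GUID: 124432408213",
    "GUID: 08917234071423",
    "GUID: 0189742381",
    "GUID: 08917234071423",
    "GUID: 0817423423",
    "GUID: 124432408213",
]

duplicates = find_duplicate_guids(sample)
for guid, n in duplicates.items():
    print(guid, n)
```

For a real file, the same function can be given an open file object instead of the list, since it only iterates over lines.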
Answer 2:
Although my answer probably comes too late to help you... Copy your lines into two new tabs, then use TextFX to sort tab 1 keeping duplicates and to sort tab 2 keeping only unique lines. Then move tab 2 to the other view, and finally use Compare.
Source: https://stackoverflow.com/questions/16940950/notepad-check-for-duplicate-lines-complex