Where can I find the patience diff implemented?

旧巷老猫 提交于 2019-12-09 04:46:50

问题


It is well-answered on this site that Bram Cohen's patience diff is found in bazaar as the default diff and as an option with git diff, but I am finding it difficult to source an independent standalone program that implements this particular diff algorithm.

For example I'd like to apply patience diff to perforce diffs, and it's pretty clear with the canonical "frobnitz" code example how patience diff is better:

The terminal on the right has invoked the git diff with the --patience flag.

I also have set up the diff-highlight perl script, whose job it is to invert colors on matched-up lines between the first and last different sections of those lines. The left side has an example where this doesn't quite help so much but I'll let it slide because at least there is that semicolon there... Anyway, making improvements to the diff-highlight script is not the subject of this question.

In addition to the question of where to find a standalone patience diff, if anybody knows how to make perforce p4 use an external diff program, that's also something that has to be done.


回答1:


It's perhaps not as ideal as I'd like it, but the solution is perfectly good from a practical perspective (and thats a damn good perspective to have).

git diff --no-index --patience file1 file2 does the job. (thanks @StevenPenny)

$P4DIFF variable defines the external diff... we just stuff git diff --patience --no-index into that.




回答2:


I took the liberty of porting patience to a somewhat standalone library, it's in C#. It's still early-days for the library. It is mostly a line-by-line port; so it hopefully has most of the stability of the Python one.

Remember that patience only finds the longest common subsequences (in diff terms that means parts of the file that have not changed). You will need to determine the additions and removals yourself.

Also remember that within the Bazaar repository there are also implementations in Python and in C (again, the implementations only solve the LCS problem):

  • C version: seems to value performance over clarity, you won't easily be able to understand the algorithm by reading this. There is also a lot of code-overhead for Python interop.
  • Python version: the reference implementation of the algorithm. Seems to mostly value clarity over performance.

If you need to write your own implementation I would recommend porting the Python version first, then looking at the C implementation for tips on how to speed it up.

There should also be an implementation in the Git repository, but I haven't searched for it.




回答3:


Cohen's own Python implementation needs only minor tweaks (below) to run standalone. It's in two files, copies of which I snagged by googling "difflib patience":

http://stuff.mit.edu/afs/athena/system/i386_deb50/os/usr/share/pyshared/bzrlib/patiencediff.py and http://stuff.mit.edu/afs/athena/system/i386_deb50/os/usr/share/pyshared/bzrlib/_patiencediff_py.py

The first file can be run from the command line roughly like diff. The second is the Python implementation of the inner loops. (Single file?? Exercise for reader!) In bzrlib there's also a C implementation of the inner loops.

Here (with the help of the program itself) are my patches to make them run standalone:

Sandy$ patiencediff.py --patience orig/patiencediff.py patiencediff.py
--- orig/patiencediff.py
+++ patiencediff.py
@@ -15,14 +15,20 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

+try:
+    from bzrlib.lazy_import import lazy_import
+    lazy_import(globals(), """
+    import os
+    import sys
+    import time
+    import difflib
+    """)
+except:
+    import os
+    import sys
+    import time
+    import difflib

-from bzrlib.lazy_import import lazy_import
-lazy_import(globals(), """
-import os
-import sys
-import time
-import difflib
-""")


 __all__ = ['PatienceSequenceMatcher', 'unified_diff', 'unified_diff_files']
@@ -135,11 +141,18 @@
         PatienceSequenceMatcher_c as PatienceSequenceMatcher
         )
 except ImportError:
-    from bzrlib._patiencediff_py import (
-        unique_lcs_py as unique_lcs,
-        recurse_matches_py as recurse_matches,
-        PatienceSequenceMatcher_py as PatienceSequenceMatcher
-        )
+    try:
+        from bzrlib._patiencediff_py import (
+            unique_lcs_py as unique_lcs,
+            recurse_matches_py as recurse_matches,
+            PatienceSequenceMatcher_py as PatienceSequenceMatcher
+            )
+    except ImportError:
+        from _patiencediff_py import (
+            unique_lcs_py as unique_lcs,
+            recurse_matches_py as recurse_matches,
+            PatienceSequenceMatcher_py as PatienceSequenceMatcher
+            )


 def main(args):
Sandy$ patiencediff.py --patience orig/_patiencediff_py.py _patiencediff_py.py
--- orig/_patiencediff_py.py
+++ _patiencediff_py.py
@@ -15,11 +15,16 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

-
+from __future__ import print_function
 from bisect import bisect
 import difflib

-from bzrlib.trace import mutter
+try:
+    from bzrlib.trace import mutter
+except:
+    import sys
+    def mutter(msg):
+        print (msg, file=sys.stderr)


 __all__ = ['PatienceSequenceMatcher', 'unified_diff', 'unified_diff_files']
Sandy$



回答4:


For a javascript implementation, with enhancements to PatienceDiff to determine the lines that likely moved (dubbed "PatienceDiffPlus"), see https://github.com/jonTrent/PatienceDiff.




回答5:


The Bazaar implementation of patiencediff is also available as a separate Python module.

  • As a GitHub repository: https://github.com/breezy-team/patiencediff
  • On the cheeseshop: https://pypi.org/project/patiencediff/ or pip install patiencediff


来源:https://stackoverflow.com/questions/16066288/where-can-i-find-the-patience-diff-implemented

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!