Text comparison algorithm


悲哀的现实 2020-11-27 03:29

We have a requirement in the project that we have to compare two texts (update1, update2) and come up with an algorithm to define how many words and how many sentences have

6 answers

小蘑菇 (original poster)

2020-11-27 04:14
Here are two papers that describe other text comparison algorithms that should generally output 'better' (e.g. smaller, more meaningful) differences:
- Walter F. Tichy, "The String-to-String Correction Problem with Block Moves", Computer Science Technical Reports, Paper 378, 1983
- Paul Heckel, "A Technique for Isolating Differences Between Files", Communications of the ACM, Volume 21, Number 4, April 1978
The first paper cites the second and mentions this about its algorithm:

Heckel[3] pointed out similar problems with LCS techniques and proposed a linear-time algorithm to detect block moves. The algorithm performs adequately if there are few duplicate symbols in the strings. However, the algorithm gives poor results otherwise. For example, given the two strings aabb and bbaa, Heckel's algorithm fails to discover any common substring.
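To make the quoted limitation concrete, here is a minimal sketch of the matching passes of Heckel's technique as it is usually described: anchor on symbols that occur exactly once in both inputs, then grow matches forward and backward from those anchors and from the virtual start and end of the files. The function name `heckel_matches` and its return format are my own assumptions for illustration, not taken from either paper.

```python
from collections import Counter


def heckel_matches(old, new):
    """Return a list m where m[i] is the index in `old` matched to new[i], or None.

    Simplified sketch of Heckel-style matching (hypothetical helper name).
    """
    old_count = Counter(old)
    new_count = Counter(new)
    # Position of each symbol in `old`; only trusted when the symbol is unique.
    old_pos = {sym: j for j, sym in enumerate(old)}

    new_to_old = [None] * len(new)
    old_to_new = [None] * len(old)

    # Anchor pass: symbols occurring exactly once in both inputs must match.
    for i, sym in enumerate(new):
        if new_count[sym] == 1 and old_count[sym] == 1:
            j = old_pos[sym]
            new_to_old[i] = j
            old_to_new[j] = i

    # Forward pass: extend each match (and the virtual start) to the next pair.
    for i in range(-1, len(new) - 1):
        j = -1 if i == -1 else new_to_old[i]
        if j is None:
            continue
        ni, oj = i + 1, j + 1
        if (oj < len(old) and new_to_old[ni] is None
                and old_to_new[oj] is None and new[ni] == old[oj]):
            new_to_old[ni] = oj
            old_to_new[oj] = ni

    # Backward pass: extend each match (and the virtual end) to the previous pair.
    for i in range(len(new), 0, -1):
        j = len(old) if i == len(new) else new_to_old[i]
        if j is None:
            continue
        ni, oj = i - 1, j - 1
        if (oj >= 0 and new_to_old[ni] is None
                and old_to_new[oj] is None and new[ni] == old[oj]):
            new_to_old[ni] = oj
            old_to_new[oj] = ni

    return new_to_old


# Every symbol is duplicated, so there are no anchors and nothing matches.
print(heckel_matches("aabb", "bbaa"))
# With unique symbols, the moved block "cd" is found.
print(heckel_matches("abcd", "cdab"))
```

With `aabb` and `bbaa` no symbol is unique and the first and last symbols differ, so no pass finds a match, which is the failure case the quote describes.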

The first paper was mentioned in one answer and the second in another, both posted under a similar Stack Overflow question:
- Is there a diff-like algorithm that handles moving block of lines? - Stack Overflow
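As a baseline to compare those algorithms against, the following is a rough sketch using Python's standard `difflib.SequenceMatcher`. It assumes the goal in the question is to count how many words and how many sentences differ between `update1` and `update2`. The helper names `count_changes` and `split_sentences`, the sample texts, and the naive sentence splitter are all assumptions for illustration. This approach does not detect block moves; a moved block is counted as a removal plus an addition.

```python
import re
from difflib import SequenceMatcher


def count_changes(old_items, new_items):
    """Return (removed, added): items dropped from old and items new in new."""
    removed = added = 0
    matcher = SequenceMatcher(None, old_items, new_items, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


# Hypothetical sample inputs.
update1 = "The cat sat. It was happy."
update2 = "The cat slept. It was happy."

word_changes = count_changes(update1.split(), update2.split())
sentence_changes = count_changes(split_sentences(update1), split_sentences(update2))
print(word_changes, sentence_changes)
```

Splitting on whitespace and on sentence-ending punctuation is deliberately crude; real text with abbreviations or quotations would need a proper tokenizer.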