How to find a similar code fragment?

Submitted by 喜欢而已 on 2019-11-28 18:21:46
Razzie

You can use Simian. It is a tool that detects duplicate code in Java, C#, C++, XML, and many more formats (even plain text files). It also integrates nicely with a tool like CruiseControl.

Our CloneDR finds duplicate code, both exact copies and near-misses, across large source systems, parameterized by language syntax. It supports Java, C#, COBOL, C++, PHP, Python and many other languages.

It accepts a number of parameters to define "What is a clone?", including:

a) similarity threshold, controlling how similar two blocks of code must be to be declared clones (typically 95% is good)
b) minimum clone size in lines (3 tends to be a good choice)
c) number of parameters (distinct changes to the text; 5 tends to be a good choice)

With these settings, it tends to find 10-15% redundant code in virtually everything it processes.
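To make the three parameters above concrete, here is a minimal Python sketch of how such settings might be used to accept or reject a candidate pair. The names `CloneSettings`, `Candidate` and `is_clone` are hypothetical and are not CloneDR's actual interface; treating the parameter count as an upper bound is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloneSettings:
    # Hypothetical names; the defaults mirror the values suggested above.
    similarity_threshold: float = 0.95  # minimum similarity, 0.0 .. 1.0
    min_lines: int = 3                  # minimum clone size in lines
    max_parameters: int = 5             # assumed to be an upper bound


@dataclass(frozen=True)
class Candidate:
    similarity: float  # how alike the two fragments are, 0.0 .. 1.0
    lines: int         # size of the smaller fragment, in lines
    parameters: int    # number of distinct points where the fragments differ


def is_clone(c: Candidate, s: CloneSettings = CloneSettings()) -> bool:
    """Accept a candidate pair only if it passes all three filters."""
    return (
        c.similarity >= s.similarity_threshold
        and c.lines >= s.min_lines
        and c.parameters <= s.max_parameters
    )
```

For example, `is_clone(Candidate(similarity=0.97, lines=10, parameters=2))` passes all three filters, while a pair at 90% similarity is rejected.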

Line-oriented clone detection tools such as Simian can't find cloned code that has been reformatted, but CloneDR will. They may tell that two blocks of code match, but they usually don't show you exactly how they match or where the differences are; CloneDR will. They don't suggest how to abstract the cloned code; CloneDR will.

By virtue of having weaker matching algorithms, they tend to produce more false positives; when you get 5000 clones reported across a million lines, the number of false positives matters a lot.
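The limitation of line-oriented matching can be illustrated with a small Python sketch. This is not Simian's actual algorithm, only an assumed simplification: it compares windows of consecutive non-blank lines after stripping leading and trailing whitespace. The function name `duplicate_windows` is made up for the example.

```python
from collections import defaultdict


def duplicate_windows(source: str, window: int = 3) -> list[list[int]]:
    """Group 1-based start lines of identical runs of `window` non-blank lines.

    Lines are compared after stripping leading/trailing whitespace, so
    re-indentation is tolerated but any other reformatting is not.
    """
    lines = [
        (number, text.strip())
        for number, text in enumerate(source.splitlines(), 1)
        if text.strip()
    ]
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        key = tuple(text for _, text in lines[i:i + window])
        seen[key].append(lines[i][0])
    return [starts for starts in seen.values() if len(starts) > 1]


original = "total = 0\nfor x in items:\n    total += x\nprint(total)\n"
reformatted = "total = 0\nfor x in items: total += x\nprint(total)\n"

exact_copy_hits = duplicate_windows(original + original)
reformatted_hits = duplicate_windows(original + reformatted)
```

Here `exact_copy_hits` is non-empty because the pasted copy matches line for line, while `reformatted_hits` is empty: joining the loop body onto the `for` line is enough to hide the clone from a purely line-based comparison.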

Based on your example, I'd expect it to find those two fragments (you don't have to point to either one) and note that they are similar if you abstract away the variable names.
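As a rough illustration of what "abstracting away the variable names" means, the Python sketch below replaces every identifier and literal with a placeholder before comparing two fragments. It is a token-based stand-in built on the standard `tokenize` and `difflib` modules, not CloneDR's method, and the two sample fragments are invented for the example.

```python
import io
import keyword
import tokenize
from difflib import SequenceMatcher

# Token types that carry no structure and are dropped entirely.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
# Layout tokens kept as structural markers (their text is whitespace).
_LAYOUT = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}


def abstract_tokens(code: str) -> list[str]:
    """Tokenize Python code, replacing names and literals with placeholders."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type in _SKIP:
            continue
        if tok.type in _LAYOUT:
            out.append(tokenize.tok_name[tok.type])
        elif tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")    # any identifier
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")   # any numeric literal
        elif tok.type == tokenize.STRING:
            out.append("STR")   # any string literal
        else:
            out.append(tok.string)  # keywords and operators stay as-is
    return out


def similarity(a: str, b: str) -> float:
    """Similarity of two fragments in 0.0 .. 1.0 after abstraction."""
    matcher = SequenceMatcher(
        None, abstract_tokens(a), abstract_tokens(b), autojunk=False
    )
    return matcher.ratio()


fragment_a = "total = 0\nfor x in items:\n    total += x\n"
fragment_b = "acc = 0\nfor item in values:\n    acc += item\n"
```

With these inputs, `abstract_tokens(fragment_a)` and `abstract_tokens(fragment_b)` are the same sequence, so `similarity(fragment_a, fragment_b)` is 1.0 even though no identifier is shared. Comparing the result against a threshold such as 0.95 would give a crude near-miss detector; a syntax-tree comparison is stricter about structure than this token-level approximation.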

Here is the best collection of literature on code clone detection I've seen:

https://web.archive.org/web/20120502162147/http://students.cis.uab.edu/tairasr/clones/literature

There are many programs, but none of them seems to be clearly the best or the most popular. Decide what matters most to you and pick the one that suits your needs.
