Question
We have several C++ projects that were built from the same codebase. There are a lot of similarities and common code between them, but they were developed independently; source was not shared in any way. Classes and files will have been renamed even if the underlying code hasn't changed, and individual lines will have been tweaked, changed and replaced.
I'd like to be able to compare the different codebases and find out how much of the code is still the same. It can be fairly high level: a percentage of code that is the same is fine. I also need to be able to automate this process.
Is there a tool that I can run on the codebases and get some sort of report/assessment of how much is common?
Answer 1:
I don't have much experience with this sort of thing, but it made me think back to my school days, when our university would run everyone's code through a program to find cheaters. This brought me to the following link:
Source Code Similarity Detection
It names some open source and commercial software that should meet your needs.
Answer 2:
There is the Java tool Dude, part of the MOOSE software reengineering toolkit, by Richard Wettel. It is documented in his (master's?) thesis. MOOSE provides much more than just this; you might also want to look at his CodeCity.
I've used it on Java, C#, Delphi and XML. It should work OK on C++ too. For large codebases, don't forget to give it enough heap space, and start with a simple similarity metric.
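To give an idea of what a "simple similarity metric" could look like, here is a rough Python sketch. It is not how Dude works; it is just an assumed line-based approach: normalize whitespace, skip trivial lines, pool all lines per codebase so that renamed files don't matter, and report the share of lines that also appear in the other codebase. The function names, the extension list and the `min_length` cutoff are all made up for illustration.

```python
from collections import Counter
from pathlib import Path

# File extensions treated as C++ sources; adjust for your projects (assumption).
CPP_EXTENSIONS = {".h", ".hpp", ".hh", ".c", ".cc", ".cpp", ".cxx"}


def normalize(line: str) -> str:
    """Collapse all whitespace so indentation and spacing changes are ignored."""
    return " ".join(line.split())


def line_counts(text: str, min_length: int = 4) -> Counter:
    """Count normalized lines, skipping very short ones such as '{' or '};'."""
    counts = Counter()
    for raw in text.splitlines():
        line = normalize(raw)
        if len(line) >= min_length:
            counts[line] += 1
    return counts


def codebase_counts(root: str) -> Counter:
    """Aggregate line counts over every C++ file under root, ignoring file names."""
    total = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in CPP_EXTENSIONS:
            total += line_counts(path.read_text(errors="ignore"))
    return total


def percent_shared(a: Counter, b: Counter) -> float:
    """Percentage of lines in a that also occur in b (multiset intersection)."""
    total = sum(a.values())
    if total == 0:
        return 0.0
    shared = sum((a & b).values())
    return 100.0 * shared / total
```

You would call it as `percent_shared(codebase_counts("projectA"), codebase_counts("projectB"))`, where the two directory names are placeholders. Note that the number is asymmetric (it is relative to the first codebase), and that a line with a renamed class or variable counts as different, so treat the result as a lower bound rather than a real clone analysis.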
Answer 3:
It probably does not solve your problem entirely, but if you want to compare/diff/merge sources, I strongly recommend Beyond Compare from
http://www.scootersoftware.com/
It's the best by far. As far as I know, it's used by the makers of SO as well.
Answer 4:
See our CloneDR, which detects exact and near-miss code duplication. You could apply this across your two systems to see what they share. CloneDR works for a variety of programming languages, including C++.
Source: https://stackoverflow.com/questions/1461805/how-can-i-compare-similar-codebases