We have a large, multi-platform application written in C (with a small but growing amount of C++). It has evolved over the years with many features you would expect in a large, long-lived application.
It's much easier to make it more modular first. You can't really unit test something with a whole lot of dependencies. When to refactor is a tricky calculation: you really have to weigh the costs and risks against the benefits. Is this code something that will be reused extensively, or is it really not going to change? If you plan to continue to get use out of it, then you probably want to refactor.
It sounds, though, like you want to refactor. You need to start by breaking out the simplest utilities and building on them. You have your C module that does a gazillion things. Maybe, for example, there's some code in there that is always formatting strings a certain way. Maybe this can be pulled out into a stand-alone utility module. Once you've got your new string formatting module, you've made the code more readable. That's already an improvement. You are asserting that you are in a catch-22 situation. You really aren't: just by moving things around, you've made the code more readable and maintainable.
Now you can create a unit test for this broken-out module. You can do that a couple of ways: make a separate app that just includes your code and runs a bunch of cases in a main routine on your PC, or define a static function called "UnitTest" that executes all the test cases and returns 1 if they pass. The latter could be run on the target.
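For instance, here is a minimal sketch of that second option; format_version() and the header name are hypothetical stand-ins for whatever you broke out of the big module. The same UnitTest() routine can be called from a debug hook on the target, or from a tiny main() in a host-side test app:

```cpp
// string_format_test.cpp -- minimal sketch; format_version() and the
// header name are hypothetical stand-ins for the broken-out module.
#include <cstring>
#include <cstdio>

extern "C" {
#include "string_format.h"   // hypothetical broken-out utility module
}

// Runs all cases and returns 1 on success, 0 on failure, as described above.
static int UnitTest(void)
{
    char buf[32];

    format_version(buf, sizeof buf, 2, 5, 0);
    if (std::strcmp(buf, "2.5.0") != 0) return 0;

    format_version(buf, sizeof buf, 10, 0, 3);
    if (std::strcmp(buf, "10.0.3") != 0) return 0;

    return 1;
}

// Host-side driver; on the target, UnitTest() could instead be called
// from a debug menu or at boot in a test build.
int main()
{
    const int ok = UnitTest();
    std::printf("string_format: %s\n", ok ? "PASS" : "FAIL");
    return ok ? 0 : 1;
}
```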
Maybe you can't go 100% with this approach, but it's a start, and it may help you see other things that can easily be broken out into testable utilities.
Not sure whether this is still relevant, but I have a small piece of advice here. As I understand it, you are asking a methodological question about the incremental, non-invasive introduction of unit testing into a huge legacy code base with a lot of stakeholders protecting their swamp.
Usually, the first step is to build your testing code independently from all other code. Even this step is very complex in long-lived legacy code. I propose building your test code as a dynamic shared library with run-time linking. That allows you to refactor only the small piece of code under test rather than the whole 20K-line file, so you can start covering function by function without touching or fixing all the linking issues.
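One way to realize this (a rough sketch for a POSIX target; "legacy_module" and "parse_header" are made-up names): compile just the legacy file that contains the function under test into a shared object, then dlopen() it from a tiny driver. On Linux/glibc at least, lazy binding means function symbols the test never calls don't have to be resolved, so you can cover one function at a time without fixing every link issue up front.

```cpp
// test_driver.cpp -- rough sketch, POSIX only; file and function names are hypothetical.
//
// Build (Linux, as an example):
//   gcc -shared -fPIC legacy_module.c -o legacy_module.so
//   g++ test_driver.cpp -ldl -o test_driver
#include <dlfcn.h>
#include <cstdio>

// Signature of the one legacy C function this test covers.
typedef int (*parse_header_fn)(const char *line);

int main()
{
    // RTLD_LAZY: function symbols are bound when first called, so
    // unrelated unresolved references in the legacy file don't stop us.
    void *lib = dlopen("./legacy_module.so", RTLD_LAZY);
    if (!lib) { std::fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }

    parse_header_fn parse_header =
        reinterpret_cast<parse_header_fn>(dlsym(lib, "parse_header"));
    if (!parse_header) { std::fprintf(stderr, "dlsym: %s\n", dlerror()); return 1; }

    int failures = 0;
    if (parse_header("HDR:1;") != 0) ++failures;   // assumed: valid header accepted
    if (parse_header("") == 0) ++failures;         // assumed: empty input rejected

    dlclose(lib);
    std::printf("%d failure(s)\n", failures);
    return failures ? 1 : 0;
}
```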
Make using tests easy.
I'd start by putting the "runs automatically" part into place. If you want developers (including yourself) to write tests, make it easy to run them and see the results.
Writing a three-line test, running it against the latest build and seeing the results should be only one click away, and should not send the developer to the coffee machine.
This means you need a latest build, and you may need to change policies for how people work on the code, etc. I know that such a process can be a PITA with embedded devices, and I can't give any advice with that. But I do know that if running the tests is hard, no one will write them.
Test what can be tested
I know I run against common unit test philosophy here, but that's what I do: write tests for the things that are easy to test. I don't bother with mocking, I don't refactor to make code testable, and if there is UI involved I don't have a unit test. But more and more of my library routines have one.
I am quite amazed at what simple tests tend to find. Picking the low-hanging fruit is by no means useless.
Looking at it another way: you wouldn't plan to maintain that giant hairball of a mess if it wasn't a successful product. Your current quality control isn't a total failure that needs to be replaced. Rather, use unit tests where they are easy to do.
(You need to get it done, though. Don't get trapped into "fixing everything" around your build process.)
Teach how to improve your code base
Any code base with that history screams for improvements, that's for sure. You will never refactor all of it, though.
Looking at two pieces of code with the same functionality, most people can agree which one is "better" under a given aspect (performance, readability, maintainability, testability, ...). There are three hard parts:
The first point is probably the hardest, and it is as much a social question as an engineering one. But the other points can be learned. I don't know of any formal courses that take this approach, but maybe you can organize something in-house: anything from two guys working together to "workshops" where you take a nasty piece of code and discuss how to improve it.
As George said, Working Effectively with Legacy Code is the bible for this kind of thing.
However, the only way others on your team will buy in is if they see the benefit to them personally of keeping the tests working.
To achieve this, you need a test framework that is as easy as possible to use. Plan for other developers to take your tests as examples when writing their own. If they do not have unit testing experience, don't expect them to spend time learning a framework; they will probably see writing unit tests as slowing down their development, and not knowing the framework becomes an excuse to skip the tests.
Spend some time on continuous integration using CruiseControl, Luntbuild, CDash, etc. If your code is automatically compiled every night and the tests are run, developers will start to see the benefits when unit tests catch bugs before QA does.
One thing to encourage is shared code ownership. If a developer changes their code and breaks someone else's test, they should not expect that person to fix the test; they should investigate why the test is no longer working and fix it themselves. In my experience this is one of the hardest things to achieve.
Most developers already write some form of unit test, sometimes a small piece of throw-away code they don't check in or integrate into the build. Make integrating these into the build easy and developers will start to buy in.
My approach is to add tests for new code and as existing code is modified. Sometimes you cannot add as many or as detailed tests as you would like without decoupling too much existing code, so err on the side of the practical.
The only place I insist on unit tests is in platform-specific code. Where #ifdefs are replaced with platform-specific higher-level functions or classes, these must be tested on all platforms with the same tests. This saves loads of time when adding new platforms.
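As a small, hypothetical illustration of that pattern: the #ifdef (or a per-platform source file such as platform_time_posix.c / platform_time_win32.c) lives once behind a platform-neutral interface, and a single test is built and run unchanged on every platform.

```cpp
// platform_time.h -- hypothetical platform-neutral interface; the per-platform
// implementations (platform_time_posix.c, platform_time_win32.c, ...) hide the
// #ifdefs and OS calls.
#ifndef PLATFORM_TIME_H
#define PLATFORM_TIME_H
unsigned long platform_monotonic_ms(void);   // monotonic time in milliseconds
void platform_sleep_ms(unsigned long ms);    // sleep at least ms milliseconds
#endif

// platform_time_test.cpp -- the *same* test runs on every platform.
#include <cassert>
#include "platform_time.h"

int main()
{
    unsigned long before = platform_monotonic_ms();
    platform_sleep_ms(50);
    unsigned long after = platform_monotonic_ms();

    assert(after >= before);        // the clock never goes backwards
    assert(after - before >= 40);   // slept roughly the requested time
    return 0;
}
```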
We use boost::test to structure our tests; its simple self-registering functions make writing tests easy.
These are wrapped in CTest (part of CMake), which runs a group of unit test executables at once and generates a simple report.
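For what it's worth, a self-registering Boost.Test case looks roughly like this (ring_buffer is a made-up module): BOOST_AUTO_TEST_CASE registers each test automatically, so adding a case is just adding a function, and the resulting executable can be hooked into CTest with add_test().

```cpp
// ring_buffer_test.cpp -- hypothetical example of self-registering tests.
// In CMakeLists.txt:  add_executable(ring_buffer_test ring_buffer_test.cpp)
//                     add_test(NAME ring_buffer COMMAND ring_buffer_test)
#define BOOST_TEST_MODULE ring_buffer
#include <boost/test/included/unit_test.hpp>   // header-only variant, supplies main()

#include "ring_buffer.h"   // hypothetical module under test

BOOST_AUTO_TEST_CASE(starts_empty)
{
    ring_buffer rb(8);
    BOOST_CHECK_EQUAL(rb.size(), 0u);
}

BOOST_AUTO_TEST_CASE(push_then_pop_round_trips)
{
    ring_buffer rb(8);
    rb.push(42);
    BOOST_CHECK_EQUAL(rb.pop(), 42);
}
```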
Our nightly build is automated with Ant and Luntbuild (Ant glues the C++, .NET and Java builds together).
Soon I hope to add automated deployment and functional tests to the build.
I have worked on both greenfield projects with fully unit-tested code bases and large C++ applications that have grown over many years with many different developers on them.
Honestly, I would not bother attempting to get a legacy code base to the state where unit tests and test-first development can add a lot of value.
Once a legacy code base reaches a certain size and complexity, getting it to the point where unit test coverage provides a lot of benefit becomes a task equivalent to a full rewrite.
The main problem is that as soon as you start refactoring for testability you will begin introducing bugs. And only once you get high test coverage can you expect all those new bugs to be found and fixed.
That means you either go very slowly and carefully, and you do not get the benefits of a well unit-tested code base until years from now (probably never, since mergers etc. happen). In the meantime you are probably introducing some new bugs with no apparent value to the end user of the software.
Or you go fast, but have an unstable code base until you have reached high test coverage of all your code. (So you end up with two branches: one in production, one for the unit-tested version.)
Of course, this is all a matter of scale; for some projects a rewrite might take just a few weeks and can certainly be worth it.
We have not found anything to ease the transition from "hairball of code with no unit tests" to "unit-testable code".
How sad -- no miraculous solution -- just a lot of hard work correcting years of accumulated technical debt.
There is no easy transition. You have a large, complex, serious problem.
You can only solve it in tiny steps. Each tiny step involves the following.
Pick a discrete piece of code that's absolutely essential. (Don't nibble around the edges at junk.) Pick a component that's important and -- somehow -- can be carved out of the rest. While a single function is ideal, it might be a tangled cluster of functions or maybe a whole file of functions. It's okay to start with something less than perfect for your testable components.
Figure out what it's supposed to do. Figure out what its interface is supposed to be. To do this, you may have to do some initial refactoring to make your target piece actually discrete.
Write an "overall" integration test that -- for now -- tests your discrete piece of code more-or-less as it was found. Get this to pass before you try and change anything significant.
Refactor the code into tidy, testable units that make better sense than your current hairball. You're going to have to maintain some backward compatibility (for now) with your overall integration test.
Write unit tests for the new units.
Once it all passes, decommission the old API and fix what will be broken by the change. If necessary, rework the original integration test; it tests the old API, you want to test the new API.
Iterate.
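To illustrate the "overall" test in step 3 (all names here are made up): a characterization test pins down whatever the code does today, warts and all, so the refactoring and new unit tests in the later steps have a safety net.

```cpp
// report_characterization_test.cpp -- hypothetical characterization test.
// It asserts what the legacy code currently does, not what it should do.
#include <cassert>
#include <cstring>
#include <cstdio>

extern "C" {
#include "report.h"   // hypothetical header for the carved-out piece
}

int main()
{
    char out[256];

    // Expected strings were captured by running the existing code once and
    // copying its output verbatim; they define the "current behaviour".
    build_report_line(out, sizeof out, /*id=*/7, /*flags=*/0x3);
    assert(std::strcmp(out, "REP:0007;flags=03;") == 0);

    build_report_line(out, sizeof out, /*id=*/0, /*flags=*/0);
    assert(std::strcmp(out, "REP:0000;flags=00;") == 0);

    std::puts("report characterization test passed");
    return 0;
}
```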