One of our desktop applications is now over seven years old. At one point, we had a fairly large team; budget cuts have changed that. It's a very popular application with lots of users, and it continues to evolve. Our small team is still actively working on the code.
The application is very modular. It makes heavy use of the command pattern, and has its own embedded language. The commands operate on a user-visible central data structure which is kind of like a filesystem tree.
The full unit test suite -- almost 20,000 tests -- takes 3 minutes to run on a fast machine, and we can live with that.
The functional (application) test suite, however, is really giving us heartburn. There are well over 3000 tests, and they're extremely non-orthogonal. The tests all check much more program output than they should. Many of the tests dump out part or all of that tree data structure before and after running a command, and almost all of them navigate the tree in some way. Now it's become terribly hard to make changes to the default layout of that data structure, and user feedback has put us under strong pressure to do so.
It's gotten to the point where almost any non-trivial change to the program will break hundreds of tests, virtually all of them nothing but "collateral damage." With our tiny team, it's getting to the point where we spend more time fixing broken application tests than anything else -- i.e., fixing the tests, not the code. For what it's worth, almost without exception, the unit test suite catches any problems before the application tests are even run.
So my question: how do you decide when it's time to "cut bait?" When is it OK to delete, say, the 250 oldest, dumbest application tests: the ones that break most often? We just don't have the resources to rewrite them all from scratch right now, although we could slowly add them back in over time as the related parts of the code were modified. We'd do it more wisely this time.
One of Murphy's Corollaries states that whatever integration tests you jettison will turn out to include the one that would have saved your hide.
You can triage the tests, placing emphasis on the most common and/or critical use cases. However, if the internal structure is that mutable, it's probably an indication that you really need a good and reliable set of integration tests.
Actually, if the internal structure is that mutable, it's probably an alert that something's overly fragile and you're probably spending a lot of time keeping it internally consistent in the actual product itself.
An IDE is no substitute for an Intelligent Developer.
Interesting problem (the opposite of the usual one).
I know this problem from putting too much implementation knowledge into test cases. It happened through misuse of mock frameworks, where I more or less reprogrammed the class under test by setting up the mocks' expected behaviour. When I changed the production code slightly and refactored things, many tests broke even though everything was fine from the point of view of external behaviour (brittle tests). I solved this by making the production classes more loosely coupled (organizing class dependencies better) and injecting more general mocks into the classes. That way the set-up step in the test cases became more focused and things got better. If you are using mocks, maybe this helps you too.
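To illustrate the mock point, here is a minimal sketch in Python using the standard unittest.mock module. The names ListCommand, GuardedListCommand, StubStore and the children() method are hypothetical and invented for this example; they are not taken from the application discussed in this thread.

```python
from unittest.mock import Mock


class ListCommand:
    """Hypothetical command: returns the sorted child names under a path."""

    def __init__(self, store):
        self.store = store

    def run(self, path):
        return sorted(self.store.children(path))


class GuardedListCommand(ListCommand):
    """Same external behaviour, after a refactoring that adds an emptiness check."""

    def run(self, path):
        if not self.store.children(path):
            return []
        return sorted(self.store.children(path))


def brittle_test(command_class):
    store = Mock()
    store.children.return_value = ["b", "a"]
    assert command_class(store).run("/docs") == ["a", "b"]
    # Over-specified: this pins down HOW the collaborator is used,
    # not what the command returns.
    store.children.assert_called_once_with("/docs")


class StubStore:
    """General-purpose stub: canned data, no expectations about calls."""

    def children(self, path):
        return ["b", "a"]


def robust_test(command_class):
    # Only the externally visible result is checked.
    assert command_class(StubStore()).run("/docs") == ["a", "b"]
```

Both tests pass for ListCommand. For GuardedListCommand, robust_test still passes, while brittle_test fails on the call-count expectation even though the result is identical.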
If your problem comes instead from the non-mock verification part of the tests:
1) You said that a change to the data structure made many test cases break. The reason could be a lot of duplication inside your asserts (access to the data structure). Maybe you could extract the common bits of data-structure access, so that if the structure changes you only need to change one area of code.
2) Maybe introducing an interface for data access could help. Then you can pass in stubs from the tests and be shielded from changes to production code that is not the focus of the test.
3) You could create more coarse-grained asserts (the Custom Assertion pattern) to avoid verification duplication.
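Points 1) and 3) could look roughly like the sketch below. It assumes the tree can be read as nested dictionaries, which may not match the real application; LAYOUT, find_node and assert_node_has are hypothetical names made up for the example.

```python
# The ONLY place in the test code that knows where things live in the tree.
# (Hypothetical logical name and path.)
LAYOUT = {
    "user_settings": ("config", "user"),
}


def find_node(tree, logical_name):
    """Walk from the root to the node registered under a logical name."""
    node = tree
    for key in LAYOUT[logical_name]:
        node = node[key]
    return node


def assert_node_has(tree, logical_name, **expected):
    """Custom assertion: check only the named fields and ignore everything else."""
    node = find_node(tree, logical_name)
    for field, value in expected.items():
        actual = node.get(field)
        assert actual == value, (
            f"{logical_name}.{field}: expected {value!r}, got {actual!r}"
        )


tree = {"config": {"user": {"theme": "dark", "font": "mono"}}, "cache": {}}
assert_node_has(tree, "user_settings", theme="dark")
```

If the default layout later moves the settings node, only the entry in LAYOUT has to change; the tests that call assert_node_has stay as they are.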
Generally I would feel very uncomfortable chucking test cases away. Nevertheless, test cases should only test a small part of the system, and each one should try to "isolate" a single behaviour. As you seem to have test cases testing the same things over and over, they are redundant and add more overhead than help, so I would kick them out too.
On the question of which tests to throw away: you could look at the set of test cases that break after trivial changes. After that you could run a code-duplication tool over them, see which parts show duplication, and treat those as good candidates for test redundancy.
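As a rough sketch of the first step, assume you can collect, for each trivial change, the list of application tests that failed afterwards. That input format, the function name most_fragile and the test names below are all made up for illustration.

```python
from collections import Counter


def most_fragile(failure_history, top_n):
    """Rank tests by the number of runs in which they failed.

    failure_history: one list of failed test names per test run.
    """
    counts = Counter()
    for failed_tests in failure_history:
        # set() so a test is counted at most once per run.
        counts.update(set(failed_tests))
    return counts.most_common(top_n)


# Hypothetical failures recorded after three trivial changes.
history = [
    ["test_dump_tree", "test_rename"],
    ["test_dump_tree"],
    ["test_dump_tree", "test_rename", "test_copy"],
]
print(most_fragile(history, 2))
# prints [('test_dump_tree', 3), ('test_rename', 2)]
```

The tests at the top of this list are the ones worth checking first for duplicated set-up and verification code.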
I learned a lot about test organization and test automation from the book xUnit Test Patterns. It should give you many insights, too. This book is a real delight! [ May 19, 2008: Message edited by: manuel aldana ]
Rather than throwing away the tests, I'd probably prefer to *really* fix them. That is, every time one breaks because of "collateral damage", I'd not only make it work again, but also try to find a way to make the test at least a little bit more stable against that kind of change.
I'd actually expect that after doing that for some time, I'd not only have somewhat more stable tests, but also quite a good idea of how to restructure the tests in general so that they make more sense, without losing any of the value I get from them.
The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
I'd do something similar to Ilja. Maybe a set of custom asserts so my custom logic was only in that one place. I'd also update the tests to only test the "output they should" in an attempt to make them less brittle.
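One possible way to test only the "output they should" is to assert on what a command changed rather than on a full dump before and after. The sketch below assumes the tree can be represented as nested dictionaries and that None is never a legitimate leaf value; flatten and changes are hypothetical helper names, not part of any existing framework.

```python
def flatten(tree, prefix=()):
    """Map every leaf to its path, e.g. {("a", "x"): 1}."""
    flat = {}
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            # Leaves and empty subtrees are stored as-is.
            flat[path] = value
    return flat


def changes(before, after):
    """Return {path: (old, new)} for every leaf that differs.

    A missing leaf is reported as None (assumption: None is not a real value).
    """
    b, a = flatten(before), flatten(after)
    return {
        path: (b.get(path), a.get(path))
        for path in b.keys() | a.keys()
        if b.get(path) != a.get(path)
    }


before = {"a": {"x": 1, "y": 2}, "b": {"z": 3}}
after = {"a": {"x": 5, "y": 2}, "b": {"z": 3}}
assert changes(before, after) == {("a", "x"): (1, 5)}
```

A test written this way still knows the path of the node its command touches, but it no longer breaks when unrelated parts of the default layout are added, removed or rearranged.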
At the same time, I can understand the desire to get rid of some old useless tests. I think the key would be to agree on what makes up a useless test ahead of time. If the decision gets made when the tests are failing, there is too much temptation to just ignore them. For me, duplication would be the main criterion. For example, we used to have some integration tests that should have been unit tests (these were mainly validations). Those I would feel comfortable removing if they were already covered by unit tests.
Overall, I'm antsy about removing tests. Refactoring, yes. Making faster, yes. Removing duplication, yes. Removing tests, not so much.
One thing that might help is to consider what the TestNG creators would say. I tend to disagree with them on the "real life means it is ok not to test as much" thing. Personally, I think this means spending the time later when there are problems. But if you are leaning that way, it is probably better to do so in a controlled manner.