I would like to gain more experience in test-driven development, but because I'm relatively new to automated unit testing in general, I'm facing many problems with this topic.
I've already read some books, tutorials and online articles about unit testing, but in my opinion many authors use examples that are far too oversimplified. It's obviously not too hard to imagine reasonable tests for a simple algorithm that calculates some values depending on one or two integer parameters. But I think in practice the situation is often much more complicated.
A concrete example: I'm currently working on a little project that uses an algorithm to do object detection within an image file. It's based on a recursive region-growing algorithm that detects objects in a JPEG image and returns a collection of the detected objects. OK, this thing is already working fine, but in the future I would like to test such things, too. And I have no idea how this could be accomplished reasonably with unit tests. For example, I've read that one should not do real I/O in unit tests, for performance and other reasons. But how could I get meaningful test results without examining a real test image?
I hope you can see my problem! This was just an example, but I think there are many situations where the input and output data of some methods are not as simple as integers or strings. How would you test a method or algorithm like the one I've described above? Is there a way to use something like a "mock image" to test such functionality? Or am I perhaps missing a completely different approach?
While unit tests shouldn't touch the disk or network, that doesn't mean you shouldn't write automated tests for that stuff. It's just not unit tests. (I call those things integration tests.)
What I've done before for a piece of code that needed to draw graphs was to create the smallest test images I could think of, shrunk down to just large enough to exhibit the aspect of the capability I wanted to test for.
For example, I wrote tests that wrote one "node" onto a canvas and verified that the output had a one-pixel border, which in turn contained the exact bitmap of whatever the label was supposed to say (rendering the expected text using the expected font on a separate canvas and then scanning the actual output for that same bitmap).
Another example of the kind of tests I wrote was whether certain nodes were "connected" correctly when a given graph was rendered. The tests would effectively identify the nodes from the rendered output (identified as "any rectangle with a one-pixel border") and find the existing connections by following any outgoing trails of black pixels, again matching the rectangles into node objects with a bitmap comparison.
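A minimal sketch of that bitmap-scanning idea (not the original code; the names like findBitmap are mine, and I'm stamping a synthetic pattern instead of rendered text so the example stays self-contained):

```java
import java.awt.image.BufferedImage;

// Sketch: scan a rendered canvas for the exact position of a known
// small bitmap, the way the node tests matched expected label bitmaps.
public class BitmapScanSketch {

    // Return {x, y} of the first exact occurrence of 'needle' inside
    // 'haystack', or null if it occurs nowhere.
    static int[] findBitmap(BufferedImage haystack, BufferedImage needle) {
        for (int y = 0; y <= haystack.getHeight() - needle.getHeight(); y++) {
            for (int x = 0; x <= haystack.getWidth() - needle.getWidth(); x++) {
                if (matchesAt(haystack, needle, x, y)) {
                    return new int[] { x, y };
                }
            }
        }
        return null;
    }

    // Pixel-exact comparison of 'needle' against 'haystack' at (ox, oy).
    static boolean matchesAt(BufferedImage haystack, BufferedImage needle,
                             int ox, int oy) {
        for (int y = 0; y < needle.getHeight(); y++) {
            for (int x = 0; x < needle.getWidth(); x++) {
                if (haystack.getRGB(ox + x, oy + y) != needle.getRGB(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BufferedImage canvas = new BufferedImage(20, 20, BufferedImage.TYPE_INT_RGB);
        BufferedImage stamp = new BufferedImage(3, 3, BufferedImage.TYPE_INT_RGB);
        // Paint a distinctive 3x3 checker pattern and stamp it at (7, 5).
        for (int y = 0; y < 3; y++) {
            for (int x = 0; x < 3; x++) {
                int rgb = (x + y) % 2 == 0 ? 0xFFFFFF : 0xFF0000;
                stamp.setRGB(x, y, rgb);
                canvas.setRGB(7 + x, 5 + y, rgb);
            }
        }
        int[] pos = findBitmap(canvas, stamp);
        if (pos == null || pos[0] != 7 || pos[1] != 5) {
            throw new AssertionError("expected to find the stamp at (7, 5)");
        }
        System.out.println("found bitmap at (" + pos[0] + ", " + pos[1] + ")");
    }
}
```

In the real tests the "stamp" would be the expected label rendered onto a separate canvas with the expected font; the search logic stays the same.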
The first test took a couple of hours to write, if I remember correctly (I had to create some of the utilities), but after that it got easier and the resulting tests were quite nice to read.
The above example was also a rather simple one in that the "rules" were quite strict and straightforward to codify. Not all systems deal with such a set of rules and instead require a more fuzzy logic type of approach. For example, a couple of years back a client was developing an application that let the user create and interact with different kinds of maps. The graphics for a map weren't black and white pixels but rather like a screenshot from Google Maps, which obviously makes interpreting the map somewhat more difficult.
I wasn't working on the system with that client so I don't know how they dealt with the maps (I'm guessing with manual tests because their test coverage wasn't too high) but I'd imagine they could've isolated the overlaid components (the logic) from the topology maps (the background) and therefore simplified the interpretation logic.
I'd be curious to hear more about the kinds of things that your software needs to do and how you've planned on testing for them? I imagine that at least in some cases it might be sufficient to generate simplified input images (possibly generated programmatically from your tests) that you could then feed to the object detection logic, asserting that the detected object(s) are the ones you had drawn into the image. Then it's just a question of whether you can get by with those assertions being exact comparisons or whether fuzzy matching is required...
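To illustrate what I mean by programmatically generated input, here's a minimal sketch of such a test. The names (drawRectangle, countObjects) are mine, and the little flood-fill detector is just a stand-in for your actual region-growing algorithm:

```java
// Sketch of a test that generates its own input image: draw two
// non-touching rectangles of "object" pixels, run a simple 4-connected
// region count over the image, and assert we detect exactly those two.
public class ObjectDetectionSketch {

    // Count connected regions of 'true' pixels with an iterative flood fill.
    static int countObjects(boolean[][] img) {
        int h = img.length, w = img[0].length;
        boolean[][] seen = new boolean[h][w];
        int count = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (img[y][x] && !seen[y][x]) {
                    count++;
                    java.util.ArrayDeque<int[]> stack = new java.util.ArrayDeque<>();
                    stack.push(new int[] { y, x });
                    seen[y][x] = true;
                    while (!stack.isEmpty()) {
                        int[] p = stack.pop();
                        int[][] nbrs = { { p[0] - 1, p[1] }, { p[0] + 1, p[1] },
                                         { p[0], p[1] - 1 }, { p[0], p[1] + 1 } };
                        for (int[] nb : nbrs) {
                            if (nb[0] >= 0 && nb[0] < h && nb[1] >= 0 && nb[1] < w
                                    && img[nb[0]][nb[1]] && !seen[nb[0]][nb[1]]) {
                                seen[nb[0]][nb[1]] = true;
                                stack.push(nb);
                            }
                        }
                    }
                }
            }
        }
        return count;
    }

    // Draw a filled rectangle of "object" pixels into the test image.
    static void drawRectangle(boolean[][] img, int top, int left,
                              int height, int width) {
        for (int y = top; y < top + height; y++)
            for (int x = left; x < left + width; x++)
                img[y][x] = true;
    }

    public static void main(String[] args) {
        boolean[][] img = new boolean[10][10];  // tiny all-background image
        drawRectangle(img, 1, 1, 3, 3);         // first object
        drawRectangle(img, 6, 6, 2, 2);         // second object, not touching
        if (countObjects(img) != 2) {
            throw new AssertionError("expected 2 detected objects");
        }
        System.out.println("detected objects: " + countObjects(img));
    }
}
```

The point is that the test owns its input completely, so no image files or I/O are involved, and the assertion can be exact.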
Nice to meet you here, and thank you very much for your detailed answer!
As luck would have it, I'm currently reading a book about test-driven development written by someone called Lasse Koskela. I'm just at the beginning, but I already have to say that this is definitely one of the very best books on this topic. In particular, all the concrete practical examples are very helpful for getting a feel for how to do test-driven development in practice. I'm really looking forward to reading the rest of this great book. Congratulations!
Your answer was very helpful, too. At least now I know that it's not always possible to use only simple unit tests to test some functionality. I guess the border between "real" unit tests and your so-called integration tests is often somewhat blurry, isn't it? But I suppose it's inevitable that I'll need to gain some experience of my own to fully understand the concepts of test-driven development.
Anyway, your examples seem to be very similar to the example I've described, and I'm glad that you solved the problem in a similar way to what I intended. The idea of using real but small test images for the proof of concept sounds very reasonable to me.
Do you have a rule of thumb for when you call some tests unit tests and when they are more likely integration tests? And how do you treat these two kinds of tests differently? Just by running the integration tests less often? Are you using the JUnit framework for both sorts of tests, even though some are not real unit tests?
Of course I'd like to keep you up to date about this software project. The image recognition algorithm is just one part of my diploma thesis. There will be other interesting parts, like creating 2D and 3D presentations of some data structures or controlling a Lego Mindstorms robot, which will definitely lead to interesting and difficult tests, I think. I would be very glad if I could come back to you for help then.
As a rule of thumb, I'd call a test an integration test rather than a unit test if, for example:

- It can't run at the same time as any of your other unit tests.
- You have to do special things to your environment (such as editing config files) to run it.
With regards to treating the two kinds of tests differently, I just tend to run the integration tests in smaller sets than unit tests. What I mean by this is that when I've got unit tests, I can run dozens and dozens and dozens of them so quickly that it doesn't interfere with my flow. With integration tests that's not the case. Running more than a handful of integration tests can already take so much time (say, more than 5 seconds) that I'd rather just run the most relevant five while I'm developing (test first, of course) and only run the full suite of tests when I think I'm ready to commit the changes to the source repository.
And yeah, I use JUnit for both unit and integration tests.
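One convention that makes the split mechanical (my own suggestion, not something we've discussed) is to name integration test classes with an IT suffix and let Maven's Failsafe plugin run them in a separate phase, while Surefire keeps running the plain unit tests on every build. A minimal sketch of the build configuration:

```xml
<!-- Surefire runs *Test classes on every "mvn test"; Failsafe picks up
     *IT classes only during the integration-test phase, so the fast
     unit-test feedback loop stays fast. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With that in place, both kinds of tests are still plain JUnit classes; only the naming and the build phase differ.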