There's a neat trick that I've been using for a while now to keep a handle on Magic Numbers in my code, particularly in my Unit Test code. I got into a good discussion about it again last week so thought I'd write up an article on it.
Jeanne Boyarsky wrote: Good technique. Much better than having a pile of poorly named constants at the top of the file.
If you had poorly named constants, then with this approach you are just going to replace them with poorly named methods. I don't quite see how this approach is better than having constants.
I do see problems, in that:
1.) You are still using magic numbers, because you are calling the 'annotation' methods with magic numbers. This is a problem because you now can't use code checkers (e.g. Checkstyle) to report on all magic numbers: such checkers would flag these uses too, since it's impossible to distinguish the uses that were 'annotated' from those that were not.
2.) If you wanted to refer to e.g. lowerBound(1) multiple times in the method or test, you would be repeating the value "1" to 'declare' the lowerBound, which is an anti-pattern and a refactoring problem caused by using magic numbers.
3.) It's not obvious that the method being called simply returns the value passed in, so one still has to read the method to find out.
Better is to define final local variables at the start of the test method (as part of the test setup). That way they are close to the code using them, they are no longer magic numbers, and static analyzers can still be applied to the code to weed out real magic numbers. Each value is set in one place, so there is no repetition, and the values are still named via the final variable names.
Great feedback. Let's explore some ideas for a bit, shall we?
So here is the test code with the magic numbers in it. The magic numbers are the "1" and the "10".
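For reference, here is a minimal sketch of the kind of test under discussion. The Range class, its constructor signature, and the isInRange method are my reconstruction of the example, and a plain runtime check stands in for JUnit's assertThat so the snippet runs standalone:

```java
// Minimal stand-in for the Range class under discussion
// (names and signatures are my reconstruction, not the original code).
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean isInRange(int value) {
        return value >= lower && value <= upper;
    }
}

class RangeTestSketch {
    public static void main(String[] args) {
        // The magic numbers: what do "1" and "10" mean at a glance?
        Range range = new Range(1, 10);
        if (!range.isInRange(5)) throw new AssertionError("5 should be in range");
    }
}
```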
The technique I discussed in the linked article was to wrap the magic numbers in methods as a means to give them some context. i.e. a name.
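In that technique, the wrapper is just an identity method whose only job is to name its argument. Roughly like this (the method names are illustrative):

```java
// Reconstruction of the "wrap the magic number in a method" technique.
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean isInRange(int value) {
        return value >= lower && value <= upper;
    }
}

class AnnotatedMagicNumbers {
    // "Annotation" methods: each returns its argument unchanged,
    // existing purely to name the literal at the call site.
    static int lowerBound(int value) { return value; }
    static int upperBound(int value) { return value; }

    public static void main(String[] args) {
        Range range = new Range(lowerBound(1), upperBound(10));
        if (!range.isInRange(5)) throw new AssertionError();
    }
}
```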
One feedback point was that it was unclear how this approach is better than having badly named constants. I'm going to say that we're sensible enough to give the constants meaningful names, so let's try it.
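With sensibly named constants, the test might look something like this (the constant names are my choice, and a plain check again stands in for the assertion):

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean isInRange(int value) {
        return value >= lower && value <= upper;
    }
}

class RangeTestWithConstants {
    // Meaningfully named, reusable constants -- in conventional SHOUTY_CASE.
    private static final int LOWER_BOUND = 1;
    private static final int UPPER_BOUND = 10;

    public static void main(String[] args) {
        Range range = new Range(LOWER_BOUND, UPPER_BOUND);
        if (!range.isInRange(5)) throw new AssertionError();
    }
}
```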
One advantage of this approach is that you can now reuse the constants. The downside for me is that the constants LOOK A BIT SHOUTY.
Another feedback point was to use local variables to hold the values. Here goes:
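Sketched with final locals, along the lines E Armitage suggested (again with a plain check in place of assertThat):

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean isInRange(int value) {
        return value >= lower && value <= upper;
    }
}

class RangeTestWithLocals {
    public static void main(String[] args) {
        // Final locals declared as part of the test setup,
        // right next to the code that uses them.
        final int lowerBound = 1;
        final int upperBound = 10;
        Range range = new Range(lowerBound, upperBound);
        if (!range.isInRange(5)) throw new AssertionError();
    }
}
```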
I'm not against this approach at all and I think it is better than using shouty constants. The values are declared close to the assertion which is a good thing. My only gripe is that I've just doubled the number of lines of code in my test.
So all of this has gotten me thinking a bit more deeply about this particular example and how I can improve upon it. My initial thought is to introduce a DSL (Domain-Specific Language) into my test case to help convey the intent. My first step was to abstract away the object creation of Range into a method that provides some context for the parameters.
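That factory method might look roughly like this (the name rangeBetweenBounds comes up later in the thread; the rest is my sketch):

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean isInRange(int value) {
        return value >= lower && value <= upper;
    }
}

class RangeDslTest {
    // Factory method that names the role of each parameter.
    static Range rangeBetweenBounds(int lowerBound, int upperBound) {
        return new Range(lowerBound, upperBound);
    }

    public static void main(String[] args) {
        Range range = rangeBetweenBounds(1, 10);
        if (!range.isInRange(5)) throw new AssertionError();
    }
}
```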
Then I decided I didn't need the local variable if I inlined it into the assertThat line.
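Inlined, the whole test collapses to a single line, roughly like this (with a plain check standing in for assertThat(..., is(false))):

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean isInRange(int value) {
        return value >= lower && value <= upper;
    }
}

class RangeDslTest {
    static Range rangeBetweenBounds(int lowerBound, int upperBound) {
        return new Range(lowerBound, upperBound);
    }

    public static void main(String[] args) {
        // One line: no local variable, the whole check reads in place.
        if (rangeBetweenBounds(1, 10).isInRange(11)) throw new AssertionError();
    }
}
```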
But then I didn't think it read very well. The "isInRange(11)" part just didn't look right. So I refactored that method name.
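After the rename (the thread settles on contains, as described a little further on), the one-liner reads something like:

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    // Renamed from isInRange: "contains" reads more naturally at the call site.
    boolean contains(int value) {
        return value >= lower && value <= upper;
    }
}

class RangeDslTest {
    static Range rangeBetweenBounds(int lowerBound, int upperBound) {
        return new Range(lowerBound, upperBound);
    }

    public static void main(String[] args) {
        if (rangeBetweenBounds(1, 10).contains(11)) throw new AssertionError();
    }
}
```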
At this point I'm pretty happy with how the test has turned out. The test case is down to a single line of code and I think the intent and language is clear.
While working on the readability of my test I discovered a better name for the "isInRange" method in the production code. Renaming it to "contains" has now made the method more intuitive to use and portrays its intent more clearly. I started off trying to improve my tests, and ended up improving my production code too. For me that's a massive win.
E Armitage: I'm awarding you a cow for inviting me to think quite hard about your feedback.
A static code checker will still report magic numbers in your last test for the uses of 10 and 11, but won't report them if you declare them as final local variables (unless you exclude the rule). So you would have no way of enforcing that magic numbers are not used in the code if you allow that line.
Some people don't mind test cases not conforming to the same rules as their production code, and may actually skip static code analysis on test cases. It's good to be consistent, however: if you don't do it for production code, then maybe don't do it in the tests either.
P.S. Thanks for the cow (no idea what to do with it, though).
1. The example is getting in the way of the idea. IMO, it's too trivial a case to warrant more than a few seconds of consideration. I favor the rule of Least Surprises; as you mentioned, the role each number plays is fairly obvious from the usage, and the well-chosen class name "Range" provides good context in this case.
2. Also, the motivation to refactor a "magic number" is not as strong because there's no duplication. Duplication is really the tipping point for me when I decide whether or not to refactor a hard-coded value into a constant (the approach that offers least surprise to anyone coming in fresh to the code). I am not beholden to anything and everything a static analyzer tells me.
3. If you really want to have code that's more DSL-ish, I find a Builder fits the bill in most cases:
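A builder along these lines, perhaps (the from/to/build method names are my guesses at the shape Junilu had in mind, not his actual code):

```java
class Range {
    private final int lower;
    private final int upper;

    private Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean contains(int value) {
        return value >= lower && value <= upper;
    }

    static Builder builder() { return new Builder(); }

    static class Builder {
        private int lower;
        private int upper;

        Builder from(int lower) { this.lower = lower; return this; }
        Builder to(int upper)   { this.upper = upper; return this; }
        Range build()           { return new Range(lower, upper); }
    }
}

class BuilderTest {
    public static void main(String[] args) {
        // Each value gets a named step, so nothing is "magic" at the call site.
        Range testRange = Range.builder().from(1).to(10).build();
        if (!testRange.contains(5)) throw new AssertionError();
    }
}
```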
4. I also prefer ThoughtWorker and author Neal Ford's approach to naming tests:
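Ford's style favors long, underscored, sentence-like test names. Something in this spirit (the particular names are invented for illustration, and the @Test annotations are omitted so the sketch compiles without a JUnit dependency):

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean contains(int value) {
        return value >= lower && value <= upper;
    }
}

// Sentence-like, underscored test names that document the design.
class RangeContainsTest {
    static boolean range_contains_a_value_between_its_bounds() {
        return new Range(1, 10).contains(5);
    }

    static boolean range_does_not_contain_a_value_above_its_upper_bound() {
        return !new Range(1, 10).contains(11);
    }

    public static void main(String[] args) {
        if (!range_contains_a_value_between_its_bounds()) throw new AssertionError();
        if (!range_does_not_contain_a_value_above_its_upper_bound()) throw new AssertionError();
    }
}
```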
Neal's take is that when it comes to tests, clarity trumps convention so the unconventional names are fine because they are easier to read and they document the design more clearly.
I'm a bit disappointed. Seeing the title I expected a solution where each such magic number would have a Java annotation like @MagicNumber("lowerBound") and @MagicNumber("upperBound"), and then there would be an injection of the actual value that lives in a properties file or some other central place.
Junilu, I tend to agree with pretty much everything you have said.
I tried to keep the example fairly trivial so as not to get bogged down in unnecessary detail, but this code will likely be repeated a number of times with different values being used to construct a Range. I'm also not too concerned with the complaints of a static analyser over good readability.
Good call on using the builder pattern. I had not thought about using it for this example, but I remember you talked with me a while back in this thread about using it for another problem I was facing at the time. I do really like how it reads and I would probably take it one step further to abstract away the "new" keyword. Something like:
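With a static factory (statically imported in real test code), the new keyword disappears from the test entirely. A sketch under that assumption (the aRange name is my invention):

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean contains(int value) {
        return value >= lower && value <= upper;
    }
}

class RangeBuilder {
    private int lower;
    private int upper;

    // Static factory hides "new"; with "import static ...RangeBuilder.aRange"
    // the test line reads as a little sentence.
    static RangeBuilder aRange() { return new RangeBuilder(); }

    RangeBuilder from(int lower) { this.lower = lower; return this; }
    RangeBuilder to(int upper)   { this.upper = upper; return this; }
    Range build()                { return new Range(lower, upper); }
}

class FactoryTest {
    public static void main(String[] args) {
        Range testRange = RangeBuilder.aRange().from(1).to(10).build();
        if (!testRange.contains(5)) throw new AssertionError();
    }
}
```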
Totally agree with you on the test naming (un)convention. I would consistently use something along these lines:
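For example (these particular names are my guesses at the pattern; @Test annotations omitted so the sketch compiles without JUnit on the classpath):

```java
class Range {
    private final int lower;
    private final int upper;

    Range(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    boolean contains(int value) {
        return value >= lower && value <= upper;
    }
}

// A consistent method_behavior_condition naming pattern for tests.
class RangeTest {
    static boolean contains_returns_true_when_value_is_within_bounds() {
        return new Range(1, 10).contains(5);
    }

    static boolean contains_returns_false_when_value_is_above_upper_bound() {
        return !new Range(1, 10).contains(11);
    }

    public static void main(String[] args) {
        if (!contains_returns_true_when_value_is_within_bounds()) throw new AssertionError();
        if (!contains_returns_false_when_value_is_above_upper_bound()) throw new AssertionError();
    }
}
```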
I've not heard of Neal Ford before. Going to go look him up.
With respect to the refactoring you did, I agree the method name isInRange makes the code read awkwardly. contains() is an improvement. I personally would have done a search to see what the real-world naming conventions are for something like Range and probably would have settled on includes(), again because it's going to be more familiar and intuitive and it offers the least surprises. I really would question whether rangeBetweenBounds() is any better than range() -- IMO, the name "range" gives enough context for the purpose of the two hard-coded values, whereas "rangeBetweenBounds" just makes the code more chatty without adding much of anything that intuitiveness can't fill. (Yes, the long test method names may seem chatty, but they also add substance.)
Some questions that may come up with this iteration:
1. What's the chance that we'll have to change the upper and lower bounds in these tests?
2. Is the duplication of having the testRange created in each test method worth eliminating by extracting it to the setup() method?
3. Do we really want to have more than one assertion in the true case, or do we break that out into three different cases: arg_is_lower_bound, arg_is_upper_bound, and arg_between_bounds?
4. Is there sufficient motivation to create constants for the duplicated hard-coded values 1 and 10?
5. Is it worth spending more time refactoring this or is it good enough for now because it has few or no surprises?
Simon Brown, in his new book "Software Architecture for Developers" (#sa4d) refers to "Refactor Distractor", the insidious cousin of "Analysis Paralysis" -- know when your code is "good enough" so you can avoid falling into Refactor Distractor's trap.
While reading your posts my mind was running over some other approaches I could take with this, and eventually I reckon you just end up going round in circles. It's nice to be able to give this a name: the Refactor Distractor. I like it.
We've explored quite a lot of different solutions in a short space of time today, but I don't think there is a perfect solution. It all depends on the context, and unfortunately, when working with trivial nonsense examples like the one here, we are missing any larger context to work within. Each different approach could be the "best" one for a given domain.
The value here is knowing your options. In the real world you would explore a number of solutions and approaches and then choose the one that works best for your domain, you, and your team.