aspose file tools*
The moose likes Java in General and the fly likes Idiom => Item 27 from Bloch's book Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Java in General
Bookmark "Idiom => Item 27 from Bloch Watch "Idiom => Item 27 from Bloch New topic
Author

Idiom => Item 27 from Bloch's book

Pho Tek
Ranch Hand

Joined: Nov 05, 2000
Posts: 761

Bloch's Item 27, "Return zero--length arrays, not nulls" advices that one should not return null when the return value if an array. The benefit is that you will not need to do a check for null in the client code. Very logical.

I propose the following two corollaries. I'd like to hear your criticisms and comments from your experience

Item 27a: Return empty collections, not nulls.
Given a method signature like these:

In the event that the methods have "nothing" to return (i.e. empty collections), they should NOT return null but Collections.EMPTY_LIST, Collections.EMPTY_MAP, Collections.EMPTY_SET.


Item 27b: Return a Proxy Class, not nulls.
Given a method signature like this:

We want an alternate way of informing the client (vis-�-vis null checks) that we found nothing that matches the search parameter - in this case, lifeId. The Proxy Class will have a "valid" flag.

// Life is an interface!
if (!Life.isValid()) {
// handle the problem e.g. retries etc...
}


What do you think ?

Pho


Regards,

Pho
Jeroen Wenting
Ranch Hand

Joined: Oct 12, 2000
Posts: 5093
Personally I don't believe in not using null to indicate no data, after all that's exactly what null means...

In the case that you're only using the result for display purposes it's indeed nice to return an empty collection/array/whatever and I do so, but when the result is used for other purposes I'd now need to check whether the size of the collection is zero instead of whether the collection is null.
The latter is a simple comparison of a reference, the latter involves method calls thus making the code harder to read and fractionally slower (more than fractionally slower if there's a lot of such comparisons compared to the length of the code filling the collection and handling the result).


42
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24183
    
  34

What Pho describes, in general, is called the "Null Object Pattern." I suppose you either love it or hate it. The XP and Refactoring folks love it, and I'm one of them. Using null objects this way simplifies a lot of code.

Jeroen objects to the "isValid()" method, and actually so do I. Asking if a Life object isValid() is obviously just as messy as asking whether its null. Much better (and this works best if you've got a well-factored and nicely OO design) is simply to give Life's methods null implementations; i.e.,

[code]
for (int i=0; i<9; ++i) {
Life life = aCat.getLife(i);
life.live();
}

The null Life object could simply have an empty "live()" method.

If you don't have Martin Fowler's "Refactoring", go buy it, and read the section on the Null Object Pattern.


[Jess in Action][AskingGoodQuestions]
Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Originally posted by Jeroen Wenting:
Personally I don't believe in not using null to indicate no data, after all that's exactly what null means...
Yes and no, depending on your semantics. Yes, that's exactly what it means, but no, that's not at all what Bloch's item is about. There's a subtle difference between "an empty data set" and "no data". The problem is that an awful lot of developers confuse the two and use a null value inappropriately.

Bloch's getCheeses() example is essentially a finder method. If the finder doesn't find anything (no cheese in stock), it should return a zero-element array (or empty Collection, or...). Not a null. Never a null. You might return a null if you never stock cheese at all, although an exception would be a better implementation choice in that case. But if you simply find no cheese, then that's a legitimate result and there's no reason whatsoever to represent that in a different way.

A good example of appropriate use of null is HttpServletRequest.getParameter. An empty data set (a parameter that does exist, but is empty) is represented by an empty string, whereas no data (no parameter of the given name does exist) is represented by a null value.

The null object pattern is similar. Bloch seems to be thinking mainly about data. The null object pattern is often about behaviour. For example, in our company we use a caching framework that implements a Cache interface. Sometimes we want to disable caching altogether. We could simply assign null to our caches, but that would litter the code with tests whether the cache is null or not. A waste of space and a source of bugs. Rather, we use a NullCache class that performs no caching. Other than that, it's a valid implementation of the Cache interface and satisfies the Cache contract; it's a drop-in replacement wherever you use a Cache.

The latter is a simple comparison of a reference, the latter involves method calls thus making the code harder to read and fractionally slower (more than fractionally slower if there's a lot of such comparisons compared to the length of the code filling the collection and handling the result).
Not so. The JVM will probably inline such methods if called often - and it's very, very unlikely that it would be a bottleneck in the first place. And code that uses nulls inappropriately is in my experience bulkier, harder to read and more bug-prone. Code that handles an array (or collection) will usually work fine even when you stick in a zero-length result. If you'd use nulls, you better stick in checks everywhere, in every code path no matter how obscure, or it's NPE galore.

- Peter
[ June 02, 2004: Message edited by: Peter den Haan ]
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
I agree the empty set is so much easier to deal with.

Joining the off-topic thread on null object ... here's an example I came up with during a training class. Recommended in training materials for a class that can run in CMT or BMT:

One of the rules of QualityJavaCode is no duplication. All those tests for null should raise a red flag. We have to do this many times in many classes, every time an opportnity to forget and cause a serious fault. Here comes the null object to save the day:

No tests for null. And you can quickly guess exactly what NullTransaction does in each of those methods: absolutely nuthin! Is that cool or what?


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Pho Tek
Ranch Hand

Joined: Nov 05, 2000
Posts: 761

I don't really want to do nothing.

But I think I can adapt the NullObject pattern to do something

under the covers. Thanks.



Regards,

Pho
Manish Hatwalne
Ranch Hand

Joined: Sep 22, 2001
Posts: 2578

Just posting to say that I loved this discussion.
It's pity that Martin Flower's "Refactoring" is not available here in India.

- Manish
David Hibbs
Ranch Hand

Joined: Dec 19, 2002
Posts: 374
Originally posted by Peter den Haan:
We could simply assign null to our caches, but that would litter the code with tests whether the cache is null or not. A waste of space and a source of bugs.


I fail to see how checking for nulls would be a source of bugs--I hope you are implying that the *setting* of the cache to nulls rather than the null checks themselves.

IMHO, it is a good idea to always check for nulls--especially when someone other than you is going to be maintaining the code (or the API you are using). Let's face it--who gets to build new systems? The experienced developers. Who has to maintain them? Generally speaking, it's the newcomers, fresh out of training. 2 years from now, some newcomer to your code will go make an enhancement or fix and change the code to return a null in some obscure case because they don't know your pattern. Or a different, pluggable layer (i.e. a different JDBC driver or messaging system) will be implemented to replace the current one. The code will suddenly blow up and they won't have a clue why.


And code that uses nulls inappropriately is in my experience bulkier, harder to read and more bug-prone. Code that handles an array (or collection) will usually work fine even when you stick in a zero-length result. If you'd use nulls, you better stick in checks everywhere, in every code path no matter how obscure, or it's NPE galore.


Bulkier maybe, but how hard is it to read (myObj!=null) ?!?!
While I agree that it's best to return an empty set or NullProxy or something similar, it's better to be safe when you don't know what a method might return or what some vendor might do. Simply put, it's called defensive programming. NOT checking things like this has gotten Microsoft into trouble many times. Granted, C is much more dangerous in this regard than Java, but why leave it to chance?


"Write beautiful code; then profile that beautiful code and make little bits of it uglier but faster." --The JavaPerformanceTuning.com team, Newsletter 039.
Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Originally posted by David Hibbs:
I fail to see how checking for nulls would be a source of bugs--I hope you are implying that the *setting* of the cache to nulls rather than the null checks themselves.
Sticking with the example, the very fact that the cache might be null and you have to check for that whenever you use it is a source of bugs - after all, you only need to forget once, in some obscure corner of your code. All that additional clutter won't help either.

IMHO, it is a good idea to always check for nulls--especially when someone other than you is going to be maintaining the code (or the API you are using).
I beg to differ. Paranoia programming like this is a very good idea on external interface points[1], but a bad habit otherwise. Rather, establish clear contracts. I have a simple understanding with the team: if you're not clear about the contract, read the javadoc. If that fails, read the unit tests. If the contract is ambiguous, disambiguate it and document the fact.

And no, we're not seeing NullPointerException galore. Not at all. Even though we don't take things to extremes like these.

2 years from now, some newcomer to your code will go make an enhancement or fix and change the code to return a null in some obscure case because they don't know your pattern.
If your newcomer tucks into the code without first examining the contract then God help you. Or, hopefully, your unit tests may help you. You'll have much bigger problems than a null check.

In my experience, paranoia coding does not help against incompetence. It merely increases the volume of your code, making it harder to understand and more prone to bugs. API usage probably becomes more ambiguous as well; different developers will develop different habits, nobody understands the crucial qualitative difference between "new Foo[0]" and "null" anymore, and you end up in exactly the situation that Josh Bloch rightly argues against.

Clear semantics, well-defined contracts, lean and coherent code. These do miracles for maintainability.

Bulkier maybe, but how hard is it to read (myObj!=null) ?!?!
If only it were so simple. How much harder isTo read thanAnd how much harder once it's embedded in a method with other code, other code that itself bloats in similar ways? How much harder is it if, while modifying code, you have to keep track of where you are within a null check (so you don't need to check again) and outside one? Of course a single simple null check does not represent significant complexity, I would argue that the cumulative effect is definitely significant.

- Peter

[1]When using other code, especially when contracts aren't clear, it might return null when it shouldn't and it's best to anticipate that. For client code, throw an IllegalArgumentException with a clear message if it tries to use a null where the contract prohibits it..
David Hibbs
Ranch Hand

Joined: Dec 19, 2002
Posts: 374
Originally posted by Peter den Haan:

Sticking with the example, the very fact that the cache might be null and you have to check for that whenever you use it is a source of bugs - after all, you only need to forget once, in some obscure corner of your code.


Aha, I think I see the source of our disagreement. Your discussion is about the cache itself, while mine is about the value returned from the cache.

If you have an accessible variable in your code such as the cache, I understand the desire to avoid checks. But if you come to a point where the cache returns null you need to check for it, i.e.

has a completely different significance. This implies that cache is a true cache that can hit or miss and is, in truth, ignorant of how to create the objects it contains--this is why you have a factory, I would assume.

(On a tangent, wouldn't it be be wiser to have a factory first go to the cache before creating a new object and query the factory rather than querying the cache? This allows you to change caching rules at the factory level rather than the application level, amont other things...)

I agree that, in an ideal situation in a perfect world, clear semantics, well-defined contracts, javadoc, and unit tests should avoid these issues. However, here are some problems:
Javadoc and unit tests can be changed.
Javadoc has a nasty habit of becoming out of date without close watch.
Unit tests are only good if run.
Unit tests can be changed.
Unit tests are only as good as they were originally written.


[1]When using other code, especially when contracts aren't clear, it might return null when it shouldn't and it's best to anticipate that.


Unless I misread my writing, this is exactly what I was saying. "IMHO, it is a good idea to always check for nulls--especially when someone other than you is going to be maintaining the code (or the API you are using)" although perhaps I should qualify this a bit as saying "always check return values for nulls."

As was previously stated, "different developers will develop different habits" and "nobody understands the crucial qualitative difference between "new Foo[0]" and "null" anymore" -- so why should I trust those maintaining the API to maintain a contract that they may think is outdated?
Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Originally posted by David Hibbs:
[...] This implies that cache is a true cache that can hit or miss and is, in truth, ignorant of how to create the objects it contains--this is why you have a factory, I would assume. On a tangent, wouldn't it be be wiser to have a factory first go to the cache before creating a new object and query the factory rather than querying the cache?
Exactly the other way around - the cache knows what factory to use in case of a cache miss, so that (1) client code can simply call cache.get(key) and not care if it's a cache hit or miss and (2) factory code can simply create the object and not care about caching at all. Have the cache do as much as possible. The code I showed illustrates what you'd end up doing if the cache were no in lieu of using a NullCache: call the factory by hand.

This allows you to change caching rules at the factory level rather than the application level, amont other things...)
Each object type typically has its own cache - but multiple caches can share a single strategy. But we digress. Badly

Javadoc and unit tests can be changed.
As soon as you do that you're changing the contract. Doing without assessing the impact of the change and making corresponding modifications to client code is a recipe for disaster. Really, I don't see how this is germane to the entire null issue. It is just one out of the million ways a contract might be changed. Why single it out for your worries? What makes it so special?

Javadoc has a nasty habit of becoming out of date without close watch.
See above. Besides, Checkstyle does a good job keeping a close watch over here

Unit tests are only good if run.
Presumably if you have 'em, you utilise the investment by running 'em

Unless I misread my writing, this is exactly what I was saying. "IMHO, it is a good idea to always check for nulls--especially when someone other than you is going to be maintaining the code (or the API you are using)" although perhaps I should qualify this a bit as saying "always check return values for nulls."
Although we're clearly not as far apart as it seemed at first glance, we're not that close either. This was about third party code, and particularly where the contracts aren't clear: if the contract unambiguously states that a returned value won't be null I'll accept it as such. Which leaves the vast majority of code that wasn't written by me and/or won't be maintained by me, and is thereby miles away from "always check return values for nulls."

[...] why should I trust those maintaining the API to maintain a contract that they may think is outdated?
Why indeed? Because if an API can suddenly change in arbitrary and undocumented ways, you're stuffed. It is too unstable to use.

I'm obviously missing something here. Out of all the things that might change in a contract, why do we specifically have to be anticipate the eventuality that null values might pop out of methods that previously vowed never to return them? Collections might suddenly contain different objects - class hierarchies might change completely - objects might be differently initialised, or not at all - methods might start throwing an UnsupportedOperationException - nested objects could stop being Serializable - casts might stop working - a given method call might be "refactored" to perform a totally different function. Why not anticipate all of these too?

Within the scope of an application, code should do none of these (at least not without proper refactoring in place). It's called "sanity". If you don't have that, your project is dead.

Client code should be distrusted. So yes, you probably do check null arguments and throw a nice exception if you find them; just like you distrust the object types in collections and so on. It's part of making your API robust and user-friendly. That doesn't involve relaxing your contracts though, just being precise in reporting contract violations.

For library code, a level of trust needs to be established on a case by case basis. If you can't trust a library to adhere to its contract, all you can do is junk it. It's unusable. If its contract is ambiguous, anticipate any variations that the contract allows. If its contract is unambiguous and trustworthy, count yourself blessed and use it to write lean code that is easy to understand.

Ultimately, the proof of the pudding is the eating. I appreciate that different environments may give different results. If your colleagues or tools are very sloppy about null values, you may be forced to liberally sprinkle checks around - but I'd say you had a problem that was worth rectifying. We (small to medium-sized projects for external customers, usually with liberal quantities of open-source and some commercial libraries and frameworks) have become quite disciplined about contracts (including null values) and are seeing far fewer NPEs, not more, even though we do few null checks. Our code has also become more robust and easier to understand. For us, it works, which by itself alraedy seems to invalidate "always check return values for nulls" as a categorical statement.

- Peter
[ June 07, 2004: Message edited by: Peter den Haan ]
David Hibbs
Ranch Hand

Joined: Dec 19, 2002
Posts: 374
Originally posted by Peter den Haan:
Exactly the other way around - the cache knows what factory to use in case of a cache miss


Ahh, well, in that case I was confused by simply calling it a cache. It's really more of a cached-result-factory. So yes, I'll agree that the NullCache in your design is an elegant solution.

re: changing doc/tests:

As soon as you do that you're changing the contract. Doing without assessing the impact of the change and making corresponding modifications to client code is a recipe for disaster.


Quite true. The issue that I have in mind is more the case of an company-internal library or framework where the API docs and contract aren't as explicit, either because the doc was outdated originally (and tests not written either due to negligence or the time of writing--JUnit and XP are not particularly old notions, after all). So it for an affected application to be overlooked, particularly if the person maintaining the library is not familiar with every app using the library.

<digression>
BTW, speaking of XP, here's a link you might find interesting... it's titled "The Case Against Extreme Programming" -- http://www.softwarereality.com/lifecycle/xp/case_against_xp.jsp
</digression>

Besides, Checkstyle does a good job keeping a close watch over here.


Here too. However, it is not a perfect answer as the sentence following the @param or @return tag may need changed.


I'm obviously missing something here. Out of all the things that might change in a contract, why do we specifically have to be anticipate the eventuality that null values might pop out of methods that previously vowed never to return them?


The only thing you've missed is that many methods that return values do not explicitly vow never to return nulls...


Which leaves the vast majority of code that wasn't written by me and/or won't be maintained by me, and is thereby miles away from "always check return values for nulls."


OK OK, I give! My initial writing was a bit extreme in that I failed to clearly qualify my "always" assertion. It's easy to miscommunicate in type.

Here's what I meant to say...

Client code should be distrusted. So yes, you probably do check null arguments and throw a nice exception if you find them; just like you distrust the object types in collections and so on. It's part of making your API robust and user-friendly. That doesn't involve relaxing your contracts though, just being precise in reporting contract violations.

For library code, a level of trust needs to be established on a case by case basis.


Trust of internal libraries is something that needs to be established; if someone else might end up touching it without me knowing (not likely around here maybe, thanks to Anthill but certainly not unusual in a large number of shops), it's better to protect against those libraries changing when they do not explicitly state whether they might return nulls. Either that or go change the library yourself...
Roger Chung-Wee
Ranch Hand

Joined: Sep 29, 2002
Posts: 1683
We want an alternate way of informing the client (vis-�-vis null checks) that we found nothing that matches the search parameter - in this case, lifeId.

Would it not be cleaner to throw a NoLivesLeftException? And if you make it a checked exception, the client would be forced to handle or declare it. To handle the exception cannot be a problem as the client must expect it (be able to deal with it).


SCJP 1.4, SCWCD 1.3, SCBCD 1.3
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923

I don't think so.
Searches often find no match.
Using an exception for normal flow control isn't a good style.

A method:

might return a LIVE_INVALID=-1, but in other circumstances, where every integer could be a valid result, only an Exception seems to be a possibility to handle 'nothing'.
can return a 'null' or a special LiveID which means 'nothing', and a Collection
might return null, or an empty Array (or an Array of special LiveIDs meaning nothing???
Solution 3 seems somehow idiotic - you could return an Array of 17 Invalid-LiveIDs with the same reason as one.

I guess if your class is a kind of collection, it should behave similar, and perhaps support the programmer by methods like 'boolean hasMoreLiveIDs ()', to avoid using Exceptions for flowcontrol.


http://home.arcor.de/hirnstrom/bewerbung
Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Fully agreed that an exception is an inappropriate choice for outcomes that are part of normal operation. It is not appropriate here, unless running out of lives can justifiably be considered an exceptional event - for example, arguably, in a game.

Not entirely sure about your third suggestion. If you want to return a collection (or array) of all matching lifeIds, then wouldn't it be more realistic to haveThen obviously if there's nothing matching the criteria, an empty array is the only reasonable thing to return (as per Bloch's item 27).

- Peter
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Idiom => Item 27 from Bloch's book