use a *semantically* correct type if possible?

Anonymous
Ranch Hand

Joined: Nov 22, 2008
Posts: 18944
Hi people,


I've got a slightly jazzy question here, so that's why I put it into the "advanced" forum. I believe there's not really a boolean answer to it; it's probably more a matter of fuzzy logic.

When you select a type for a variable, do you prefer a more specific type to a general type if the more specific type is *semantically* closer to what you want to express? This is hard to explain, so here's a sample:

I need a java.util.Map which guarantees keeping the insertion order. In my code, it would be perfectly okay to declare the variable as a Map and initialize it with "new LinkedHashMap()".

Everything would compile, and I'm still free to choose a different implementation later on. This may be cool, but if someone changes the "new LinkedHashMap()" to "new Hashtable()", the code is semantically broken because the insertion order is no longer guaranteed.

However, if I chose to declare the variable as a LinkedHashMap instead, I'd make sure the implementation class can't be changed to java.util.Hashtable, for example. Only LinkedHashMaps, which guarantee that the insertion order will be preserved, can be used.
This is great but I can't decide to change this to a different implementation later on.
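The code samples from the original post did not survive, so here is a minimal sketch of the two options being compared. The class name, variable names and sample keys are invented for illustration, and generics are used although the thread predates their common use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DeclaredTypeSketch {

    // Option 1: interface type on the left. The implementation can be swapped
    // freely, but nothing in the type says that insertion order matters.
    static List<String> keysViaInterfaceType() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("zebra", 1);
        map.put("apple", 2);
        map.put("mango", 3);
        return new ArrayList<>(map.keySet());
    }

    // Option 2: concrete type on the left. Assigning a Hashtable or HashMap
    // here would no longer compile.
    static List<String> keysViaConcreteType() {
        LinkedHashMap<String, Integer> map = new LinkedHashMap<>();
        map.put("zebra", 1);
        map.put("apple", 2);
        map.put("mango", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysViaInterfaceType());
        System.out.println(keysViaConcreteType());
    }
}
```

Both methods behave identically at run time; the only difference is how much the declaration tells the compiler and the next reader.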

What's your opinion?
Thanks heaps for any replies.


Greetz,
Dennis
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24168
    

In this particular case, there is a Boolean answer: if the code only works with LinkedHashMap, then use LinkedHashMap, not Map. Using Map when you need LinkedHashMap would be like using Object when you need String.

In this particular case, you could say that there's a "missing interface" in the API. For example, there's an interface "SortedMap" which TreeMap implements. Implementing this interface means that the Map iterator will return the keys in sorted order. If your program needed a Map with this property, it would generally be better to use SortedMap than TreeMap as a variable type. This would give you some flexibility, but still ensure proper operation.

Unfortunately, there's no InsertionOrderMap interface, so the only solution in your case is to use LinkedHashMap directly.
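As a rough illustration of the SortedMap point, the sketch below declares the variable as SortedMap and only names TreeMap in the constructor call. The class name and the sample data are made up for the example.

```java
import java.util.SortedMap;
import java.util.TreeMap;

final class SortedMapSketch {

    // The declared type promises sorted keys without naming TreeMap, so another
    // SortedMap implementation could be substituted without touching callers.
    static SortedMap<String, Integer> fruitCounts() {
        SortedMap<String, Integer> counts = new TreeMap<>();
        counts.put("pear", 3);
        counts.put("apple", 5);
        counts.put("fig", 1);
        return counts;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> counts = fruitCounts();
        System.out.println(counts.keySet());   // keys in sorted order
        System.out.println(counts.firstKey()); // smallest key
        System.out.println(counts.headMap("g").keySet()); // keys strictly below "g"
    }
}
```

Methods such as firstKey() and headMap() only exist on SortedMap, so the interface carries both the ordering promise and extra operations.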


Anonymous
Ranch Hand

Joined: Nov 22, 2008
Posts: 18944
Good point, Ernest, I was actually looking for an interface like "InsertionOrderMap". This would give me the best of both solutions. Now, as you suggested, I'm using LinkedHashMap directly.
Thanks!

As far as I can see, within the Collection API, there are three crosscutting aspects that should be covered by the Collection interfaces:
(1) element type (values only or key/value mappings)
(2) uniqueness of elements (no duplicate values or no duplicate keys, resp.)
(3) order guarantees (ordered/not ordered) with subaspect "order type" (insertion order vs. access order)
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Ernest Friedman-Hill:
Unfortunately, there's no InsertionOrderMap interface, so the only solution in your case is to use LinkedHashMap directly.


What exactly would be the point of such an interface?

The point of the SortedMap interface is that you can actually do more things with it than with a plain Map.

The only thing you can do with a LinkedHashMap you can't do with a Map seems to be removeEldestEntry. If your code doesn't need to call that method, I don't see any need for letting it know that it's operating on a LinkedHashMap.
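For reference, removeEldestEntry is a protected hook on LinkedHashMap that a subclass overrides, rather than a method client code calls. A minimal sketch, assuming a hypothetical helper named boundedMap, shows that even code relying on it can hand out a plain Map:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedMapSketch {

    // Returns a map that drops its oldest entry once it holds more than
    // maxEntries. Callers only see the Map interface.
    static <K, V> Map<K, V> boundedMap(final int maxEntries) {
        return new LinkedHashMap<K, V>() {
            private static final long serialVersionUID = 1L;

            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called by LinkedHashMap after each insertion.
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> recent = boundedMap(2);
        recent.put("a", 1);
        recent.put("b", 2);
        recent.put("c", 3); // "a" is evicted here
        System.out.println(recent.keySet());
    }
}
```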

Regarding a developer accidentally changing the Map implementation, it wouldn't take much more effort for him to change the type of the variable either. Anyway, I'd rather implement some unit tests to make sure that the code behaves as expected than couple the whole code to the LinkedHashMap.


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Anonymous
Ranch Hand

Joined: Nov 22, 2008
Posts: 18944
Ilja,

that's what my question was (partly) about.
An interface "InsertionOrderMap" would be useful even if it doesn't declare any methods at all. It could be used as a "tag" to express the semantic constraint (i.e. insertion order is maintained). It would prevent programmers from using java.util.Map implementations which don't fulfill this constraint.
The messy point about it is that interfaces are used for two different things.
The more important one is type checking by the compiler, and for this an empty ("tagging") interface is pointless. The other thing is semantics and communication between programmers. By using a non-functional interface in our case, we don't restrict programmers to the LinkedHashMap implementation, but we still express the order constraint. Like this, the programmer will see: "ah, I need an InsertionOrderMap here". Like this, the constraint has been communicated to the programmer. And he/she can only use implementations tagged with this interface, so nothing can actually go wrong.
[Ilja]: Regarding a developer accidentally changing the Map implementation, it wouldn't take much more effort for him to change the type of the variable either. Anyway, I'd rather implement some unit tests to make sure that the code behaves as expected than couple the whole code to the LinkedHashMap.

According to the basic principle of type/implementation separation, the developer would know he's in trouble if he changed the type. But he wouldn't expect to cause problems by choosing a different implementation.
I wouldn't rely on unit tests in this case. Even a java.util.Hashtable could theoretically maintain the insertion order in 99 out of 100 test runs, so this is a hard case for testing.

Add-on:
Tagging interfaces are actually used sometimes. One example is java.io.Serializable.
Don Kiddick
Ranch Hand

Joined: Dec 12, 2002
Posts: 580
It seems to me that by declaring the variable as a LinkedHashMap, you are reducing the flexibility of your code for a perceived benefit in code readability. By reducing the flexibility, I mean that if you decide to move to a solution that does not require insertion ordering, it could lead to a laborious refactoring if you've referenced LinkedHashMap n times in your program.

I personally value the flexibility over the readability. I'd hope someone wouldn't change the LinkedHashMap to some other sort of Map without due diligence... if your coworkers do things like this then you have much bigger things to worry about than these minutiae!

D.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Dennis:
By using a non-functional interface in our case, we don't restrict programmers to the LinkedHashMap implementation, but we still express the order constraint. Like this, the programmer will see: "ah, I need an InsertionOrderMap here". Like this, the constraint has been communicated to the programmer. And he/she can only use implementations tagged with this interface, so nothing can actually go wrong.


Unless the implementation isn't working, of course...

[Dennis]: I wouldn't rely on unit tests in this case. Even a java.util.Hashtable could theoretically maintain the insertion order in 99 out of 100 test runs, so this is a hard case for testing.


I think I would be able to write tests I'd feel quite confident with.

And if I'd see a line such as

Map myMap = new LinkedHashMap();

my first thought would be "why a LinkedHashMap", and my first guess on the answer "because it's needed". With all due respect, a developer who changes that without understanding what the class this code is residing in is doing needs to be shot instantly...

[Dennis]: Tagging interfaces are actually used sometimes. One example is java.io.Serializable.


That's true. I don't remember whether I ever used it as a type for a variable, though...

Having said that, I see your point, and actually don't intend to object strongly. I nevertheless don't feel that it is a Boolean answer - there are still pros and cons, as far as I can see.
Warren Dew
blacksmith
Ranch Hand

Joined: Mar 04, 2004
Posts: 1332
    
Don Kiddick:

It seems to me that by declaring the variable as a LinkedHashMap, you are reducing the flexibility of your code for a perceived benefit in code readability. By reducing the flexibility, I mean that if you decide to move to a solution that does not require insertion ordering, it could lead to a laborious refactoring if you've referenced LinkedHashMap n times in your program.

Er, he changes the declaration of the variable and the constructor. Done. Everywhere else in the code the methods are called on the variable name, which doesn't need to be changed.

[Don Kiddick]: I'd hope someone wouldn't change the LinkedHashMap to some other sort of Map without due diligence... if your coworkers do things like this then you have much bigger things to worry about than these minutiae!

I don't agree. I think a declaration with Map as the variable type and HashMap only in the constructor call means that the programmer has thought about it, and any map will do; HashMap might well have been selected because it just happens to be what the IDE suggested first, or happens to be the programmer's default, and it won't break the code to switch to a different kind of Map.

In contrast, a declaration with HashMap as the variable type means that the programmer has thought about it, and he has reasons for wanting a HashMap rather than some other kind of Map here.
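The declarations from this post were lost, so the sketch below is only a guess at the contrast being drawn. All names are invented; the describe method stands in for "everywhere else in the code", which is written against Map and never needs to change.

```java
import java.util.HashMap;
import java.util.Map;

final class DeclarationContrastSketch {

    // Reads as "any Map will do": only the constructor call names HashMap.
    static Map<String, String> anyMapWillDo() {
        Map<String, String> settings = new HashMap<>();
        settings.put("mode", "fast");
        return settings;
    }

    // Reads as "HashMap on purpose": the variable type itself is the concrete class.
    static HashMap<String, String> hashMapOnPurpose() {
        HashMap<String, String> settings = new HashMap<>();
        settings.put("mode", "fast");
        return settings;
    }

    // Downstream code depends on the interface only, so switching the
    // implementation touches the declaration and constructor and nothing else.
    static String describe(Map<String, String> settings) {
        return "mode=" + settings.get("mode");
    }

    public static void main(String[] args) {
        System.out.println(describe(anyMapWillDo()));
        System.out.println(describe(hashMapOnPurpose()));
    }
}
```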
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Warren Dew:
In contrast, a declaration with HashMap as the variable type means that the programmer has thought about it, and he has reasons for wanting a HashMap rather than some other kind of Map here.


In my experience, it more likely means that the programmer was an absolute beginner...
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Yeah, I tend to agree with Ilja on that last point. Most people who write code like that do it because they don't know any better, unfortunately. It's pretty rare that this occurs as a result of an informed, intelligent choice. (Though it certainly is possible.) If indeed there's some reason why it's really important that the Map be a LinkedHashMap rather than some other implementation, I'd probably put a comment there right next to the declaration, explaining why the LinkedHashMap is important. I don't put a lot of comments in my code (besides javadoc, usually) but something that unusual deserves a good comment. That doesn't guarantee it won't be changed, of course, but it gives you pretty good odds I think. The next thing to do is write some good unit tests that actually depend on the insertion order. That's the best way to enforce something like this, IMO.
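A sketch of what the comment-plus-test approach might look like. The class RecentFilesSketch and its methods are hypothetical, and the check is written as plain Java rather than with a test framework to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RecentFilesSketch {

    // LinkedHashMap on purpose: names() must return entries in the order they
    // were added. Do not swap in HashMap or Hashtable.
    private final Map<String, Long> sizesByName = new LinkedHashMap<>();

    void add(String name, long size) {
        sizesByName.put(name, size);
    }

    List<String> names() {
        return new ArrayList<>(sizesByName.keySet());
    }

    // A test that fails if the iteration order stops matching insertion order.
    static void testNamesComeBackInInsertionOrder() {
        RecentFilesSketch files = new RecentFilesSketch();
        files.add("zeta.txt", 10L);
        files.add("alpha.txt", 20L);
        files.add("mid.txt", 30L);
        List<String> expected = Arrays.asList("zeta.txt", "alpha.txt", "mid.txt");
        if (!files.names().equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + files.names());
        }
    }

    public static void main(String[] args) {
        testNamesComeBackInInsertionOrder();
        System.out.println("insertion order test passed");
    }
}
```

The keys are deliberately not in alphabetical order, so a sorted map would fail the check as well.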

[Dennis]: I wouldn't rely on unit tests in this case. Even a java.util.Hashtable could theoretically maintain the insertion order in 99 out of 100 test runs, so this is a hard case for testing.

Ehhh... that's pretty unlikely in my experience. Unless the keys use a really, really poor hashing function.

I do like the idea of an InsertionOrderMap interface here - seems like it would be useful here. I guess it's just not terribly useful in most other situations, so Sun hasn't seen a need for it. Understandable.
[ February 03, 2005: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
Originally posted by Jim Yingst:
I do like the idea of an InsertionOrderMap interface here - seems like it would be useful here. I guess it's just not terribly useful in most other situations, so Sun hasn't seen a need for it. Understandable.
It may also be that the Java APIs have been written by many different people. I've seen some pretty odd code in the Java sources that made me wonder about the skill level of the coder. Some things get thought through in detail; others are more rushed or don't get reviewed by as many people.

I'd just like to give a big ThumbsUp(tm) on the comment idea. If you see a declaration with a comment like that and change it to a HashMap, you've got some serious explaining to do. And you're buying lunch all next week.
Don Kiddick
Ranch Hand

Joined: Dec 12, 2002
Posts: 580
Warren Dew :

Er, he changes the declaration of the variable and the constructor. Done. Everywhere else in the code the methods are called on the variable name, which doesn't need to be changed.


True. I was thinking of the case where LinkedHashMap is used throughout the program (which I've seen on code I'm working on, with HashMap). You are correct though, there is no need to proliferate the dependency.

[changed CODE to QUOTE tag (I feel with you - happens to me regularly...) - Ilja]
[ February 07, 2005: Message edited by: Ilja Preuss ]
Anonymous
Ranch Hand

Joined: Nov 22, 2008
Posts: 18944

[Dennis]: I wouldn't rely on unit tests in this case. Even a java.util.Hashtable could theoretically maintain the insertion order in 99 out of 100 test runs, so this is a hard case for testing.

[Jim Yingst]: Ehhh... that's pretty unlikely in my experience. Unless the keys use a really, really poor hashing function.


However, in some areas, "unlikely" just ain't enough. I'm working for a company that produces medical equipment. I mustn't use any unit tests that rely on likelihood! In other words, the outcome of all tests must be deterministic. Other areas that come to my mind include vehicle control, power plants, and emergency management systems.
Apart from that, I agree with you.

I just observed that in some statements in this thread, we assume that it will be experienced developers using the code this is about, but think of APIs, OpenSource projects, etc. - what I mean is: you can't presume the people who are gonna use your code are smart. Most probably they will be, but don't rely on it.
Anonymous
Ranch Hand

Joined: Nov 22, 2008
Posts: 18944
Again, thanks everyone for all your interesting statements!
I really enjoyed this discussion.
Greetz
Dennis
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Dennis:
However, in some areas, "unlikely" just ain't enough. I'm working for a company that produces medical equipment. I mustn't use any unit tests that rely on likelihood! In other words, the outcome of all tests must be deterministic.


The above mentioned test *is* deterministic - it just doesn't cover all possible failure modes.

For code where it is *really* critical, I'd probably add more unit tests to cover more failure modes. If I were unsure about the test failing appropriately, I could imagine adding a test that makes sure using a non-LinkedHashMap fails my testcases. And I'd probably have it run every two hours, on every configuration used in production.

I think that would be much more effective than using the type declaration or the comment, though I might use them as complementary practices.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24168
    

We've had this same debate on the XP mailing list, haven't we? On the one hand, you have guys who are used to dynamically typed languages (Smalltalk, Ruby, etc.) to whom proving everything with tests seems very natural, and on the other extreme you have C++ guys, who want to do everything with the type system and prove things at compile time. Java guys have the luxury of sitting in the middle, and choosing what's best for a given situation.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Ernest Friedman-Hill:
Java guys have the luxury of sitting in the middle, and choosing what's best for a given situation.


And the opportunity to discuss at length when we'd choose what, and why...
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
[Ilja]: The above mentioned test *is* deterministic - it just doesn't cover all possible failure modes.

Well, depends on the class of your keys - how do they implement hashCode()? If they're just using the inherited method from Object, that can be random each time you run, on each different machine. However most classes that override hashCode() do provide nice deterministic behavior, as you say. It's not difficult to get consistent results from the unit tests in this case. I'm not sure where Dennis' "99 out of 100" comment comes from - if we've got a test with Strings as keys, for example, we should be able to get it to either pass consistently, or fail consistently.
[ February 07, 2005: Message edited by: Jim Yingst ]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Jim Yingst:
[Ilja]: The above mentioned test *is* deterministic - it just doesn't cover all possible failure modes.

Well, depends on the class of your keys - how do they implement hashCode()?


You're right - I should have said "can be made deterministic"...

[Jim Yingst]: I'm not sure where Dennis' "99 out of 100" comment comes from - if we've got a test with Strings as keys, for example, we should be able to get it to either pass consistently, or fail consistently.


Mhh, he might actually be right, as the iteration order of a HashMap already can vary with different sizes of the map.

Still, if you write a test with, say, 1000 Strings, I would already feel much more confident than with a comment or a more specific variable type.
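A test along those lines could be sketched as follows. The helper name keepsInsertionOrder and the key format are assumptions made for the example. Because String.hashCode() is specified in its Javadoc, the outcome for a given JDK implementation should be repeatable, although nothing guarantees that an unordered map fails the check.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InsertionOrderCheck {

    // Fills the given (initially empty) map with count String keys and reports
    // whether iterating the key set returns them in the order they were inserted.
    static boolean keepsInsertionOrder(Map<String, Integer> map, int count) {
        List<String> inserted = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String key = "key-" + i;
            inserted.add(key);
            map.put(key, i);
        }
        return inserted.equals(new ArrayList<>(map.keySet()));
    }

    public static void main(String[] args) {
        System.out.println("LinkedHashMap: "
                + keepsInsertionOrder(new LinkedHashMap<String, Integer>(), 1000));
        // With this many keys a HashMap is very unlikely to pass, but its
        // iteration order is unspecified, so that is an expectation, not a promise.
        System.out.println("HashMap: "
                + keepsInsertionOrder(new HashMap<String, Integer>(), 1000));
    }
}
```

Running the same check against a non-LinkedHashMap is one way to confirm that the test is actually capable of failing.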
Warren Dew
blacksmith
Ranch Hand

Joined: Mar 04, 2004
Posts: 1332
    
Jim Yingst:

I'm not sure where Dennis' "99 out of 100" comment comes from - if we've got a test with Strings as keys, for example, we should be able to get it to either pass consistently, or fail consistently.

I think he means you could easily get a Map class that happens to follow insertion order 99 times out of 100, likely including the deterministic case in the unit test, but fails in that other case, which will come up in actual use in the field. He feels that if you are willing to accept only 99% certainty in your unit tests, they aren't doing their job. I agree - accepting such tests would mean that on your 100th refactoring, you add an undetected bug that only surfaces in the field. In my experience, 100 refactorings only take a couple of months.

I don't think this specific situation would happen with Maps from the library, but I can imagine similar situations where it would happen.

Nothing prevents one from creating one's own interface and implementation, and it's pretty easy:
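The code that followed in the original post is missing. A minimal sketch of one way to do it, assuming the interface name InsertionOrderMap suggested earlier in the thread and an invented implementation name LinkedInsertionOrderMap:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Marker interface: declares no methods, only documents the ordering promise.
interface InsertionOrderMap<K, V> extends Map<K, V> {
}

// LinkedHashMap already iterates in insertion order by default, so the
// implementation only has to add the marker.
final class LinkedInsertionOrderMap<K, V> extends LinkedHashMap<K, V>
        implements InsertionOrderMap<K, V> {
    private static final long serialVersionUID = 1L;
}

final class InsertionOrderMapDemo {

    static List<String> keysInOrder() {
        // A plain HashMap or Hashtable cannot be assigned to this variable.
        InsertionOrderMap<String, Integer> map = new LinkedInsertionOrderMap<>();
        map.put("c", 3);
        map.put("a", 1);
        map.put("b", 2);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysInOrder());
    }
}
```

The compiler only checks that an implementation carries the marker, not that it really preserves insertion order, so tests for the implementation are still worth having.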

 