aspose file tools*
The moose likes OO, Patterns, UML and Refactoring and the fly likes Polymorphism OO Design Question Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Engineering » OO, Patterns, UML and Refactoring
Bookmark "Polymorphism OO Design Question" Watch "Polymorphism OO Design Question" New topic
Author

Polymorphism OO Design Question

Chris Nappin
Ranch Hand

Joined: Aug 04, 2005
Posts: 36
I run FindBugs on our project code, and one of the warnings I noticed recently was that I am using "instanceof" quite a lot, instead of nice general polymorphic methods.

This leads me to a question - am I structuring my code wrongly or not?

Lets say I have a hierarchy of classes, A, B and C. Not very imaginative, I know.

Any common code can be extracted into A, then inherited into B and C, nice and simple.

Any common behavior (but with different implementations) can be declared as an abstract method in A, implemented in B and C, then called via Polymorphic dispatch. Again, nice and simple.

However suppose I have some behavior in B that is completely unique to B, and would make no sense to exist in C. Then I have a piece of generic code that knows about the whole hierarchy of classes. How can I know whether to trigger some behavior in instances of B, other than using instanceof, then casting and calling the method(s) only present in B?

Ta.


Technical Architect, SCJP, SCWCD
Peer Reynders
Bartender

Joined: Aug 19, 2005
Posts: 2922
    
    5
Originally posted by Chris Nappin:
How can I know whether to trigger some behavior in instances of B, other than using instanceof, then casting and calling the method(s) only present in B?


This is an indication that your abstraction/interface on A needs some work. In the simplest (but not the best) case you could just promote the B method to the A level and have C do nothing when that method is called. However it is more likely that you need to have a good look at the code that uses instanceof. Some of that code may need to go inside of A as a new suitably named method that makes sense for most, if not all subclasses of A. Then inside of that method, call template methods that the subclass can override - in which case B would put its code there, while C ignores it.
Chris Nappin
Ranch Hand

Joined: Aug 04, 2005
Posts: 36
Originally posted by Peer Reynders:


This is an indication that your abstraction/interface on A needs some work. In the simplest (but not the best) case you could just promote the B method to the A level and have C do nothing when that method is called. However it is more likely that you need to have a good look at the code that uses instanceof. Some of that code may need to go inside of A as a new suitably named method that makes sense for most, if not all subclasses of A. Then inside of that method, call template methods that the subclass can override - in which case B would put its code there, while C ignores it.


Thanks for the suggestion. Unfortunately you are saying to move a method unique to B into A, then add a default implementation of that method to A. That default implementation would have to do nothing, since that functionality makes no sense in the other subclasses (e.g. C). I think this may cause more confusion for Junior developers, and bloat A with unrelated concepts?

I have seen other solutions where Enums or ids are defined for each class (B, C) and code checks that instead of using "instanceof", however that seems to be treating the symptom rather than the cause.
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Can you move the code specific to each subtype into the subtype behind a common method?


Or put the code specific to each subtype into its own Strategy? Get the right strategy by some attribute or classname. It might be a singleton or a new instance every time.


Do either of those appeal to you? If there will be new subtypes or changes in their individual behaviors this kind of thing helps you "close" this code for modification and keep it open for extension with new subtypes.


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Chris Nappin
Ranch Hand

Joined: Aug 04, 2005
Posts: 36
Originally posted by Stan James:
Can you move the code specific to each subtype into the subtype behind a common method?


I think that would be identical to the previous suggestion?

Originally posted by Stan James:

Or put the code specific to each subtype into its own Strategy?


Stategy is a nice pattern I'd consider if the amount of code involved was non-trivial. However if we are only talking about half a dozen lines of code in class B, then creating extra factories, interfaces and implementations would be a massive overkill?
Peer Reynders
Bartender

Joined: Aug 19, 2005
Posts: 2922
    
    5
Without getting more concrete about your particular implementation it going to be difficult to argue the point.

Have a look at Swich Statement code smell and Polymorphism.

There you see the "nasty" instanceof being used to decide which area calculation formula to use (and which methods need to be called). The solution - have the actual object calculate its own area - the pain goes away.

Your case is probably less obvious. Step back and generalize what that method (the one that uses instanceof) does to the As. You may not be able to push that entire method into the "interface" of A but you may be able to identify a support method that makes sense on all As (similar to getArea, getPerimeter) where the B subclass then can do its own thing.

Agile Skills, Chapter 3: Removing Code Smells PDF has a similar type of discussion.
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Yeah, my choice #1 was awful close to Peer's previous suggestion. I latched onto where he had the possibility of moving some code into A and missed the option to move none into A.

As far as "massive overkill" in #2 it depends on your tolerance for swarms of tiny classes, in your case including some that do nothing at all, and the benefits. My tolerance for those is fairly high compared with adding a new if-else switch for every new subclass. I do some things like this at work where I try to maintain a stable core with the ability to add new products and lines of business in every release. The mapping of argument types to strategies is all in configuration so we rarely touch the core code. YMMV.

Let us know what you settle on!
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Chris Nappin:
Thanks for the suggestion. Unfortunately you are saying to move a method unique to B into A, then add a default implementation of that method to A.
That default implementation would have to do nothing, since that functionality makes no sense in the other subclasses (e.g. C).


Yes. Not the worst solution, in my opinion.

I think this may cause more confusion for Junior developers, and bloat A with unrelated concepts?


The problem with the instanceof operator is that your usage probably violates the Single Choice Principle: When I introduce a new subclass of A, I will probably need to touch - or at least check - all those places that test for the subtype of the A instance. Junior developers might very well forget that and easily introduce bugs that way (as well as experienced developers, I might add).


I have seen other solutions where Enums or ids are defined for each class (B, C) and code checks that instead of using "instanceof", however that seems to be treating the symptom rather than the cause.


I totally agree.

The principle in action here is "Tell, don't ask" - that is, you shouldn't ask an object for it's type or something, and than do different things based on the answer, but simply tell the object to do something for you, and let the object decide how to do it. After all, that's what polymorphism is all about - if you don't use it, we could as well program in a procedural language.

I also agree, though, that the proposed solution might be a little bit simplistic. It's really hard to give more concrete advice without a concrete example, though...


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Andrew Laughlin
Greenhorn

Joined: Jan 02, 2007
Posts: 19
I essentially have the same question as Chris and I'm having trouble putting this together in my head. We know it's desirable to program to an interface. If class B in Chris' example contains a completely unique method, how should the design of the inheritance hierarchy proceed? It seems there are 2 options available.

1) Put everything in class A.
Assuming A is an interface or abstract base class, it becomes the proverbial 'red neck lawn' as we dump all common and unique members in here so base class pointers have access to all derived functionality. Z.DoSomethingUnheardOf() gets put in class A.

2) Do type checking with switch statements in all code that we can get our hands on.

Both options make me throw up a little in my mouth. I thought base classes were for common code and derived classes for more specific code. If this is true, how then do we program to an interface while still having access to unique, derived functionality?


Regards,
Andrew
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Andrew:
I essentially have the same question as Chris and I'm having trouble putting this together in my head. We know it's desirable to program to an interface. If class B in Chris' example contains a completely unique method, how should the design of the inheritance hierarchy proceed? It seems there are 2 options available.


There are quite probably more than just those 2 options available. For example it might well be that the introduction of an abstraction might help. It's kind of hard to discuss this without a concrete example, though. Do you have one?

I thought base classes were for common code and derived classes for more specific code. If this is true, how then do we program to an interface while still having access to unique, derived functionality?


One way to do this would be to find an abstraction for which it makes sense to have an empty default implementation of an operation in the base class, and let subclasses re-implement the operation with their non-empty implementation.
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
Chris Nappin wrote However suppose I have some behavior in B that is completely unique to B, and would make no sense to exist in C. Then I have a piece of generic code that knows about the whole hierarchy of classes. How can I know whether to trigger some behavior in instances of B, other than using instanceof, then casting and calling the method(s) only present in B?

Andrew wrote I essentially have the same question as Chris and I'm having trouble putting this together in my head. We know it's desirable to program to an interface. If class B in Chris' example contains a completely unique method, how should the design of the inheritance hierarchy proceed?

One thing that seems missing from both of these discussions (and it may be because they are theoretical rather than real cases) is why "generic" code would want to call a method specific to class "B" at all?

Let's take a slightly more concrete example and work this through.



From the above we can see that our "generic" code expects to be passed a list of implementations of Converter, and calls the "convert" method on each of the objects as appropriate.

Now we can look at some possible implementations of Converter.



Both of the above are straightforward implementations. They have no extra methods and fit in with the processing in the Generic method.

Now let's introduce a new implementation FileBasedConverter, with an extra method:



The new method is never called in the Generic method. If a FileBasedConverter object is placed in the list and given to Generic.convertString, it will never do any converting, because the "file" member will never be initialized. So we need to solve this in some way.

There are several solutions, some of which have been mentioned earlier in this thread, but also some which don't seem to have been mentioned. First, I'll tackle a few solutions which I don't really recommend:

  • Modify Generic to use instanceof. This is the solution which prompted the initial question. We could add code to Generic.convertString which checks each Converter to see if it is an instanceof FileBasedConverter and calls the prepareFile method if it is. This is clumsy and fragile for lots of reasons already discussed.
  • Modify Converter to add the prepareFile method. This solution was also in the original discussion. This is ugly because it also requires changing the other implementations to add a method which does nothing (and worst of all, is obviously specific to a particular type of converter).
  • Modify FileBasedConverter to automatically call the prepareFile method. This solution initially looks like a good idea. Just add a call to "prepareFile" at the start of "convert". However, this might not be such a good idea in practice. If the FileBasedConverter is already used by other code, then this change might break the other code. It's also pretty inefficient, especially in the case where the file is not available or not readable. Every time "convert" is called, it will try again to read the file, and potentially throw an exception which will be logged. If this convert process happens a lot, then the log file will fill up with unneeded exceptions. In some limited circumstances, though, this (or a variant of it) might be the simplest solution which gets the job done.


  • Now on to a few more solutions. Which (if any) of these is "best" is entirely dependent on the actual situation in which they are considered. They should probably be in the "toolbox", though.

    Add one or more generic methods to Converter.

    We definitely don't want to add a "prepareFile" method, but maybe we have discovered that several kinds of Converter actually need a separate initialization step. Perhaps this is something which ought to be part of Converter, for example:



    This has modified the Converter interface, so the other implementations will also need to change, but this is an opportunity to introduce an abstract base class with a more meaningful name:



    Note that the use of AlwaysReadyConverter is just for convenience and to reduce typing. No code depends on whether a any particular Converter class extends from it, and any defaulted methods from it could be placed in-line in a new class just as easily.


    Create an Adapter

    If we want to leave the Generic method the same, and avoid changing the existing Converter instances, but also don't want to modify the existing FileBasedConverter, we could make an "Adapter" class, for example:



    Has that made any sense?
    [ January 03, 2007: Message edited by: Frank Carver ]

    Read about me at frankcarver.me ~ Raspberry Alpha Omega ~ Frank's Punchbarrel Blog
    Ilja Preuss
    author
    Sheriff

    Joined: Jul 11, 2001
    Posts: 14112
    Originally posted by Frank Carver:

    Has that made any sense?


    Absolutely!
    Ricky Clarkson
    Ranch Hand

    Joined: Jul 27, 2006
    Posts: 131
    I'd like to see a case where the original poster used instanceof, that he thinks might be ok. Then I'd refactor it and post the code here. I only use instanceof/casting when using APIs that don't provide any polymorphic stuff, like when implementing equals(Object).
    Andrew Laughlin
    Greenhorn

    Joined: Jan 02, 2007
    Posts: 19
    Thanks for taking the time to put this together Frank. Your effort is appreciated.

    It seems to me (consider the source) the design method you're presenting could be stated this way: "If classes are tightly coupled, create an abstraction, then decouple. Repeat." Would this be accurate?

    Let me present a concrete example and see how this methodology would apply. I'm currently creating an ID3 tag reader for mp3 files. Hopefully the pseudo code below is clear enough. Composition is used heavily.


    public class MP3File
    {
    ID3Tag objID3Tag;
    }

    public class ID3Tag
    {
    Header objHeader;
    ExtHeader objExtHeader;
    }

    public class Header
    {
    void CommonMethod();
    }

    public class ExtHeader
    {
    void CommonMethod();
    }

    ID3 tags are based on an evolving specification, and version 2 may have features version 1 lacks and vice versa. Consider the release of version 2.

    public class ID3TagV2
    {
    HeaderV2 objHeader;
    ExtHeaderV2 objExtHeader;
    }

    public class HeaderV2
    {
    private int V2Field;

    void CommonMethod();
    void OnlyInV2();
    }

    public class ExtHeaderV2
    {
    void CommonMethod();
    }


    It seems to me, when version 2 and subsequent versions are released, it would be nice if class ID3Tag could accept a Header of any version. The following code shows what a basic inheritance hierarchy might look like.


    public abstract class Header
    {
    void CommonMethod();
    }

    public class HeaderV1 extends Header
    {
    void CommonMethod();
    }

    public class HeaderV2 extends Header
    {
    private int V2Field;

    void CommonMethod();
    void OnlyInV2();
    }

    This seems correct to me as HeaderV2 has an "is a" relationship with abc Header. Now here's the rub...


    // How does the client know which version of the ID3Tag (and therefore what methods are available) it's using?
    public class MainMp3
    {
    void main( /* args ... */ )
    {
    MP3File myMP3 = new MP3File( strFileName );

    // How do I know if I can do this without checking the type?
    int nValue = myMP3.ID3Tag.Header.V2Field;
    // or
    myMP3.ID3Tag.Header.OnlyInV2();

    }
    }


    The design above is based on what makes sense to me, and may be completely off track. It seems there should be a better way, I am just not able to conceive it. How does this example conceptually work with any software project? Say you have an abstract radio class in which some derivatives contain a cd player class. Programming to the radio interface, how do you know if you have a cd player available?


    Best Regards,
    Andrew
    Bear Bibeault
    Author and ninkuma
    Marshal

    Joined: Jan 10, 2002
    Posts: 61628
        
      67

    "Andrew",

    There aren't many rules that you need to worry about here on the Ranch, but one that we take very seriously regards the use of proper names. Please take a look at the JavaRanch Naming Policy and adjust your display name to match it.

    In particular, your display name must be a first and a last name separated by a space character, and must not be obviously fictitious.

    Thanks!
    bear
    JavaRanch Sheriff


    [Asking smart questions] [Bear's FrontMan] [About Bear] [Books by Bear]
    Andrew Laughlin
    Greenhorn

    Joined: Jan 02, 2007
    Posts: 19
    Sorry about that. Should be fixed now.
    Ilja Preuss
    author
    Sheriff

    Joined: Jul 11, 2001
    Posts: 14112
    Originally posted by Andrew:
    How does the client know which version of the ID3Tag (and therefore what methods are available) it's using?


    Ideally, he doesn't need to know. What would the client do with an old ID3Tag?

    For example, if V2 would have added a longTitle field, and all the client wants to do is display the title, getLongTitle() of the old version simply could return the normal title.

    Another possible solution is the use of a Visitor. The advantage is that you can't forget to handle new versions when they come up. The disadvantage is that you always have to touch all the Visitors when a new version comes up.
    Frank Carver
    Sheriff

    Joined: Jan 07, 1999
    Posts: 6920
    Ah. ID3, a topic close to my heart - or should that be close to my spleen

    I think that there is still some confusion because of the abstract level of the discussion. To cut through this you need to ask yourself what your code should actually do with each type of object. I'm sure the job of your code is to more than just to accept both kinds of headers - you can do that with boolean accept(Header header) { return true; } - but to produce or achieve something with their contents.

    Let's consider two cases, one where the V2 data is present (in some form) in the V1 header, and one where the V2 data is entirely new.

    For the first case, let's use "song title". In ID3V1, song title is a fixed 30-character field at bytes 2-32 of the header. In ID3V2, the nearest equivalent seems to be the TIT2 tag, with an effectively unlimited length.

    When dealing with this sort of case you have two basic approaches:

  • treat both data items as if they are V1 - i.e. trim/pad to 30 characters, supply 30 blanks if a TIT2 is not present, etc.
  • treat both data items as if they are V2 - i.e. allow for arbitrary length or missing text in your UI/database/whatever, supply a default "length" value of 30 for V1 title data, and return the "song title" field when asked for a TIT2, etc.


  • In code terms, the first approach is equivalent to creating an interface with methods for accessing the V1 data, and coding your V2 implementation to meet that interface. For example:




    The second approach is equivalent to creating an interface with methods matching the V2 "named tag" concept, and coding your V1 implementation to meet that interface, for example:



    Which of those approaches you choose depends on your application. If you have a restriction (such as fixed amount of display space on a player, or a fixed-width style of database) then it might make sense to take the first option. If your application is generally more flexible, it might make sense to take the second option.

    Neither of these approaches is particularly flexible in the face of unknown future requirements, but luckily, ID3V2 is a very flexible format, which should cover a lot of future stuff with no major changes.

    For the case of a field not present at all in the V1 specification (which, let's face it, is almost all of the V2 fields) let's consider one of my personal favourites PCNT. ID3V1 has no concept similar to PCNT (the "play count"). In this case you have a similar choice. If you are building an application which needs to be aware of V2 features, such as tracking the number of times a track has been played, then it makes sense to describe your abstractions in terms of V2 features, and simply return whatever you use for "no such tag present" from a V1 header adapter. If you are building a simple application which only needs the V1 header items, then you can safely ignore all the other header information.

    In the end it all comes down to what you need your application to do. This should drive your tests and abstractions, which in turn should drive your implementations.

    Are we getting anywhere, or am I still missing your point?
     
    I agree. Here's the link: http://aspose.com/file-tools
     
    subject: Polymorphism OO Design Question