This week's book giveaway is in the OCPJP forum.
We're giving away four copies of OCA/OCP Java SE 7 Programmer I & II Study Guide and have Kathy Sierra & Bert Bates on-line!
See this thread for details.
The moose likes Developer Certification (SCJD/OCMJD) and the fly likes Single table / Simple Locking - WeakHashMap vs WeakReferences Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of OCA/OCP Java SE 7 Programmer I & II Study Guide this week in the OCPJP forum!
JavaRanch » Java Forums » Certification » Developer Certification (SCJD/OCMJD)
Bookmark "Single table / Simple Locking - WeakHashMap vs WeakReferences" Watch "Single table / Simple Locking - WeakHashMap vs WeakReferences" New topic
Author

Single table / Simple Locking - WeakHashMap vs WeakReferences

Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi everybody,
Sorry for those guys I confused with my posts where I introduced the WeakReference / ReferenceQueue pair. As my posts on it span multiple threads, the best solution I found to repair is was to post this new thread.
Introduction
Many people here seem to follow Max's idea to use a static WeakHashMap to store locks, as a solution to break deadlocks due to crashed client(s) still owning locks, if they decide it is worth while to handle deadlocks.
It consists to store as map keys objects which uniquely identify the client connection (owner of the lock), and as values the records (recNos) locked. In the solutions I have seen here around, the key is a Data instance.

The main idea of the WeakHashMap solution
As WeakHashMaps store their keys as WeakReferences (objects which are automatically cleared by the garbage collector when there is no strong ("strong" = "normal" ) reference to them anymore - simplified on purpose), and clear their corresponding entries when they are notified (I would better write "when they know" - see below) that one (or more) of their keys have been cleared, it seems a compelling solution to handle crashed clients, as far as the WeakHashMap keys identify clients.
The related design choices
Before digging into the technical issues of that solution, let us have a look at the main design choices the solution implies (they are no issues per se, just prerequisites) :
  • You accept that your database system consists in and supports only one table (the map is static and records are identified as simple recNos). All people here (as far as I know and ... except me ), agree that it is acceptable for our assignment. OK.
  • You decided to handle the "crashed client" situation. Strangely, nearly as many people agree on the fact that there is no need to deal with crashed clients, while there are many posts on the question . Anyway Max, Andrew, Mark, Jim, I mean all the JavaRanch SCJD forum's gurus, agree on that.
  • You accept that, by design, you won't support more than one record lock granted at a time to a given client.
  • Your design is based on multiple Data instances, one per client (any solution based on some Data singleton is incompatible with this solution)


  • It means that locking design has no interest (or at least needs to be adapted) if your database is multi-tables by design because you decided so, or if you decided not to care about lost connections.
    BTW, as my own design is multi-tables, I won't introduce here my own way of locking which is a bit different.
    The issues of the WeakHashMap solution
    I see two issues :
    The key cannot be the recNo
    It must be some object representing the client, which is not natural and couterperformant.
    When some caller claims for a lock, it has to check in a loop if the given recNo is locked or not (waiting if true, returning if false) ... while the map against which this check must be performed offers no optimal way to do it (containsValue() vs containsKey()). As a matter of fact, the key (which is the fast-access data of the map, is even never queried). You'll agree
    with me that it is, by design, a quite bad example of a use of the Map interface.
    Moreover, that's just because of this contraint that a client cannot own multiple locks (a Map key must be unique).
    The lack of notification
    When a client crashes, its entries are cleared from the WeakHashMap. So far so good. But as they are silently, other callers waiting for a lock on the same recNos, wait forever.
    Andrew's partial solution
    Andrew, I say "partial" because it just solves the second issue. Besides the static WeakHashMap, you have a static int locksCount, incremented each time a lock is granted and decremented when it is released. Besides both, you have a Thread dedicated to that task, which will periodically (let's say every second) compare the locksCount value to the WeakHashMap.size(). If they are different, it means that some entry(ies) were cleared from
    the map and you just need to notifyAll() all waiting callers.
    The WeakReference solution
    It is quite similar (and nearly as simple), except both issues mentioned above are solved :
    The key is the recNo (locking is more performant and allows multiple locks to be granted per client), and lock claimers get notified in a similar way Andrew does with its locksCount solution.
    Compared Codes
    As we cannot post complet code solutions, the following code examples are incomplete on purpose (for example exceptions are ignored). Also be aware of the fact that I didn't test the code snippets below : they are posted just for comparison of both techniques. You'll notice that the solution based on WeakReferences is only 3 lines longer than the WeakHashMap solution (70<>73)). Not a big deal !
    WeakHashMap

    WeakReferences registred with a ReferenceQueue

    Best,
    Phil.
    [ November 04, 2003: Message edited by: Philippe Maquet ]
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Phil,
    Before we open the debate, let's establish common assumptions. To point, I'd like to challenge your first assumption.

    The key cannot be the recNo
    It must be some object representing the client, which is not natural and couterperformant.

    Actually, the key can be the record number, depending on your implantation. However, that's beside the point. I disagree with your assertion that this is 'not natural', and I further disagree that Data does not represent a client.
    Each Data is uniquely assigned to a client, per your stated assumption, and thus each Data does, in fact, uniquely represent a client. This is similar to a pattern used by, say, cell phones. When you buy your cell phone, you receive an Object that is unique to you: the phone itself. The system in place, the phone company, doesn't track Phil: they track the object they used to uniquely identify you as a part of their system: namely, your phone, and they indirectly tie that object to you by the protocols of usage. That is, they assume that when a call is made, it's made by the client(Phil). They base their entire business model on it. This is exactly what is happening here. We are assigning a unique object to each client( namely, the Data object), and we are tracking that unique object.
    As to your first point: would you mind spelling out for me what 'couterperformant' means to you? I don't want to start debating a strawman

    When some caller claims for a lock, it has to check in a loop if the given recNo is locked or not (waiting if true, returning if false) ... while the map against which this check must be performed offers no optimal way to do it (containsValue() vs containsKey()). As a matter of fact, the key (which is the fast-access data of the map, is even never queried). You'll agree
    with me that it is, by design, a quite bad example of a use of the Map interface.


    I wold counter this is not a bad thing. A great many business don't [b]want[/b[ their users updating more then a single record at a time. An airline I worked for was this way. There's no justification in assuming that this 'feature' is something the client wants. It may simply be what you want to code
    M


    Java Regular Expressions
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Hi Max,
    Actually, the key can be the record number, depending on your implantation. However, that's beside the point. I disagree with your assertion that this is 'not natural', and I further disagree that Data does not represent a client.

    I think you misunderstood me (my English is quite poor). Of course Data may represent the client, as it is unique per client. What I feel "not natural" with the WeakHashMap solution is that you must store the Data instance as a key instead of a value. WeakHashMap stores its keys as WeakReferences, right ? So I don't understand how you could invert, with a WeakHashMap implementation, the key/value pair (I mean storing recNos as keys and Data instances as values).
    Each Data is uniquely assigned to a client, per your stated assumption, and thus each Data does, in fact, uniquely represent a client. This is similar to a pattern used by, say, cell phones. When you buy your cell phone, you receive an Object that is unique to you: the phone itself. The system in place, the phone company, doesn't track Phil: they track the object they used to uniquely identify you as a part of their system: namely, your phone, and they indirectly tie that object to you by the protocols of usage. That is, they assume that when a call is made, it's made by the client(Phil). They base their entire business model on it. This is exactly what is happening here. We are assigning a unique object to each client( namely, the Data object), and we are tracking that unique object.

    I agree at 100% (see above).
    As to your first point: would you mind spelling out for me what 'couterperformant' means to you? I don't want to start debating a strawman

    Sorry, but I don't understand your sentence I don't know what a strawman is (and I don't find it in my dictionary). I see that I wrote 'couterperformant' instead of 'counterperformant', while the latter, even "corrected", is maybe not an english word : just a free translation of "contre-performant" in french. What I meant is simply this : Map.containsValue() is less efficient than Map.containsKey(). So if recNos are stored as values (the key being a Data), the lock method() is not optimized. Nothing more than that.
    Best,
    Phil.
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Originally posted by Philippe Maquet:
    Hi Max,
    quote:
    --------------------------------------------------------------------------------
    Actually, the key can be the record number, depending on your implantation. However, that's beside the point. I disagree with your assertion that this is 'not natural', and I further disagree that Data does not represent a client.
    --------------------------------------------------------------------------------
    I think you misunderstood me (my English is quite poor). Of course Data may represent the client, as it is unique per client. What I feel "not natural" with the WeakHashMap solution is that you must store the Data instance as a key instead of a value. WeakHashMap stores its keys as WeakReferences, right ? So I don't understand how you could invert, with a WeakHashMap implementation, the key/value pair (I mean storing recNos as keys and Data instances as values).

    Let's let the issue how you could invert the key/value stand for a moment, and focus on the 'unnatural' issue, since it's independent. IMO, it's quite intuitive for a given objet to retain a lock. Or, as the example below shows, for the proxy of that object to retain a lock. This was my point in the example below. The cell phone, in metaphor, is a stand in for a Data object, hopefully demonstrating that, since this is actually the way some things are modeled, it is, in fact, natural.
    In situations where there is a one to one relationship, say between a husband and a wife, it's just as natural to say that the wife owns the husband as it is to say that the husband owns the wife. Of course, I hear this doesn't strictly apply in Paris , but you get my point. Since we haven't established that anything other then a 1-1 solution is called for, I think it's just as natural to say that the client owns the lock, as it is to say that the lock owns the client. If anything, the latter seems somewhat unnatural to me, from an OO prespective. I don't think of my self as belonging to my stuff, I think of them as belonging to me. Of course, this is simply a matter of prespective, and I'm sure one is as good as the other. However, I'm not convinced that one is better then the other.

    quote:
    --------------------------------------------------------------------------------
    Each Data is uniquely assigned to a client, per your stated assumption, and thus each Data does, in fact, uniquely represent a client. This is similar to a pattern used by, say, cell phones. When you buy your cell phone, you receive an Object that is unique to you: the phone itself. The system in place, the phone company, doesn't track Phil: they track the object they used to uniquely identify you as a part of their system: namely, your phone, and they indirectly tie that object to you by the protocols of usage. That is, they assume that when a call is made, it's made by the client(Phil). They base their entire business model on it. This is exactly what is happening here. We are assigning a unique object to each client( namely, the Data object), and we are tracking that unique object.
    --------------------------------------------------------------------------------
    I agree at 100% (see above).

    quote:
    --------------------------------------------------------------------------------
    As to your first point: would you mind spelling out for me what 'couterperformant' means to you? I don't want to start debating a strawman
    --------------------------------------------------------------------------------
    Sorry, but I don't understand your sentence I don't know what a strawman is (and I don't find it in my dictionary).

    A straw man argument is one in which one party misinterprets the points of the other party, and addresses that misinterpretation. For example, if you made a comment about stupid American tourists, and I launched into a tirade about how Americans aren't stupid, I would be making a strawman of your argument: I wouldn't be addressing your point, I would be addressing a point that you didn't make, in order to win that point. I want to avoid doing that, so I asked for your clarification.

    I see that I wrote 'couterperformant' instead of 'counterperformant', while the latter, even "corrected", is maybe not an english word : just a free translation of "contre-performant" in french.

    I only wish I spoke French as well as you speak English.

    What I meant is simply this : Map.containsValue() is less efficient than Map.containsKey().

    Why would you say that?
    M
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    [PM]: What I meant is simply this : Map.containsValue() is less efficient than Map.containsKey().
    [MH]: Why would you say that?

    Maps, Hash- or Tree-, are designed to look thinks up by their keys, and quickly. Implementing contains() is pretty similar to just doing a get() and seeing if it's null or not. For a HashMap, you just calculate the hashCode(), look in the appropriate bucket, and maybe iterate through a reasonably small number of entries that are in that same bucket. It's O(1). For a TreeMap it's O(log(n)) but still pretty quick. In contrast, for containsValue() you have no choice but to iterate through every single entry in the Map, and see if the value matches your target. Best-case scenario you get lucky and find your target early on, and you can abort the loop, but don't count on it. It's an O(n) operation.


    "I'm not back." - Bill Harding, Twister
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Originally posted by Jim Yingst:
    [PM]: What I meant is simply this : Map.containsValue() is less efficient than Map.containsKey().
    [MH]: Why would you say that?

    Maps, Hash- or Tree-, are designed to look thinks up by their keys, and quickly. Implementing contains() is pretty similar to just doing a get() and seeing if it's null or not. For a HashMap, you just calculate the hashCode(), look in the appropriate bucket, and maybe iterate through a reasonably small number of entries that are in that same bucket. It's O(1). For a TreeMap it's O(log(n)) but still pretty quick. In contrast, for containsValue() you have no choice but to iterate through every single entry in the Map, and see if the value matches your target. Best-case scenario you get lucky and find your target early on, and you can abort the loop, but don't count on it. It's an O(n) operation.

    Jim is correct here: I was thinking about fact that both are accessible as Sets, through the Map.keySet and Map.entrySet methods, thus equally fast to look through: I didn't think of the simple existence test. However, as far I know, this is an implementation choice: I'm not aware of any specification based requirement that Maps be implemented in this manner, nor, AFIK, can you depend on them being so on different platforms.
    I would also maintain that speed isn't a factor in this project. Frankly, given the number of records at issue, I doubt the time differential is even measurable. This leads us back to the same point: Namely, that ordering record-to-Data satisfies non existent requirements: namely, speed(probably) and multiple record locking. Thus, IMO, ordering Data-to-record is just as natural.
    M
    John Smith
    Ranch Hand

    Joined: Oct 08, 2001
    Posts: 2937
    Best-case scenario you get lucky and find your target early on, and you can abort the loop, but don't count on it.
    As a matter of fact, the worst-case scenario is more likely, as most of the time the record would not be locked and therefore will not be in the map. Thus, you would have to iterate through the entire collection only to realize that the object you are looking for is not there. This would be a very nonstandard use of a Map, indeed.
    In the app that I work on at the moment, we actually had a similar problem, where an ArrayList was used as a collection which was used to test for a membership. Swapping the ArrayList with the properly structured Map gave us a 500% boost in performance!
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    [Max]: I'm not aware of any specification based requirement that Maps be implemented in this manner, nor, AFIK, can you depend on them being so on different platforms.
    Strictly speaking, that's true. At the generic Map level, the API says containsValue() "will probably require time linear in the map size for most implementations of the Map interface." Nothing is said about containsKey(). Looking at specific implementations however - TreeMap does guarantee O(log(n)) for containsKey(). HashMap doesn't mention getKey()'s performance, but it does guarantee O(1) for get(), and common sense is that constainsKey() should be the same. Neither implementation says anything more about containsValue()'s performance (though TreeMap inherits the API comment from Map here, probably linear time). Unless someone magically comes up with a way to beat O(log(n)) or O(1) for containsValue() on a TreeMap or HashMap respecively, we can be pretty sure containsKey() will offer better performance for these two classes at least.
    As for different platforms - well, HashMap and TreeMap are just using non-native code really, using standard Sun SDKs. The behavior might be different on other JDKs maybe, but not other platforms, IMO.
    I would also maintain that speed isn't a factor in this project.
    This I'll basically agree with, along with the remainder of the paragraph.
    [Eugene]: As a matter of fact, the worst-case scenario is more likely
    Also agreed.
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Many thanks to you, Jim and Eugene, for having entered the discussion to argue about why it's more efficient to access a HashMap by key than by value. "Not natural" in my mind (direct translation from french) meant "counterintuitive and less efficient".
    Anyway, you are far more efficient than I can be to debate with Max. As we don't master the language (english) at the same level, sometimes I feel myself as he would feel himself if he had to kickbox against me with hands and feets bound...
    Now about that Map access discussion, I would argue that it is not a specification question. I use a HashMap (or a WeakHashMap) which are implementations. Any user of them gets their sources. Even if the doc is unclear (I don't pretend that), it's so easy to see (when you are curious as I am) what is the job performed by containskey() and containsValue().
    I passed SCJP in may 2003 (correctly : 95%). One of the things you have to know to pass SCJP is how to recognize, among the few collections offered by the API, which is the best one to be used in a given context. It is one of the easiest parts of SCJP, because those classes are well named. It means that any junior java programer (SCJP level) knows that a HashMap is a Map (key-value pair), where keys are stored in buckets based on their hashCode(), a structure which favours acces by key over by value.
    Well, anyway I initiated this thread just to handle a locking issue, especially to reply to Ulrich's question (I am not concerned by the given issue anyway), there is not any comment about the compared solutions above and ... Ulrich seems to have quit us. Some days, I'd better to go and sleep sooner
    Best,
    Phil.
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    /*
    Originally posted by Philippe Maquet:
    Many thanks to you, Jim and Eugene, for having entered the discussion to argue about why it's more efficient to access a HashMap by key than by value. "Not natural" in my mind (direct translation from french) meant "counterintuitive and less efficient".
    Anyway, you are far more efficient than I can be to debate with Max. As we don't master the language (english) at the same level, sometimes I feel myself as he would feel himself if he had to kickbox against me with hands and feets bound...

    I would never kick or punch you....I'd probably just grab you and throw you . But seriously, you make too much of English Issue: I think you argue well.

    Now about that Map access discussion, I would argue that it is not a specification question. I use a HashMap (or a WeakHashMap) which are implementations. Any user of them gets their sources. Even if the doc is unclear (I don't pretend that), it's so easy to see (when you are curious as I am) what is the job performed by containskey() and containsValue().
    I passed SCJP in may 2003 (correctly : 95%). One of the things you have to know to pass SCJP is how to recognize, among the few collections offered by the API, which is the best one to be used in a given context. It is one of the easiest parts of SCJP, because those classes are well named. It means that any junior java programer (SCJP level) knows that a HashMap is a Map (key-value pair), where keys are stored in buckets based on their hashCode(), a structure which favours acces by key over by value.

    Well, I hope I qualify as a Junior Programmer, and while I 'knew' it, I missed it. It's true that ordered keys, such as hashMaps keys, are much more easy to search then unordered ones(like Hashmap values). However, it bears keeping in mind that Maps provide access to both their Keys and their values as Sets.
    However, this part of the discussion, IMO, is some tangential: it doesn't really matter, because speed is not a consideration in this project. Also, the difference in speed here, given the size of the records, is immeasurable. Thus, the argument reduces to 'ordering records to Data is better then ordering Data to records because it's immeasurably faster, and speed is a non issue in this assignment'. Personally, I'm not convinced by this.

    Well, anyway I initiated this thread just to handle a locking issue, especially to reply to Ulrich's question (I am not concerned by the given issue anyway), there is not any comment about the compared solutions above and ... Ulrich seems to have quit us. Some days, I'd better to go and sleep sooner
    Best,
    Phil.

    I think it's a worthwhile discussion, I just want to clarity the assumptions we'l be working with. So far, I don't think we've established a requirement or OO based reason to map records to Data, while I have offered some examples of why it's intuitive, in somewhat similar cases, to map Data to records.
    All best,
    M
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Hey Jim,
    Originally posted by Jim Yingst:
    [QB][Max]: I'm not aware of any specification based requirement that Maps be implemented in this manner, nor, AFIK, can you depend on them being so on different platforms.
    Strictly speaking, that's true.

    My point is that when you talking about a project designed to run consistently on different OS systems and/or different JVMs, it's important to speak strictly, especially when you're showcasing just what a through and competent and careful software engineer you are. Just my opinion

    .snip...and common sense is that constainsKey() should be the same. Neither implementation says anything more about containsValue()'s performance (though TreeMap inherits the API comment from Map here, probably linear time). ...snip.. we can be pretty sure containsKey() will offer better performance for these two classes at least.

    While this is true, I wouldn't advise using 'common sense' and 'probably' as elements in your design choices document. Again, just my advice. If you can't make a statement about the nature of the APIs that's guaranteed to be platform independent, then I suggest that you not make it.

    As for different platforms - well, HashMap and TreeMap are just using non-native code really, using standard Sun SDKs. The behavior might be different on other JDKs maybe, but not other platforms, IMO.

    Why not? two implementations of the same JDK on two different platforms are simply two different programs. I'm not aware that Sun guarantees that the same algorithm will be used in both cases.

    All best,
    M
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Hi Max,
    Thanks for your reply.
    Also, the difference in speed here, given the size of the records, is immeasurable.

    I suppose that while writing "given the size of the records" you were thinking of the number of locks granted at a given same time. Even if the database has 100000 records, you probably won't have more than an average 100 records locked along them. So yes, the difference in speed will be immeasurable.
    I would just comme back on the context of my first post here :
    The first time I saw a locking solution based on a WeakHashMap mapping Data to records,
  • I asked : Why do you map Data to records instead of records to Data ? It seemed conterintuitive to me and moreover forced to limit the number of locks per Data to only one at a time. My question was stupid because Data (which represents the owner of the lock) must be the key in the WeakHashMap.
  • I pointed out the lack of notification which made its solution buggy.


  • Now, from his answer, I got the feeling that his design had been driven by its code and not the contrary :
    At first he had mapped records to Data (more intuitive for him too), and then he inverted the mapping because the WeakHashMap container forced him to do so. With the consequence that he couln't grant more that one lock per client at the time... Then he concluded that it was acceptable because in his current application he doesn't need more than one lock per client. So coding constraints made his final design.
    I would like to add just one argument in favour of records mapped to Data as WeakReferences in a regular HashMap, against Data mapped to records in a WeakHashMap : avoiding the WeakHashMap overhead. As the latter stores its keys as WeakReferences, it needs an internal Entry class extending WeakReference, mainly to override equals() and hashCode(). While when Data are stored in the map as values, the "basic" WeakReference is enough, the default Object equals() (which is faster) being perfect for our purpose.
    Thank you for all your comments, Max, there were very interesting to me.
    All best,
    Phil.
    [ September 12, 2003: Message edited by: Philippe Maquet ]
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Originally posted by Philippe Maquet:
    Hi Max,
    Thanks for your reply.

    I suppose that while writing "given the size of the records" you were thinking of the number of locks granted at a given same time. Even if the database has 100000 records, you probably won't have more than an average 100 records locked along them. So yes, the difference in speed will be immeasurable.
    I would just comme back on the context of my first post here :
    The first time I saw a locking solution based on a WeakHashMap mapping Data to records,
  • I asked : Why do you map Data to records instead of records to Data ? It seemed conterintuitive to me and moreover forced to limit the number of locks per Data to only one at a time. My question was stupid because Data (which represents the owner of the lock) must be the key in the WeakHashMap.
  • I pointed out the lack of notification which made its solution buggy.


  • Now, from his answer, I got the feeling that his design had been driven by its code and not the contrary :
    At first he had mapped records to Data (more intuitive for him too), and then he inverted the mapping because the WeakHashMap container forced him to do so. With the consequence that he couln't grant more that one lock per client at the time... Then he concluded that it was acceptable because in his current application he doesn't need more than one lock per client. So coding constraints made his final design.

    But this is common practice in all forms of engineering: that's the difference between engineers and scientists!
    Architects frequently limit the size and height of their buildings due to construction materials, local statues, etc. electrical engineers will use one metal over another because it's a better conductor, Chemists with use one element over another because it's cheaper. Doctors prescribe one medicine over another because it's covered by the patient's insurance. It's part of the game of getting along with out environment and it's at the heart of practice-to-theory feedback loop that allows engineering to make sure the world continues to spin property. I don't think he compromised his design: I think he made it better by taking reality into account


    I would like to add just one argument in favour of records mapped to Data as WeakReferences in a regular HashMap, against Data mapped to records in a WeakHashMap : avoiding the WeakHashMap overhead. As the latter stores its keys as WeakReferences, it needs an internal Entry class extending WeakReference, mainly to override equals() and hashCode(). While when Data are stored in the map as values, the "basic" WeakReference is enough, the default Object equals() (which is faster) being perfect for our purpose.

    First, this harkens back to the efficiency issue, which we've establish is non material to the SCJD. Second, this violates the directive of using standard Java classes, even if a custom class is marginally faster. Finally, there's actually an excellent reason why they did this(the overloading), as elegantly explained in Joshua Brock's 'Effective Java'.

    Thank you for all your comments, Max, there were very interesting to me.
    All best,
    Phil.
    [ September 12, 2003: Message edited by: Philippe Maquet ]

    No problem. I try to set aside about an hour an day for the 'Ranch, though today it's been closer to two hours, so my apologies if I don't get to any other topics today.
    All best,
    M
    [ September 12, 2003: Message edited by: Max Habibi ]
    John Smith
    Ranch Hand

    Joined: Oct 08, 2001
    Posts: 2937
    PM: As the latter [the Data->Record map, -- my insert] stores its keys as WeakReferences, it needs an internal Entry class extending WeakReference, mainly to override equals() and hashCode().
    Actually, the internal Entry class in the WeakHashMap class extends Map.Entry, not WeakReference. You were probably thinking of WeakKey that does extends WeakReference, but then again, you don't need to override either the equals() method of WeakKey or the equals() method of Data. The equals() method of WeakKey simply calls the equal method of Data, and if the Data method is not overriden, it will test for object identity, which is exactly what should be expected.
    In fact, overriding the equals() and the hashCode() methods of Data may be lethal (in terms of you hoping to achive the automatic GC of weakly referenced objects) if you are planning to use it as a key in the Data->Record map.
    Note that javadoc for WeakHashMap specifically warns of that:
    With such recreatable key objects, however, the automatic removal of WeakHashMap entries whose keys have been discarded may prove to be confusing.
    All the subtleties of WeakHashMap aside, I don't like the idea of instance of Data being used either as a key or the value in the locks collection. But we've beaten that topic to death before.
    [ September 12, 2003: Message edited by: Eugene Kononov ]
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Hi Eugene,
    Actually, the internal Entry class in the WeakHashMap class extends Map.Entry, not WeakReference.

    Sorry, but here is how Entry is defined in WeakHashMap.java :

    Best,
    Phil.
    John Smith
    Ranch Hand

    Joined: Oct 08, 2001
    Posts: 2937
    Sorry, but here is how Entry is defined in WeakHashMap.java :
    private static class Entry extends WeakReference implements Map.Entry

    Here is my version (from JDK1.3):
    static private class Entry implements Map.Entry
    Yours is probably form JDK1.4.
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    <post topic="Map performance" relevance="tangential">
    Hola, Max!
    [Max]: My point is that when you talking about a project designed to run consistently on different OS systems and/or different JVMs, it's important to speak strictly, especially when you're showcasing just what a through and competent and careful software engineer you are. Just my opinion
    And my point later on was that it's easy to be more specific on this issue by focusing on one of the standard concrete Map implementations, which do offer better guarantees. In my own posts on this topic, I've been specifically talking about HashMap and TreeMap.
    [Jim]: ...and common sense is that constainsKey() should be the same. Neither implementation says anything more about containsValue()'s performance (though TreeMap inherits the API comment from Map here, probably linear time). ...snip.. we can be pretty sure containsKey() will offer better performance for these two classes at least.
    [Max]: While this is true, I wouldn't advise using 'common sense' and 'probably' as elements in your design choices document. Again, just my advice.

    Well yeah, I also won't say things like "dunno" and "yeah". But I'm not writing the design doc right here and now, I'm posting in the Big Moose Saloon.
    [Max]: If you can't make a statement about the nature of the APIs that's guaranteed to be platform independent, then I suggest that you not make it.
    Like, say, about FileChannel? In the case of HashSet, since they neglected to offer an O(1) guarantee for containsKey(), even though it's blindingly obvious that that's the behavior we should expect, I suppose that we can just ignore the containsKey() method and replace it with (map.get(key) != null). (As long as you never insert null values into the Map, which is an easy design choice to make.) There, we now have nice guaranteed behavior for HashMap. We already had an O(log(n)) guarantee for TreeMap, without having to resort to such silliness.
    So: we have an O(1) guarantee for HashMap's get(). We have O(log(n)) for TreeMap's containsKey(). For containsValue(), any implementation, we have no guarantee, but it's "probably" O(n). (Sorry for unprofessional use of the word "probably", but it's from the API after all.) And in reality, it is O(n) for HashMap and TreeMap. So OK, there's a theoretical possibility that on some JDKs/platforms/alternate realities the containsValue() might turn out faster than containsKey(). Best case scenio, we might get lucky somehow. But there's no denying that for HashMap and TreeMap, get() and containsKey() offer better guarantees for wort-case (and typical) behavior. Heck, containsValue() could legally turn out to be O(n^10) for all we know - containsKey() is still O(1), or O(log(n)). Which would you choose?
    [Jim]: As for different platforms - well, HashMap and TreeMap are just using non-native code really, using standard Sun SDKs. The behavior might be different on other JDKs maybe, but not other platforms, IMO.
    [Max]: Why not? two implementations of the same JDK on two different platforms are simply two different programs. I'm not aware that Sun guarantees that the same algorithm will be used in both cases.

    Note that I didn't use "guarantee" in that paragraph; I said IMO, and was speaking more loosely. What I meant was that if I run J2SDK 1.4.2 on a Windows box, and on a Sun box, and on a Linux box, they all use the exact same rt.jar file, which has the exact same non-native Java code. I think. Guess I'm not 100% sure of this, and I can't easily check it right now. But I've always understood that the differences between these versions were only in native methods and in the JVM infrastructre, not in the rt.jar shared library. OTOH, if you switch to a different JDK version, this goes out the window - rt.jar will be different.
    Regardless, even if there's no guarantee that the algorithms will be the same on different JDKs or platforms - Sun does offer the performance guarantees for HashMap and TreeMap that I've been talking about. Alternate platform implementations can implement however they want, but they're required to give O(1) performance for HashMap's get(), and O(log(n) for TreeMap's containsKey().
    Which is why Philippe made the statement: Map.containsValue() is less efficient than Map.containsKey(). That's speaking a bit loosely - if you want we can say "HashMap's get() and TreeMap's containsKey() offer better performance guarantees than Map's containsValue()." But personally I think the original statement was quite clear enough for us to fill in the blanks.
    BTW, Philippe - I had the exact same reactions as Max reagarding your use of language. I wish I spoke French as well as you speak English, and I think you worry too much about your alleged deficiencies here. E.g. I had no problem figuring out what "counterperformant" meant. You're doing great; stop apologizing.
    </post>
    [ September 12, 2003: Message edited by: Jim Yingst ]
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Hi Jim,
    BTW, Philippe - I had the exact same reactions as Max reagarding your use of language. I wish I spoke French as well as you speak English, and I think you worry too much about your alleged deficiencies here. E.g. I had no problem figuring out what "counterperformant" meant. You're doing great; stop apologizing.

    Thank you for that comment. After three months passed on this forum, I noticed that I improved both my Java experience and my english practice (for writing, I need less my dictionary than before). It's just when I see that I've been misunderstood that I wonder whether it's due to a language issue ... somethink like the feeling that if I could have explained it in french, it would have been easier to understand. Anyway, even you are misunderstood sometimes ... So I think you're right : I should loose that language complex. Promised !
    Best,
    Phil.
    [ September 14, 2003: Message edited by: Philippe Maquet ]
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    Sorry Philippe, I didn't understand that. Could you repeat it more clearly please?
    Philippe Maquet
    Bartender

    Joined: Jun 02, 2003
    Posts: 1872
    Sorry Jim, but rereading me, I didn't understand my own english ! So I couldn't explain it, even in french ... It was quite vague indeed !
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Jim,
    I'm not sure how TreeMaps got into this discussion, but let me see if I can boil all of this down. Performance is irrelevant to the SCJD, and we can't make any definitive statements about the performance benefits of checking to see if a HashMap contains a value versus it containing a key.
    Regarding FileChannels. I think we'll just have to agree to disagree on that one. I don't think I have to capacity to communicate my points any more clearly then I did on that issue. If I haven't been convincing, then I'll have to chalk it up to my need to become a better communicator.
    All best,
    M
    [ September 15, 2003: Message edited by: Max Habibi ]
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Originally posted by Philippe Maquet:
    Sorry Jim, but rereading me, I didn't understand my own english ! So I couldn't explain it, even in french ... It was quite vague indeed !

    I think Jim was just teasing you
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    I'm not sure how TreeMaps got into this discussion
    Convenient example of a Map with an explicit performance guarantee. You and Philippe had made general comments about Map; I refernced specific implementations. We can stick to HashMap though.
    but let me see if I can boil all of this down. Performance is irrelevant to the SCJD,/[b]
    Ok.
    [b]and we can't make any definitive statements about the performance benefits of checking to see if a HashMap contains a value versus it containing a key.

    Can, did. Referring to both minimum guaranteed performance, and to de facto actual performace, but not to what might happen in alternate universes. If you explicitly say containsKey() rather than "containing a key" then I'll agree with you (thanks to another careless oversight in API writing), but as I can use get() to test the same thing with guaranteed O91) performance, I'm sticking with my answer; don't see a refutation. The issue is ultimately not that important though, so agreeing to disagree here is fine too.
    Regarding FileChannels. I think we'll just have to agree to disagree on that one.
    Ok. Didn't mean to renew the topic per se, I just found it interesting that our roles this time are to some extent reversed. Last time I was the one saying "the API doesn't make that guarantee" and you were saying "yes it does". Just a funny coincidence; no deeper meaning.
    I think Jim was just teasing you
    Yup.
    [ September 15, 2003: Message edited by: Jim Yingst ]
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Hi Jim,
    Originally posted by Jim Yingst:
    and we can't make any definitive statements about the performance benefits of checking to see if a HashMap contains a value versus it containing a key.
    Can, did. Referring to both minimum guaranteed performance, and to de facto actual performace, but not to what might happen in alternate universes. If you explicitly say containsKey() rather than "containing a key" then I'll agree with you (thanks to another careless oversight in API writing), but as I can use get() to test the same thing with guaranteed O91) performance, I'm sticking with my answer; don't see a refutation. The issue is ultimately not that important though, so agreeing to disagree here is fine too.


    Phil's point, which I think you supported, was that containsKeys was faster the containsValue. We've established that this statement isn't supported by the Spec, even if it happens to the true on most(maybe even all) implementations.
    Now, if your point has become that using hashMap.get is faster than hashmap.containsValue, then you're undermining the original point that using a record-key mapping is faster. To wit,

    is just as efficient as

    thus, while your point regarding get is important, it's irrelevant .
    The discussion was originally about one mapping style(record-Data) being more efficient (which I think I just established isn't true), and/or the discussion was about was about the merits of containsKey vs. containsValue(which merits we can't establish either, strictly speaking :which is the only we should be speaking). AFIK, the discussion was never about TreeMaps, map.get, or alternative universes
    All best,
    M
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    Ummm, I can't tell what either of your code samples are supposed to do, Max. Doesn't contains() return a boolean? Doesn't equals() expect an Object? I can't determine how relevant get() is without a better example.
    AFIK, the discussion was never about TreeMaps, map.get, or alternative universes
    Well for our universe, for any existing HashMap implementation (or any Sun will ever release, I'll bet), containsKey() is more efficient than containsValue(). It's a pity they neglected to document that fact in the API for HashMap. But to assert that we shouldn't state that HashMap containsKey() is more efficient than containsValue() - well, it is. Documented or not. You're apparently ignoring the very purpose of Maps - to look thing up by key. Looking up by value is an afterthought.
    OK, gotta take a deep breath now. Hey Max, I don't suppose you wanna chime in on this thread? You know, if you get bored or something. Cheers...
    [ September 15, 2003: Message edited by: Jim Yingst ]
    Max Habibi
    town drunk
    ( and author)
    Sheriff

    Joined: Jun 27, 2002
    Posts: 4118
    Originally posted by Jim Yingst:
    Ummm, I can't tell what either of your code samples are supposed to do, Max. Doesn't contains() return a boolean? Doesn't equals() expect an Object? I can't determine how relevant get() is without a better example.

    Oops, substitute get for contains. And equals expects an object, per the Object.equals method.
    M

    AFIK, the discussion was never about TreeMaps, map.get, or alternative universes
    Well for our universe, for any existing HashMap implementation (or any Sun will ever release, I'll bet), containsKey() is more efficient than containsValue(). It's a pity they neglected to document that fact in the API for HashMap.

    I'm guessing they didn't document it because they wanted to be able to change it in the future: perhaps by making containsValue more efficient.

    But to assert that we shouldn't state that HashMap containsKey() is more efficient than containsValue() - well, it is. Documented or not. You're apparently ignoring the very purpose of Maps - to look thing up by key. Looking up by value is an afterthought.

    Jim, this is just nonesense. If it's not true per spec, then you shouldn't assume it true across the board. For example, it would be easy enough to keep a reverse map in values to keys(with a few tricks: not too many), that would reduce the time of containsValue to be comparable to containsKey. Are you willing to bet your certification fee that it isn't? For that matter, are you willing to bet the certification fee of the people you're advising?

    OK, gotta take a deep breath now. Hey Max, I don't suppose you wanna chime in on this thread? You know, if you get bored or something. Cheers...
    [ September 15, 2003: Message edited by: Jim Yingst ]

    heh. You couldn't get me near that crap for all the beer in England. I'm suprised Joe's so involved
    M
    [ September 15, 2003: Message edited by: Max Habibi ]
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    Oops, substitute get for contains.
    OK, so we are indeed talking about HashMap's get(i), which is guaranteed to be O(1), rather than containsKey(), which is not guaranteed, even though for all practical purposes it is in fact O(1). Good, that makes things easier.
    [Jim]: Well for our universe, for any existing HashMap implementation (or any Sun will ever release, I'll bet), containsKey() is more efficient than containsValue(). It's a pity they neglected to document that fact in the API for HashMap.
    [Max]: I'm guessing they didn't document it because they wanted to be able to change it in the future: perhaps by making containsValue more efficient.

    Ummm... OK. I was talking about the lack of documentation for the fact that HashMap's containsKey() is O(1). Fortunatelly, this is no longer relevant, because now I can concentrate on the fact that HashMap's get() is documented to be O(1). Which is what I wanted.
    Jim, this is just nonesense. If it's not true per spec, then you shouldn't assume it true across the board.
    Mmmm, irony.
    Oops, I guess I shouldn't have brought up the generic Map again, since it's more productive for me here to talk HashMaps (as I did originally). OK, I retract my last couple statements about the purpose of Maps, and containsValue() being an afterthought. It's generally true, but yeah, there can be exceptions at the Map level. I still stand by my assertion that for any HashMap implementation in existence, and any Sun is likely to release ever, containsKey() is (on average, for any significant value of N) faster than containsValue(). Note that I do not include classes that extend from HashMap in this statement - just HashMap's implementation. (I've been talking about HashMap's get() vs. MashMap's containsValue(), after all.)
    For example, it would be easy enough to keep a reverse map in values to keys(with a few tricks: not too many), that would reduce the time of containsValue to be comparable to containsKey.
    Agreed, they can be made equal. I've made a DoubleMap myself, and later seen similar stuff in one or more open-source libraries. So that's an exception for Map, but not HashMap.
    Are you willing to bet your certification fee that it isn't? For that matter, are you willing to bet the certification fee of the people you're advising?
    For Map: no, I misspoke. For HashMap (from the first half of the paragraph you said is "just nonsense": yes, if there were something to be gained if I win (e.g. if you'd but against me), and if we could agree on a suitable means of resolving the issue. I suppose my assertion about future releases from Sun is unresolvable unless we were to limit the scope - say, up through JDK 1.5 when it's released?
    Obviously, I can't bet the money