This week's giveaway is in the Android forum.
We're giving away four copies of Android Security Essentials Live Lessons and have Godfrey Nolan on-line!
See this thread for details.
The moose likes Beginning Java and the fly likes Nested Set? Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Android Security Essentials Live Lessons this week in the Android forum!
JavaRanch » Java Forums » Java » Beginning Java
Bookmark "Nested Set?" Watch "Nested Set?" New topic
Author

Nested Set?

S Chan
Ranch Hand

Joined: Jul 25, 2011
Posts: 51
In mathematics, I can have a set within another set, i.e. nested sets, like this: {{a, b, c, d}, {e, f }}

I want to implement this in Java, so I thought I could just use java.util.Set like this:


For no specific reason, I have picked TreeSet as the solid class of Set :


That should compile alright, but when I run it, I have found a big problem with this design.

If you put instances of MyObject into a TreeSet<MyObject>, MyObject has to implement the Comparable<MyObject> interface in order to put MyObject instances into TreeSet<MyObject>. That's alright for me - I can just implement that interface without sweat.

However, for TreeSet<TreeSet<MyObject>> , I would run into trouble because TreeSet itself does not implement the Comparable interface!

Help! What should I do to model the mathematics set like {{a, b, c, d}, {e, f }} ? I don't care if it's a TreeSet or whatever other solid implementation of Set. I just want something simple to model the properties of a mathematics set - no duplicates, order is not important, may be empty, etc...

Thank you in advance for help
dennis deems
Ranch Hand

Joined: Mar 12, 2011
Posts: 808
HashSet will do what you want.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    
    8

You don't want ordering in your set? Then why did you choose TreeSet instead of HashSet?
S Chan
Ranch Hand

Joined: Jul 25, 2011
Posts: 51
Paul Clapham wrote:You don't want ordering in your set? Then why did you choose TreeSet instead of HashSet?


I thought TreeSet only orders its elements for internal use(storage and quick accessing). For users of TreeSet like me, we only need to think the elements are in a set - unordered. Right?
S Chan
Ranch Hand

Joined: Jul 25, 2011
Posts: 51
Dennis Deems wrote:HashSet will do what you want.


Will have a read of the javadocs, thanks
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7552
    
  18

S Chan wrote:I thought TreeSet only orders its elements for internal use(storage and quick accessing). For users of TreeSet like me, we only need to think the elements are in a set - unordered. Right?

Right, especially as TreeSet is not itself Comparable.

In fact, using a Set<Set<MyObject>> of any kind is potentially dangerous, because the inner Set will be the key to the outer one, and the specifications for both Set.equals() and Set.hashCode() state that they should be coupled to the Set's contents.

You can get away with it, providing either
(a) each inner Set is added with Collections.unmodifiableSet().
or
(b) any update to an inner Set replaces it, ie: remove(inner) --> modify contents of 'inner' --> add(inner).

Another alternative would be a List<Set<MyObject>>, but then you'd have to impose the Set constraint on the outer "set" yourself, and that could be an expensive check.

Winston


Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
S Chan
Ranch Hand

Joined: Jul 25, 2011
Posts: 51
Thanks for the advice, but that solution may be a little troublesome in practice.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7552
    
  18

S Chan wrote:Thanks for the advice, but that solution may be a little troublesome in practice.

Erm...which one? I suggested a few. Or were you not talking to me?

Winston
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3003
    
    9
S Chan wrote:I thought TreeSet only orders its elements for internal use(storage and quick accessing). For users of TreeSet like me, we only need to think the elements are in a set - unordered. Right?

Well, TreeSet sorts its contents in a way that anyone can see, if they iterate through the contents. However you're also free to ignore this if you want. But there is little point in using TreeSet if you don't need the contents to be sorted, since TreeSet is slower than HashSet (because it takes some extra time to sort), and because TreeSet imposes the extra requirement of implementing Comparable or Comparator in order to work.
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3003
    
    9
Winston Gutkowski wrote:
In fact, using a Set<Set<MyObject>> of any kind is potentially dangerous, because the inner Set will be the key to the outer one, and the specifications for both Set.equals() and Set.hashCode() state that they should be coupled to the Set's contents.

Yes. This is an example of a more general rule: all standard Set and Map implementations assume that you will not change an object after it's been used as a key in a Map, or inserted into a Set. If you change the object in any way (short of removing it completely), then you will most likely break the Map or Set. You won't be able to rely on the results. This is true for HashSet, HashMap, TreeSet, and TreeMap. And any other standard implementations I can think of.

Winston Gutkowski wrote:You can get away with it, providing either
(a) each inner Set is added with Collections.unmodifiableSet().
or
(b) any update to an inner Set replaces it, ie: remove(inner) --> modify contents of 'inner' --> add(inner).

To be extra clear here for S Chan: in the code you gave:

(let's pretend all those TreeSets are HashSets instead)

The problem here is that those innerSets have been put into the myNestedSet with nothing in them. That's fine, if the plan is for them to never have anything in them. But if you plan to put anything in them later, it's very bad. You need to put contents into the innerSets before you put them into the myNestedSet. E.g.:


Using Collections.unmodifiableSet() as Winston suggests could help make sure you don't accidentally try to modify a Set after it's been added to the outer Set.

This could also be expressed using Google Guava:
S Chan
Ranch Hand

Joined: Jul 25, 2011
Posts: 51
Winston Gutkowski wrote:
S Chan wrote:Thanks for the advice, but that solution may be a little troublesome in practice.

Erm...which one? I suggested a few. Or were you not talking to me?

Winston

I was talking about
remove(inner) --> modify contents of 'inner' --> add(inner)

It's workable, but I was just thinking for the case of Set<Set<MyObject>>. If I were to modify the something on MyObject level, I need to:
1 - remove the Set<MyObject> element from the Set<Set<MyObject>> collection;
2 - remove the MyObject element from Set<MyObject> and modify the element;
3 - add the element back to the extracted Set<MyObject> collection;
4 - stuff that collection back into the top level Set<Set<MyObject>>.

Is this logic right?
S Chan
Ranch Hand

Joined: Jul 25, 2011
Posts: 51
Thanks for all suggestions. I have surely learnt a lot! I am working to change it from TreeSet to HashSet.

Meanwhile, a thought raises - Let's say I am now using HashSet. If I were to add/remove/modify the elements after the Set<Set<MyObject>> is initialized, is there a better/efficient way of doing it?

If I were to lower my requirement a little - let's say I will handle the element uniqueness myself and not having the collections to handle that requirement organically, is List<Set<MyObject>> or even List<List<MyObject>> really a faster/efficient alternative? I am talking about being quick in storing/accessing/modifying after the collection is initialized.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7552
    
  18

S Chan wrote:Is this logic right?

Basically, yes; especially if the Set type either (a) uses a hashcode, or (b) implements SortedSet (which pretty much all the common ones do).

Note that this is NOT true of a List, which doesn't
(a) impose any ordering on it's elements (unless you sort it).
(b) restrict duplicates (unless you check for them yourself before adding/changing an element).
so you're free to change elements as you see fit; however, in the process, you will destroy any ordering you might have previously had.

And although it seems a bit of a mouthful, the three operations (remove, change and add) are all quite quick, especially if the collection is hashed.

Note that when your inner elements are themselves collections, you also have the possibility of a change that results in a "collision".
For example, if you had:
{ {a, b, c}, {a, b, d} ... }
a change to the second set to replace 'd' with 'c' now creates a duplicate, which is theoretically illegal.

That's why you have to remove before changing (it could be done without it, but the logic is likely to be even more tortuous).

Winston
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7552
    
  18

S Chan wrote:If I were to lower my requirement a little - let's say I will handle the element uniqueness myself and not having the collections to handle that requirement organically, is List<Set<MyObject>> or even List<List<MyObject>> really a faster/efficient alternative? I am talking about being quick in storing/accessing/modifying after the collection is initialized.

Unfortunately, the answer to that question depends on a lot of external factors such as size and usage. Checking a List of 50 elements for duplicates is likely to be pretty quick, 5,000 not so quick, and 5,000,000 darn slow; with a HashSet, it's pretty well instantaneous no matter how big the Set is.

I guess my other question would be: how do you plan on identifying your inner Sets? Set doesn't have a 'get()' method, only a contains(), so once you've added an element to a Set, the only way you can get it back is to iterate through it until you find the element you want. A HashMap, on the other hand, does have getKey() and getValue() methods, so if you made them both the same (ie, HashMap<MyObject, MyObject>) you'd have a Map that behaves pretty much like a Set.

I think this is getting a bit hairy though. My suggestion would be to back up and think about what your requirements really are:
Do you really need to order your inner Sets? If so, how often do you need to do it?
Are duplicates (and, by extension therefore, "collisions") significant?
How big are these collections (inner and outer)?

Answering those questions might help you towards a solution; and if you have any further queries, feel free to come back.

Winston
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38075
    
  22
Winston Gutkowski wrote: . . . Note that this is NOT true of a List, which doesn't
(a) impose any ordering on it's elements (unless you sort it). . . .
That isn’t really clear. What you mean is that a List (also called a Sequence) by default stores its elements in the order they were inserted, until you sort the List.
You can have Sets which do not support any particular order, or Sets which sort on insertion (eg TreeSet) or Sets which record the insertion order (eg LinkedHashSet, which is a combination of a Set and a List). A mathematician will tell you that a set does not support ordering or sorting at all, whereas sequences both support insertion order and duplicates.

You have several times correctly pointed out the hazard of inserting mutable objects into a Set. If you change an element in a Set, not only may you create duplicates, but also you can change the hash code or behaviour of the comparison methods, so it becomes impossible to find the elements again. Try this:You can get the same sort of errors with hash-based Maps, and also with Trees.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7552
    
  18

Campbell Ritchie wrote:You can have Sets which do not support any particular order, or Sets which sort on insertion (eg TreeSet) or Sets which record the insertion order (eg LinkedHashSet, which is a combination of a Set and a List). A mathematician will tell you that a set does not support ordering or sorting at all, whereas sequences both support insertion order and duplicates...

Fair enough; however, most of the common implementations of Set impose either an order or some kind of hashed storage of elements; neither of which is conducive to element change.

Winston
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38075
    
  22
Agree: the lesson is about the peculiar behaviour of mutable objects in Sets and other hashed collections.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Nested Set?
 
Similar Threads
K&B chapter 7 question 15. Collections.
Threads and Synchronization examples
Various questions regarding OCPJP
Dan Exam Q on Collections Doubt?
code