aspose file tools*
The moose likes Performance and the fly likes Nested loops or Set operations? Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Performance
Bookmark "Nested loops or Set operations?" Watch "Nested loops or Set operations?" New topic
Author

Nested loops or Set operations?

Robin Tropper
Greenhorn

Joined: Jun 21, 2012
Posts: 2
Is there any advantage to using Java's operations defined in sets or collections over inspecting in nested for-loops?

For example, suppose I want to keep items from one array that do not appear in another, I could do:


Or else I could do:


I personally see no improvement in legibility nor any reduction of error potential.....

Any comment or observation is appreciated.
Thanks!
Teo Filimon
Greenhorn

Joined: Aug 18, 2013
Posts: 14

I wrote a mini-article a while back about looping ArrayLists. Essentially it's a lot faster when you loop using a good old-fashion counter, like:

But i would only write it like that if i really care about speed. Otherwise, i'm not sure i understand your question or the 2 code samples, i'm not sure they're doing the same thing
Teo Filimon
Greenhorn

Joined: Aug 18, 2013
Posts: 14

At a second read, i think i understand your question. Normally it doesn't really matter which one you loop (array or arraylist - however making arraylists will prevent you from using primitives which might be costly with boxing/unboxing). There might be some nuances with sorted data structures (slower to add objects, faster to check if it contains certain elements) but it really depends on the case.
James Boswell
Bartender

Joined: Nov 09, 2011
Posts: 1021
    
    5

Teo

I would strongly contest your "a lot faster" claim about a loop using a counter. In nearly all cases, the performance difference will be negligible.

The foreach syntax is much more readable and less error prone.
Teo Filimon
Greenhorn

Joined: Aug 18, 2013
Posts: 14

2 to 3 times faster is not really negligible (but i only mean ArrayList). But if you care more about readability i would also suggest the foreach syntax.
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3014
    
  10
Addressing the original question, it's worth noting that if you look at a HashSet or TreeSet, a method like contains() can be MUCH faster than looping through an iterator or using a traditional for loop on an ArrayList or other structure. Depending on what operations you need to optimize, different data structures can offer different benefits. This can be much more significant than the iterator-vs-traditional-for-loop distinction, which (I agree with James) is often remarkably unimportant.
James Boswell
Bartender

Joined: Nov 09, 2011
Posts: 1021
    
    5

Teo Filimon wrote:2 to 3 times faster is not really negligible

Do you really mean 2 to 3 times faster? Or 2 to 3 milliseconds faster? Sorry to go off topic from the author's question but you shouldn't make a claim like this unless you have some figures to back it up.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    
    8

Robin Tropper wrote:I personally see no improvement in legibility nor any reduction of error potential.....


Well, your second example avoids the error of comparing two objects using == instead of equals(), which error you committed in the first example. However the problem with the second example is that it doesn't go far enough in using the built-in methods. Here's a third example:



In this code baseElements and elementsToExclude can be any kind of Collection.

When you read the code it's reasonably obvious what it's doing (and notice that the variable naming helps with that).
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3014
    
  10
James Boswell wrote:
Teo Filimon wrote:2 to 3 times faster is not really negligible

Do you really mean 2 to 3 times faster? Or 2 to 3 milliseconds faster? Sorry to go off topic from the author's question but you shouldn't make a claim like this unless you have some figures to back it up.

Personally I've seen it as much as almost twice as fast. Not more. Of course, it all depends on what else you're doing inside the loop. If you're doing anything remotely complicated, it's very easy for the difference in these techniques to appear nonexistent. But it's possible to construct scenarios where you do something useful in the loop (enough to prevent the JVM from optimizing the whole thing away to nothing) and you do see a significant difference. Here are some sample test cases to run:
Teo Filimon
Greenhorn

Joined: Aug 18, 2013
Posts: 14

@James I have my own performance tests to back it up. I've seen that depending on the arraylist size the boost varies between 2x and 3x. Also, please check: http://developer.android.com/training/articles/perf-tips.html#Loops
James Boswell
Bartender

Joined: Nov 09, 2011
Posts: 1021
    
    5

Mike, thanks for the example. Nice to see you also took the time to indent the output

Teo, fair enough if you have your own results and some evidence to back it up. It is very easy for new developers to follow advice to the letter and someone might avoid the for-each syntax based upon comments like "2 to 3 times faster".
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610
    
  60

Mike Simmons wrote:Addressing the original question, it's worth noting that if you look at a HashSet or TreeSet, a method like contains() can be MUCH faster than looping through an iterator or using a traditional for loop on an ArrayList or other structure.

It really depends. For small sizes, iterating over a primitive array (not ArrayList) can be much faster than a HashSet. For larger sizes, binary search over a pre-sorted primitive array can still beat a HashSet.

Apparently ArrayList iterators is slowed down by the comodification checks. Interesting. In any case, a for-each loop didn't ever turned out to be a bottleneck in a profiler for me yet. But I'm only developing in Java for several years
Robin Tropper
Greenhorn

Joined: Jun 21, 2012
Posts: 2
AMAZING DISCUSSION - thank you all for your replies!
That more than answers my question - SUPER!
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7803
    
  21

Martin Vajsar wrote:It really depends. For small sizes, iterating over a primitive array (not ArrayList) can be much faster than a HashSet. For larger sizes, binary search over a pre-sorted primitive array can still beat a HashSet.

I guess the question is: how fast do you need it to be?

HashSet's offer O(1) contains() capability, as opposed to O(log(n)) for a TreeSet or binary search - the latter of which requires a sort() - so if I was doing it, I think I'd be looking at HashSet, simply because I reckon it would scale better.

@Robin: So, in answer to your original question: "Is there any advantage to using Java's operations defined in sets or collections over inspecting in nested for-loops?", the answer is: Yes.

And furthermore, a linked "Hashed" collection (LinkedHashSet/LinkedHashMap) might be the best of all. If List elements can't be duplicate, or if you're not worried about them, then I'd say that LinkedHashSet.removeAll() is what you want; but if you need to account for possible duplicates, then what about a:
LinkedHashMap<T, AtomicInteger>
where the AtomicInteger is the count of duplicate items in the source List<T>.

I'll leave you to work out how that might be useful.

Winston

Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Nested loops or Set operations?