permaculture playing cards*
The moose likes Performance and the fly likes Enumeration vs Iterator Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of OCM Java EE 6 Enterprise Architect Exam Guide this week in the OCMJEA forum!
JavaRanch » Java Forums » Java » Performance
Bookmark "Enumeration vs Iterator" Watch "Enumeration vs Iterator" New topic
Author

Enumeration vs Iterator

Vad Fogel
Ranch Hand

Joined: Aug 25, 2003
Posts: 504
Is it right that Enumeration on average performs about 50% faster than Iterator for sequential access of the collection elements?
Thanks.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24183
    
  34

Enumeration and Iterator are Interfaces; for the most part the only difference is in the names of the methods. It would be possible to implement them both identically. Therefore no, it's not true. In the general case, there's no difference whatsoever.
Now, in the context of one specific container, which will have its own implementations of Enumeration and Iterator, there may indeed be differences in these specific implementations.


[Jess in Action][AskingGoodQuestions]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Well, Iterator actually adds one method that Enumeration doesn't has: remove(). I can't think of this as inherently leading to a performance difference, though.


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Vad Fogel
Ranch Hand

Joined: Aug 25, 2003
Posts: 504
Thanks for your responses, Ernest and Ilja!
Before I asked this question, I'd tried using Iterator and Enumeration to compare their performance on a Vector object containing 100 000 Strings. Enumeration was consistently about 50% faster. I'm using Win XP with SDk 1.4.2.
Thanks.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
What test code are you using, Vad?
One thing about Iterator is that it's expected to be fail-fast - throwing ConcurrentModificationException if it detects that the underlying collection is modified while iteration is underway. This extra checking may take some more processor time - but it's often worth it. If you do accidentally create a situation where a collection is being modified while being iterated over, it might be difficult to diagnose otherwise. You might get NullPointerException or ArrayIndexOutOufBoundException instead, which might make you think the fault is with the collection code - when in fact it's your fault for allowing concurrent modification.
Also in general for a Vector or ArrayList (or for any RandomAccess implementation) it's faster to loop with a for loop that directly accesses elements using an index:

Of course, if you get a NoSuchElementException from this code, it's probably because you allowed some concurrent modification from another thread. The price you pay for speed is that you have to diagnose that yourself.
[ February 16, 2004: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Vad Fogel
Ranch Hand

Joined: Aug 25, 2003
Posts: 504
Jim, I used something rather simple for my test:

The average performance for Iterator in ms was about 140, whereas for Enumeration it was 110. Now, if you've noticed, the iterator comes first, which means it can (and will) enjoy the fresh memory resources, and that should screw up the Enumeration performance a bit (GC may not run always).
When you swap the blocks, the performance difference stands out more distinctly: 80 ms for enumeration and 180 ms for iterator. If tested separately, the blocks yield the following averages: 80 ms for enumeration, 135 ms for iterator.
Thanks.
[ February 16, 2004: Message edited by: Vad Fogel ]
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
    
    5
Looking in the actual source code - we see that iterator for Vector uses the AbstractList method - it returns an instance of a private class inside AbstractList - looking at the next method we see:

The Vector Enumeration on the other hand has synchronization but is otherwise simpler.
Looks like that checkForComodification call is what introduces the extra time.
Bill
Kirk Pepperdine
Author
Ranch Hand

Joined: Jan 17, 2001
Posts: 71
Humm, the problem here is that the benchmark doesn't consider the effects of GC, OSR and a number of other issues (running MBM in main). Try running the benchmark several times and then calculate the variance. If it's quite aways from being a non-zero value(which is will be), then something is interfering with the test. In other words, I see no reason for my being able to trust your results. In fact, I see every reason to doubt them.
But, it's easy to be critical and not offer any solutions so, instead of looking mean, I'd like to appear to be helpful.
First, the interface issue is quite valid so the tests results cannot be extrapolated to other collections. Here are some tips for doing the MBM
1) put the test into it's own (non-static) method.
2) call that method at least once to pre-heat the jit'ed method cache.
3) call System.gc() and then Thread.sleep(500 or 1000ms) to let the system settledown before taking the measurement.
4) call the test a number of times.. use statistics (standard deviation, max, min, median) to tell you when you have a good run. Remember, you are looking for as close to a non-zero variance in the results as is useful.
5) run the test with the -verbose:gc flag... this will tell you if you've eliminated GC as a source of noise. If not consider techniques to eliminate object creation/gc from the test. These include precreating and holding onto objects in an array. The array can be released outside of the timed loop. One can also -Xms and -Xmx to increase the size of heap. Make a call to system.gc between tests. don't forget to sleep to allow the system to settledown after calling for a GC. Do no use test results with the -verbose:gc flag set as these number will be effected by the flag
6) do the test for iterators
7) repeat the test for enumerations.
8) report average, sample size, and variance (max, min and median would also be nice).
HTH,
Kirk


Author of <a href="http://www.amazon.com/exec/obidos/ASIN/0672324261/ref=jranch-20" target="_blank" rel="nofollow">Ant Developer's Handbook</a>
Dark Vachor
Greenhorn

Joined: Oct 13, 2004
Posts: 14
Sorry guys

I ran my own tests which show that Enumeration is nearly twice faster than Iterator, and takes far less memory, enabling the GC to sleep more ;-)
I'm not really surprised of it, since Enumeration is quite basic and fits basic needs, while Iterator tries to see if there are concurrent modifications or so, and consequently takes more power.
SUN recommends usage of Iterator beacause :
- They sell hardware
- There are so many poor developers out there

Kind regards
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Dark Vachor:
Iterator tries to see if there are concurrent modifications or so, and consequently takes more power.


Good point.


SUN recommends usage of Iterator beacause :
- They sell hardware
- There are so many poor developers out there


Multithreading problems are really hard to find, so I am willing to use all the help I can get. I think of that as rather professional behaviour. YMMV.
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923

Running my modified version of the benchmark, I got similar results - 50% time earning by enumeration with 100,000 Elements.
Then I went for 1,000,000 Elements and got nearly no difference between Enumeration and Iterator.
Tried again with 100,000 - no difference too.
1,000 Elements - no difference too. (server - client/ 1.4 - 1.5).
Made a minor modifikation in the source, - no difference too.
I cannot reproduce the difference.
Perhaps tomorrow, after rebooting.

Conclusion: If you have a java-program - run it twice?


http://home.arcor.de/hirnstrom/bewerbung
Dark Vachor
Greenhorn

Joined: Oct 13, 2004
Posts: 14

Multithreading problems are really hard to find, so I am willing to use all the help I can get. I think of that as rather professional behaviour


I think professional attitude is determining the best thing to do according to the risks of wrong usage of your classes. I have a very strong experience in multi-threading, and have no problem with it. That's probably while I feel tuning for performance is best ;-)


Conclusion: If you have a java-program - run it twice?


I did it, of course, I'm no greenhorn ;-) ... in reality

In fact you probably forgot the time to create the Iterator or Enumeration objects. Creating many Iterator or many Enumeration does cost the same !!!
Try a more intensive program like this :


(1000 collection of 4000 elements, loop 100 times on each => 4000000 elements, 400000000 element access)
Then just switch the Enumeration to an Iterator, and see by yourself :-)
Tip : use -Xmx256m at least.
My results :
Enumeration : 46852 ms
Iterator : 85546 ms

[edited the code so that it doesn't screw up the page layout because of overly long lines - Ilja]
[ October 14, 2004: Message edited by: Ilja Preuss ]
Dark Vachor
Greenhorn

Joined: Oct 13, 2004
Posts: 14
Sorry I meant "Creating many Iterator or many Enumeration does NOT cost the same !"

Bu the way, keep in mind that many people think Java is slow. And they are right because too many programmers don't care with programming for good performances. Same for memory footprint.
Multi-threading is likely to be useful. Performance is always appreciated, and sometime performance requirement is an iten of a contract.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Dark Vachor:
Bu the way, keep in mind that many people think Java is slow. And they are right because too many programmers don't care with programming for good performances.


What I've seen more often is developers wasting time doing micro-optimizations, often even making things worse that way.


Performance is always appreciated


And the way to get the best performance for the Customers money is to *not* optimize every part of the code. Instead, optimize for development time and use the saved time to find and remove bottlenecks.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Dark Vachor:
I think professional attitude is determining the best thing to do according to the risks of wrong usage of your classes. I have a very strong experience in multi-threading, and have no problem with it. That's probably while I feel tuning for performance is best.


Well, according to your data, the cost of using Iterator instead of Enumeration is roughly 0,0001 ms per element on your machine using your JDK. I'm confident that in most cases in real programs this won't make a difference at all.
Dark Vachor
Greenhorn

Joined: Oct 13, 2004
Posts: 14
Of course it is pretty fast. Do you mean Java is slow ? ;-)
If everyone doesn't care about doing stuff with means that are slow, programs are slow, and Java (!) is considered slow (they should say the JVM...). Because programs are a matter of creating objects and calling methods, we should take care of each object and each call.

Back to original subject, my benchmark shows that Enumeration is nearly x2 faster than Iterator...
Dark Vachor
Greenhorn

Joined: Oct 13, 2004
Posts: 14
BTW Ilja, thank you for making the program I posted more readable with good line wrapping.

One more point on the subject, Enumeration is still here in J2SE 1.5, if it was to be replaced by Iterator in all cases, shouldn't it be declared deprecated ?
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
    
    5
I imagine that, just like Vector, Enumeration is used in way too many places to even think of deprecating it.
Bill
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Dark Vachor:
If everyone doesn't care about doing stuff with means that are slow, programs are slow


Take a look at the dimensions of your loop!

If I reduce your code to 400.000 (!) iterations, it runs in <40ms on my machine. I think in many circumstances you wouldn't even notice that the program is spending time at all - even showing a simple output to the user is likely to take more time.


Because programs are a matter of creating objects and calling methods, we should take care of each object and each call.


Not of each, but of the important ones. It's the 90/10 rule at work: allmost every system is spending around 90% of its time in 10% of its code. That is, if you optimize the wrong 90%, you only get a 10% performance improvement, but if you find the bottlenecks, you get a significantly better performance for less effort. That is why profiling is so important before optimizing code.


Back to original subject, my benchmark shows that Enumeration is nearly x2 faster than Iterator...


1.5x on my sytem when using your code. And when you make your loop actually *do* something, it becomes even less.
Dark Vachor
Greenhorn

Joined: Oct 13, 2004
Posts: 14
I imagine that, just like Vector, Enumeration is used in way too many places to even think of deprecating it.

Enumeration is not that used... they could have deprecated it.
For example they deprecated Date:: getDate(), getDay(), getHours(), getMinutes(), and so on. This methods were often used in many programs !


Take a look at the dimensions of your loop!
If I reduce your code to 400.000 (!) iterations, it runs in <40ms on my machine. I think in many circumstances you wouldn't even notice that the program is spending time at all - even showing a simple output to the user is likely to take more time.

It's for testing. This way you see (when you keep the memory messages) the way the GC behaves.

Not of each, but of the important ones. It's the 90/10 rule at work: allmost every system is spending around 90% of its time in 10% of its code. That is, if you optimize the wrong 90%, you only get a 10% performance improvement, but if you find the bottlenecks, you get a significantly better performance for less effort. That is why profiling is so important before optimizing code.

Of course I agree.

1.5x on my sytem when using your code. And when you make your loop actually *do* something, it becomes even less.

The test does not refect reality, I think most people agree on that.
This is a test, and it tests only Enumeration vs Iterator.
Paul Horn
Greenhorn

Joined: Nov 03, 2004
Posts: 2
Am I missing something?

I've been looking for some information on the cost of creating variables inside large loops, instead of creating the variable outside the loop & then simply assigning it.

In other words, does it matter to speed or memory or both to do this:



rather than re-delcare the variable inside the loop:



I tried messing with Dark Vachor's code post above, moving Enumeration, int and Vector objects outside the main loop. So now the main loop looks like this:


Now that the loop is not doing anything but iterating, why does it still take up megabytes of memory? What's it allocating?

- Paul
[ November 04, 2004: Message edited by: Ilja Preuss ]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Paul Horn:
Now that the loop is not doing anything but iterating, why does it still take up megabytes of memory? What's it allocating?


See this discussion.
Paul Horn
Greenhorn

Joined: Nov 03, 2004
Posts: 2
Thanks for tidying up my post! :-)

And thanks for the pointer to the other discussion. That's what I thought I had learned during my research - until I was poking around with the subject of this discussion, Enumerator vs. Iterator. We recently had a programmer change a handful of interfaces from one to the other, and he claimed "better performance" for his rationale. I was using this discussion & the sample code to vet his theory.

However, when I ran the code example for timing differences (w/sun jdk 1.4.1_05 on NT), I noticed that it consistently gobbled up several megabytes of free memory during every execution; the "free memory" would drop steadily with each iteration, and end up significantly lower at the end.

Seeing that the loop is merely iterating and assigning, not explicitly allocating any new variables, why is the program continually allocating more and more memory? I thought maybe the reason was the local variable in/out of the loop, but that appears to make no difference (the memory consumed was exactly identical either way).

Any ideas? Am I blind and missing something obvious?

Thx -
- Paul
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
The statement l_v.elements() is creating a new Iterator object. And all those objects will take up heap memory until the VM decides to garbage collect them.
imaya Munusamy
Greenhorn

Joined: May 19, 2008
Posts: 20
The difference is mainly because of cost extra method calls checkForComodification() and get(). method calls has to do some work to set up stack frame and return value to caller

it you implement your own implementation avoiding this methos then it 'll be same as enumaration.

public Object next() {
//checkForComodification(); --comment

try {
//replace the get method with the get method implementation directly
//Object next = get(cursor);
/*****get method implementation start ******/
if (cursor >= elementCount)
throw new ArrayIndexOutOfBoundsException(cursor);

Object next = elementData[cursor];
/*****get method implementation end ******/

lastRet = cursor++;
return next;
} catch (IndexOutOfBoundsException e) {
checkForComodification();
throw new NoSuchElementException();
}
Daniel Breitner
Ranch Hand

Joined: Nov 13, 2008
Posts: 70
Has anyone of you tried to compare an Iterator vs. an Enumeration with removing Elements from a list (the only usecase I would recommend an iterator)?
In the enumeration case it would require to create a second list holding the non-removed Elements... sounds imperformant to me.


Visit me at http://liferay-blogging.blogspot.com
Anup Mulik
Greenhorn

Joined: Dec 18, 2010
Posts: 8
It's better to be safe rather than to be in trouble as below.



SCJP 5,SCWCD 5,SCBCD 5,SCDJWS in progress.....
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7718
    
  20

Anup Mulik wrote:It's better to be safe rather than to be in trouble as below.

I'm not quite sure what your code is meant to show (and BTW, please UseCodeTags (←click) when posting code).

Also: Since the last post to this thread (started in 2004) was in January 2011, I suspect Elvis has left the building.

Winston

Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610
    
  60

Anup Mulik wrote:

I believe this code would cause an OutOfMemmoryException, not an infinite loop. So, in this particular example there isn't much difference. But generally I agree that using Vector's iterator (which is documented to provide the co-modification check) is better that enumerator, because it helps catch possible bugs earlier.
 
 
subject: Enumeration vs Iterator