Meaningless Drivel is fun!*
The moose likes Performance and the fly likes Initial Capacity of Collections Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Performance
Bookmark "Initial Capacity of Collections" Watch "Initial Capacity of Collections" New topic
Author

Initial Capacity of Collections

Ricardo Cortes
Ranch Hand

Joined: Jan 23, 2002
Posts: 140
I'm running the Analyze Code feature of IntelliJ on my classes and one of the suggestions it gave me was to add initial capacities to my collections (ArrayLists, HashMaps, etc). Page 101 of "Java Performance Tuning" states the following:

Vector grows by creating a new larger internal array object, copying all the elements from the old array, and discarding it...presizing a collection to its largest potential size reduces the number of objects discarded

So, what happens if I don't know what the size of the collection will be? Perhaps I'm creating an ArrayList from a datbase call. Should I initialize the collection to some arbitraryily large number or should I simply use an empty ArrayList. Any input on this subject would be great as I'm attempting to come up with a performance methodology for my company.


Sun Certified J2EE Architect for the J2EE Platform (Part 1)<br />Sun Certified Web Component Developer for the J2EE Platform<br />Sun Certified Programmer for the Java 2 Platform
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
I presize Collections and StringBuffers in two cases. The first is the obvious one: when you know exactly how many elements it will hold. The second is when I expect a "large" number of elements.

Resizing an ArrayList once or twice isn't going to cost you that much, but if you're executing a query that you know for certain (by analyzing the data or knowing the domain well) will return at least 300 elements, on average 500, and at most 1000, you may as well presize it at 500 at least to avoid ten resizes. ArrayList grows by half each resize rather than Vector's doubling.

The key to remember is that wasting 500 empty reference slots is only 2k. On today's hardware that doesn't even count as lint. I would prefer that to 10 extra allocations and 10 + 15 + 23 + 35 + 53 + 80 + ... + ~500 unnecessary copying of references.

As my programs rarely require Maps larger than 10 or 20 elements, I've never bothered to presize those. Be sure that you really understand how HashMap works, enough to be able to choose values that will provide good distribution. If not, you might do more harm than good.
Ricardo Cortes
Ranch Hand

Joined: Jan 23, 2002
Posts: 140
Very good information...thanks!
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
I didn't yet care about it, and as far as I can tell wasn't hurt by it yet - the performance bottlenecks always proved to be elsewhere.


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
steve souza
Ranch Hand

Joined: Jun 26, 2002
Posts: 861
I agree with Ilja. With few exceptions perfromance bottlenecks in my apps tend to be IO (database, network, file). I read a recent posting where someone put 1,000,000 entries into a HashSet in about .25 seconds.

Monitoring the actual performance of your code is the way to go. Don't spend much time tuning until you have measurements. People that tune first usually tune code that makes no substantive difference to application performance. For example you could write a few lines of code that tested the impact of various initial sizes of ArrayList and see how/if they affect performance.
[ May 01, 2005: Message edited by: steve souza ]

http://www.jamonapi.com/ - a fast, free open source performance tuning api.
JavaRanch Performance FAQ
 
Don't get me started about those stupid light bulbs.
 
subject: Initial Capacity of Collections