aspose file tools*
The moose likes Performance and the fly likes List of lists performance question Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Performance
Bookmark "List of lists performance question" Watch "List of lists performance question" New topic
Author

List of lists performance question

Rj Meyers
Greenhorn

Joined: Aug 07, 2009
Posts: 3
(note: I had a problem posting this initially, so if this is a double-post I apologize and will try to remove the second one as soon as possible.)

First, let me say thank you to this entire forum. I've only just registered now, but I've been struggling to familiarize myself with Java for the past several months and the advice on this forum has by far been the most complete and easy to understand of any place I've found.

Onto my question: I am planning on implementing a large ArrayList container (referred to here as the Universe) that will hold other ArrayLists in each row. These rows will represent star systems, so element [0] is the parent star and every object after that in the row is either a planet or starship. The Universe could contain several thousand stars systems (rows), while each star will contain something like 5 or 10 planets (objects). However, some stars may end up containing several thousand objects if starships are present. I will want to swap these starships between stars (rows of the Universe) from time to time.

Now, I have a table of relative performance metrics of various container types from Bruce Eckel's "Thinking in Java" that shows ArrayLists being relatively quick when adding objects to their ends, but being very slow at large sizes when inserting objects at some intermediate location. If I take 1000 objects from one star system (row) of the Universe list and add them to the end of another star system (row) of the Universe, I will be adding on to that other star system's list--but will I also be inserting into the middle of the Universe? Will I incur the penalty for random insertion?

And if you have any recommendations for alternative structures, I'm all ears. What I like about this structure is that I could add moons to each planet, etc. by making a list of lists of lists and know that the first element of every container is the "parent" of the subsequent objects (a star is the first element of each star system list, a planet is the first element of each planet/moon system list, etc.).

Thank you.
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 30919
    
158

Rj,
No double post. (Sorry about that - we're on a temporary server this week and having some problems.)

I'm wondering if you really need a list or if a set will do the job. While planets have a logical order, starships don't seem to. I don't see a logical order for star systems either for that matter.

These rows will represent star systems, so element [0] is the parent star and every object after that in the row is either a planet or starship.

This actually strikes me as a design thing. You don't want elements of a list to have different meanings based on their position. That's a red flag that you really should have a data structure. For example what if your universe was a list (or set) of StarSystem objects. A StarSystem could contain a String for the parent star name, an ordered list of planets and a set of starships. This design approach also lets the code be clearer.


[Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
Rj Meyers
Greenhorn

Joined: Aug 07, 2009
Posts: 3
Thank you for the design suggestion--that makes much more intuitive sense than what I was proposing. You can see my own programming background trying to force itself into my thinking. I'm used to Matlab or C and working with heavy data-crunching algorithms and large fields of data that are handled with special care by programs, without much OO consideration. I'll try your suggestion, or something very close to it.

Thank you!
Paul Balm
Ranch Hand

Joined: Dec 13, 2008
Posts: 63
I agree with the design idea in the previous post, but since in that design, the StarSystem object will also contain collections of planets/starships, the general question remains:


If I take 1000 objects from one star system (row) of the Universe list and add them to the end of another star system (row) of the Universe, I will be adding on to that other star system's list--but will I also be inserting into the middle of the Universe? Will I incur the penalty for random insertion?


The answer is no. :-)

Your Universe contains object references to the StarSystems. The actual StarSystems can be stored anywhere in the JVM's memory. So if you're adding 1000 planets/starships to a StarSystem, you're not actually putting this in the middle of the Universe.

A List of Lists in Java has nothing to do with a two-dimensional array in C, C++ or FORTRAN. An a two-dimensional array in Java is more like a List of Lists in Java, than like a two-dimensional array in C.

Cheers-


SCJP 1.4 -- SCJD Java 2 -- OCM JEA 5
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16233
    
  21

I think you can sum this up in CompSci terms as "sparse 2D array with insertions".

The performance penalty in inserting rows in the middle of the list has to do with the fact that you have to move up all the displaced object references ("pointers") in order to open up the slot that will hold the added row. It's "slow", but not as slow as if you were actually moving all the data items themselves.

However, Java has quite a few more ways of storing data collections than just ArrayList, so if that's a bottleneck, you could always try a different container class. A Map can use "row numbers" as keys, for example, and with a Perfect Hash", the insertion penalty is minimal. Though you'd pay for it somewhat when doing sequential row retrievals.


Customer surveys are for companies who didn't pay proper attention to begin with.
Martin Juranek
Greenhorn

Joined: Nov 12, 2009
Posts: 2
If you want traverse all entries in map, you should consider LinkedHashMap. But it has slower put/remove operations, so it is not always way to go.

Btw, if spatial coordinates are important for objects (you want to use queries like "all objects closer than X from StarXYZ"), you should use structures like quad trees.
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16233
    
  21

Some good resources for things like that include Knuth's Art of Computer Programming and the MIT Algorithms textbook.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: List of lists performance question