This week's giveaway is in the EJB and other Java EE Technologies forum. We're giving away four copies of EJB 3 in Action and have Debu Panda, Reza Rahman, Ryan Cuprak, and Michael Remijan on-line! See this thread for details.
Hi, I have an arraylist of 'Address' objects, each object stores a persons name, address details, etc, with appropriate get / set methods. I was wondering what is the best way to search this list. Say for example I want all the people with the surname 'smith'; and there may be many. At the moment I create an iterator of the arraylist and run through each object. At each iteration I say - if object.getSurname.equals("Smith") - and then, if surname does equal smith I add the object to a new arraylist which, at the end of the iteration, is then returned. This works pretty well, and for small lists is no problem. I am worried though that with larger lists this is a slow and tedious way of doing it. Questions are: should I be using a different collection to a List? Is there a better, faster, more efficient way to search through lists? Thanks in advance for any help / advice Joel P.S. If this question is more appropriate for another dicussion group, please let me know, wasn't sure where to ask...
Could you say if you are interested in searching on any other variable besides surname, like city, zip code or something you include in "etc" ?
Joined: Jun 15, 2001
Hi Colin, That's a yes. I would want to be able to search on a number of different variables. at the moment I have a number of methods for example getAddressListBySurname(String theSurname), getAddressListByFirstname(String theFirstname)...
Joined: Aug 06, 2001
Joel, If just one variable were being queried then I would have gone for something like ArrayList, Vector or Hashtable. In Collections you cannot have duplicates in a Set. However as you want various ways to query this I think you could use a TreeMap, providing a suitable Comparator to the constructor. In this way you could build a TreeMap for each search type you plan on doing. After adding the items to the TreeMap there is a SubMap() method you can call to return a particular value or range of values. You can get a Collection (and hence iterator) from the returned SortedMap's values() method. phew! Of course all this is theoretical as I have never used a TreeMap before, but I think it should work - what do the moderators think ??
I think the TreeMap idea is a good one, but the memory overhead might get kind of high, especially since you would need a different TreeMap for each quality you wanted to search on. With the TreeMap scheme, I think you would experience memory problems much sooner (smaller list) than you would experience performance problems with your current linear search. In addition, adding elements to a TreeMap is a log(n) operation (I think), compared to the constant (usually) time of adding to an ArrayList. This means that it might take longer to build each Tree than it is currently taking to build your list. On the other hand, resizing of ArrayLists during adding of elements can dry up the performance savings as compared to other non-constant time collections. Therefore, building an ArrayList as compared to a TreeMap might not save much time unless you knew ahead of time the size (roughly) of the list. It all depends on how often you need to do the searching and how much memory you have. If you need to do constant searching in real time, give the TreeMap idea a try. If you only need to search once or twice then you might be better off just sticking with the linear search.
Unless you specifically want the sort order, a HashMap is more efficient than a TreeMap (constant time as opposed to log(n) for element insertion and searching). But does it make sense? How many such searches are you doing? Constructing an index will take O(n) or O(n log(n)) time. You can probably do at least half a dozen simple linear searches in the time it will take to construct a HashMap; even more for a TreeMap. Are you searching on all fields all the time, or do you tend to do a bunch of searches using one field only? In the first case you'd need to maintain maps for all the properties (expensive), in the last case you would, right before firing up the searches, index just the property you need. - Peter