aspose file tools*
The moose likes Performance and the fly likes How to do better than the retainAll method of the List or Set class? Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Performance
Bookmark "How to do better than the retainAll method of the List or Set class?" Watch "How to do better than the retainAll method of the List or Set class?" New topic
Author

How to do better than the retainAll method of the List or Set class?

Stephane Clinckart
Ranch Hand

Joined: Oct 21, 2003
Posts: 89
Hi,

I want to improve a method I use.

Let me show you a part of the code...



Any Idea how to do better???

Thanks a lot.
ganesh roti
Greenhorn

Joined: May 16, 2010
Posts: 17

It depends what exactly you want to do and how you are filtering out. The same code i restructured as below and gave a major performance benfit. Try your luck..

public class ListUtilTest
{

private final Collection<String> multipleOf2 = new ArrayList<String>();
private final Collection<String> multipleOf3 = new ArrayList<String>();

private final Map<Long, String> multipleOf2_Map = new HashMap<Long, String>();
private final Map<Long, String> multipleOf3_Map = new HashMap<Long, String>();
private final Collection<String> commonCol = new ArrayList<String>();

public ListUtilTest()
{
initCollections();
}

private void initCollections()
{
for (long j = 1; j < 100000; j++)
{
if (j % 2 == 0)
{
//multipleOf2.add("" + j);
multipleOf2_Map.put(j, (""+j));
}
if (j % 3 == 0)
{
//multipleOf3.add("" + j);
multipleOf3_Map.put(j, (""+j));
if (multipleOf2_Map.containsKey(j))
{
commonCol.add("" + j);
}
}

}
}

public void testGetCommonMultipleOf2_3()
{
final ListUtil<String> listUtil = new ListUtil<String>();
//final Collection<String> commonElements = listUtil.getCommonElements(multipleOf2, multipleOf3);
System.out.println(commonCol.size());
}

public static void main(String[] args)
{
final long msSecConvFactor = 1000;
final long startTime = System.currentTimeMillis();


final ListUtilTest lUTest = new ListUtilTest();
lUTest.testGetCommonMultipleOf2_3();

final long endTime = System.currentTimeMillis();

final long diffSec = ((endTime - startTime) );
System.out.println("Time required: "+diffSec+" milliSec");


}
ganesh roti
Greenhorn

Joined: May 16, 2010
Posts: 17
But if you want to make generic solution by passing two collections to a set and expecting the comoon elements then your code is the most optimistic in that case ..
Stephane Clinckart
Ranch Hand

Joined: Oct 21, 2003
Posts: 89
ganesh roti wrote:But if you want to make generic solution by passing two collections to a set and expecting the comoon elements then your code is the most optimistic in that case ..


I want effectivly to have a generic code...

Another part of this class is:


It works... but it is not performant at all !!!

It take me 18 seconds to run the first sample !

I suppose it can be optimized !

This code (that don't do the same stuff) is optimized...
--> it take only half of a second with the same collections as in first sample:


Thanks for help.
R van Vliet
Ranch Hand

Joined: Nov 10, 2007
Posts: 144
Here you go, about 2,500 times faster on my computer and should be relatively generic :
D. Ogranos
Ranch Hand

Joined: Feb 02, 2009
Posts: 214
You can also try this:


Saifuddin Merchant
Ranch Hand

Joined: Feb 08, 2009
Posts: 606

D. Ogranos wrote:You can also try this:




Similar to the collection interface retain all - except that its in reverse. I don't think it would give any better performance. The main bottleneck would be the 'contains' method that iterates over all the elements of 'elems' for each element of the 'first' set. Right?


Cheers - Sam.
Twisters - The new age Java Quiz || My Blog
D. Ogranos
Ranch Hand

Joined: Feb 02, 2009
Posts: 214
Sam Mercs wrote:Similar to the collection interface retain all - except that its in reverse. I don't think it would give any better performance. The main bottleneck would be the 'contains' method that iterates over all the elements of 'elems' for each element of the 'first' set. Right?


Actually its performance is better. The main trick is to put one of the lists into a (Hash)Set, so that the contains() method gets constant complexity. It might also be worth it to check which list is shorter and then put one or the other into the Set...I haven't tried around with that option.
Also the retainAll() seems to keep the list elements in order, while this method shuffles them around. So sorting may be neccessary.

Some quick tests compared to the original method with retainsAll() (with two lists of each 100 strings, and 10 common elements):
10.000 calls: 125ms vs. 203ms
250.000 calls: 1077ms vs. 4508ms
1.000.000 calls: 4009ms vs. 18079ms
Saifuddin Merchant
Ranch Hand

Joined: Feb 08, 2009
Posts: 606

D. Ogranos wrote:
Sam Mercs wrote:Similar to the collection interface retain all - except that its in reverse. I don't think it would give any better performance. The main bottleneck would be the 'contains' method that iterates over all the elements of 'elems' for each element of the 'first' set. Right?


The main trick is to put one of the lists into a (Hash)Set, so that the contains() method gets constant complexity. It might also be worth it to check which list is shorter and then put one or the other into the Set...I haven't tried around with that option.
Also the retainAll() seems to keep the list elements in order, while this method shuffles them around. So sorting may be neccessary.



The OP also does the same thing,

final Set<T> commonSet = new HashSet<T>(smallestList);
commonSet.retainAll(biggestList);

I don't quite understand why your implementation would give a better performance compared to the OP - as I think they do exactly the same thing. Maybe I should try running the code to figure it out. Will post if I do....

 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: How to do better than the retainAll method of the List or Set class?