Win a copy of Svelte and Sapper in Action this week in the JavaScript forum!
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
• Campbell Ritchie
• Ron McLeod
• Paul Clapham
• Bear Bibeault
• Junilu Lacar
Sheriffs:
• Jeanne Boyarsky
• Tim Cooke
• Henry Wong
Saloon Keepers:
• Tim Moores
• Stephan van Hulst
• Tim Holloway
• salvin francis
• Frits Walraven
Bartenders:
• Scott Selikoff
• Piet Souris
• Carey Brown

# Iterator better for that?

Ranch Hand
Posts: 580
I have two lists listA and listB,
which contains the same kind of objects.

The size of A is shorter than the size of B.

Now, I want to know, if there are elements in A which also exists in B. (If so, at first store these elements in listC. After the loop, delete this elements in listA):

What are the differences between these two methods:

I guess, there is no difference, if I loop from listA or listB at first:

for example:

listA size=6000
listB size=15000

so the contains-Method must loop through 6000*15000 (=15000*6000) loops.

Is this best practice or should I use the Iterator?
[ October 27, 2008: Message edited by: nimo frey ]

Saloon Keeper
Posts: 22485
151
The actual math is a little more complicated than that. First, the number of passes to check whether the item from one list is in the other list is probably going to be an average of half the size of the list being searched. Secondly, if you delete items from that list, you reduce the size of the search.

So a little statistical, a little bit arithmetical.

If you can get the search target ordered however, you can speed things up immensely, since an ordered list can be searched using a binary algorithm, reducing the average search time exponentially.

When both lists are unordered, I can envision modifying one of the stock sort algorithms so that when you have the case of 2 equal entries, the end-result will delete the existing one and refuse to add the duplicate. However, if you're using a heaping type of sort, that will require some extra tree-balancing, so it's a little tricky.

nimo frey
Ranch Hand
Posts: 580

if you delete items from that list, you reduce the size of the search

So I use the Iterator, to be able to delete while I am in the loop (to reduce the count of the next loop). Am I right?

If you can get the search target ordered however, you can speed things up immensely

I have a ArrayList..to make at first the lister ordered costs not more than leaving it?

Ranch Hand
Posts: 96
nimo, careful with your 'remove' methods in the Collections framework. You used

listA.remove(listC);

The remove method removes the listC from your collection. I think you want to remove all of the elements contained in listC from your collection. In this case use

[ October 28, 2008: Message edited by: Paul Beckett ]

Paul Beckett
Ranch Hand
Posts: 96
I think either of the two methods you've described can be replaced with:

This is much easier to understand than trying to write your own version of the method.

nimo frey
Ranch Hand
Posts: 580
oh yes thanks, I have used removeAll()..only typed here false.

Rancher
Posts: 3626
40
Since you're posting this in Performance, I expect you can get much better performance if you use a HashSet instead of (or in addition to) the Lists. Fundamentally, the algorithms described so far are all O(Na * Nb), while with a HashSet you can get O(Na + Nb).
[ October 28, 2008: Message edited by: Mike Simmons ]

Rancher
Posts: 4686
7

Originally posted by Mike Simmons:
while with a HashSet you can get O(Na + Nb).

Which by definition is the same as O(Na) for suitable constants.

But in general, a HashSet/HashMap should be more of O(1) rather than O(n)

Mike Simmons
Rancher
Posts: 3626
40
[Pat]: Which by definition is the same as O(Na) for suitable constants.

Well, we don't know whether Na or Nb is larger - they may both be significant. It's premature to write off either one as a constant, I think - see below.

[Pat]: But in general, a HashSet/HashMap should be more of O(1) rather than O(n)

Basic operations like add(), remove(), contains() are O(1). But operations like addAll() and removeAll() are obviously O(N). And of course, calling add(), contains() or remove() N times would be O(N).

The point is, if Na and Nb are of comparable size (call them both N), then the solutions given in this thread are O(N^2), while a HashSet will be O(N). Big win.
[ October 28, 2008: Message edited by: Mike Simmons ]

Ranch Hand
Posts: 1324
so which method do you guys think has better performance when processing large number of complex objects(e.g. Value Object/Data Transfer Object)?

Mike Simmons
Rancher
Posts: 3626
40
Using a HashSet, definitely.

Pat Farrell
Rancher
Posts: 4686
7

Originally posted by Mike Simmons:
[Pat]: Which by definition is the same as O(Na) for suitable constants.

Well, we don't know whether Na or Nb is larger - they may both be significant. It's premature to write off either one as a constant,

No, O(50) == O(500)

Look at the definition of big Oh.

Mike Simmons
Rancher
Posts: 3626
40
But neither Na nor Nb is necessarily a constant. Naive definitions that assume there's only one variable will not help much here.

For non-HashSet solutions the performance is O(Na * Nb), and it makes a huge difference whether we vary the two quantities independently or not. So it would be a disservice to simplify this to either O(N) or O(N^2) - both are possible, as are some in-between solutions.

For the HashSet solution the performance is O(Na + Nb). Yes, we could call this O(N) I suppose, but then the question arises how does that N compare to the Na and Nb used in the previous statement about non-HashSet solutions. For me, it seemed much clearer to use comparable notation for both: one is O(Na + Nb), and the other is O(Na * Nb).

 We must storm this mad man's lab and destroy his villanous bomb! Are you with me tiny ad? Building a Better World in your Backyard by Paul Wheaton and Shawn Klassen-Koop https://coderanch.com/wiki/718759/books/Building-World-Backyard-Paul-Wheaton