GeeCON Prague 2014*
The moose likes Beginning Java and the fly likes Check List for duplicate entries Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


JavaRanch » Java Forums » Java » Beginning Java
Bookmark "Check List for duplicate entries" Watch "Check List for duplicate entries" New topic
Author

Check List for duplicate entries

Qualtar Demix
Greenhorn

Joined: Mar 14, 2012
Posts: 15
To check if a List has duplicate entries i convert it to HashSet and compare the size for any mismatch. Do you guys have any better approach???!!!


Qualtar Demix
....and I sensed an infinite scream passing through nature
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7892
    
  21

Qualtar Demix wrote:To check if a List has duplicate entries i convert it to HashSet and compare the size for any mismatch. Do you guys have any better approach???!!!

Well, another one would be to sort it and then go through it to see if there are any duplicates. If there are, they will now be adjacent to each other.

For large lists it will probably take longer, but it won't use any extra space (unless you have to keep the original List).

Winston


Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
Stephan van Hulst
Bartender

Joined: Sep 20, 2010
Posts: 3647
    
  16

I actually always use Qualtar's approach, because it has a very good time-complexity, and because I often work with Collections rather than Lists as input.

A solution that is slightly more optimal would be the following:
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7892
    
  21

Stephan van Hulst wrote:I actually always use Qualtar's approach, because it has a very good time-complexity...

Quite agree. I just thought I'd offer an alternative.

Your way is pretty much the way I'd do it; except if you're really interested in speed, it would probably be better to do:
Set<Object> set = new HashSet<>(c.size());

Winston
Qualtar Demix
Greenhorn

Joined: Mar 14, 2012
Posts: 15
Hmm. So, looking at all the approaches it would be safe to say that no-matter-what (or how for that sake), one has to convert it to Set one way or the other. Hence, the worst cast time complexity will always be O(n). Guess i have to close in on O(n).
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7892
    
  21

Qualtar Demix wrote:Hmm. So, looking at all the approaches it would be safe to say that no-matter-what (or how for that sake), one has to convert it to Set one way or the other.

Not if you use the method I suggested above.

Hence, the worst cast time complexity will always be O(n). Guess i have to close in on O(n).

True. Mind you, that only gives you an indication of scalability. The sorting approach probably works out at O(n log(n)), but for small lists (< 20 items?) it may actually be quicker than forcing elements into a Set. However, the only way to work that out would be to benchmark it.

It should also be added that it will only be O(n) if
(a) the Set used is a hashed Set, and
(b) your elements have good hashes.

Winston
Stephan van Hulst
Bartender

Joined: Sep 20, 2010
Posts: 3647
    
  16

Winston Gutkowski wrote:Quite agree. I just thought I'd offer an alternative.

Your way is pretty much the way I'd do it; except if you're really interested in speed, it would probably be better to do:
Set<Object> set = new HashSet<>(c.size());

Winston


Yeah, I like the sorted list approach if memory is constrained. And I always forget passing the size to the set's constructor . For large collections it should make a large difference.
 
GeeCON Prague 2014
 
subject: Check List for duplicate entries