This week's book giveaway is in the OCPJP forum. We're giving away four copies of OCA/OCP Java SE 7 Programmer I & II Study Guide and have Kathy Sierra & Bert Bates on-line! See this thread for details.

I'm trying to write a fast algorithm that will locate two duplicate values in an array of arbitrary size. I've written the following code, in which an array is created, then filled with values equal to each index; then one index is made equal to another index. That's simple enough.

The problem is that I can't see an easier way of locating the duplicate entry except by looping through the entire array, comparing each index with every other index until the match is found. Is there a more clever way of doing this?

(The values in the array could be *ANYTHING*; I'm just using the indices as the values for convenience here.)

This is one reason why so much effort has been spent to create sorting algorithms and ways of keeping lists sorted while they are built -- there are various ways to search a sorted list faster.

First of all, j only needs to start at i + 1. After all, if i == 5 then you've already compared it to element 1 when i was 1.

You could sort the array first, then you need just one loop. Because they are sorted equal elements are located next to each other. Let your loop end one index earlier (i < x.length - 1) so you can compare x[i] and x[i + 1].

Copying the array, sorting with a merge sort (or quicksort for primitives) and searching for duplicates will run in O(n log n) time, whereas repeated linear searches are more-or-less equivalent to bubble sort, which runs in quadratic time. So I like Rob's suggestion.

Another way to do it is to design a binary tree and use it as a sorted set. Add all the elements in the array to it. Get it to return true or false depending whether the element has been added; returning false suggests you have hit a duplicate. That will also run in O(n log n) time. Or you can look at the Map pages in the Java™ Tutorials; there is an example about counting words, which you can adapt to the present problem. Anything which returns a count of 2 is a duplicate, triplicates will return 3, etc. Adding to a Map runs in amortised constant time, so the whole procedure will run in linear time.

All these suggestions depend on the elements not changing their state during the counting process, otherwise they will be counted wrongly, or (in a Map) will disappear mysteriously, never to be found again. (Unless they change their state back, when they will just as mysteriously reappear.)

With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.