Win a copy of Think Java: How to Think Like a Computer Scientist this week in the Java in General forum!

# Linear comparison algorithm

Ian Mcloud
Ranch Hand
Posts: 76
I am putting together a program that will compare the elements in several arrays, with one array holding values entered by the user, and then store all common elements in a final array. When all is said and done, the elements in Comparable[] common will print to the screen, along with the number of comparison that were made. I am trying to create a linear algorithm, so bigO(n). I want to avoid quadratic and so on. Here is what I have thus far:

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:I am putting together a program that will compare the elements in several arrays, with one array holding values entered by the user, and then store all common elements in a final array.

OK. Confusion. What is this: a comparison, or a selection algorithm?

Specifically, what are:
• one array holding values entered by the user (I thought you had several arrays for comparison).
• and
• store all common elements in a final array (doesn't sound like a comparison to me).

• My suggestion: back up, and explain step by step exactly what you want to do - and note: WHAT you want to do, not HOW you want to do it.

Sounds to me like you've rushed to a solution without properly defining the problem.

Winston

Ian Mcloud
Ranch Hand
Posts: 76
Sorry about that. I have comparisons on my mind. Yes, it's a selections algorithm. Basically, I want the user to be able to enter values into an array, which is then compared with the three arrays I have previously established. So, {A, B, C } is compared with say {A, F, G} since A is common to both, it is stored in "common." There will naturally be a certain amount of comparisons that have to be made since I need to traverse each element in each array until I find all elements in common. I can put together a binary search, but I want something linear. I'm also not looking for a complete solution from anyone on here, just an idea if I'm on the right track and where I should be going from here. I was thinking of something like this, though it is not my code and will have to be modified for my program:

Ian Mcloud
Ranch Hand
Posts: 76
It would probably be easier to compare the three arrays I already have, but I want the user to be able to enter his/her own values, which would then be compared against them.

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:It would probably be easier to compare the three arrays I already have, but I want the user to be able to enter his/her own values, which would then be compared against them.

OK, but that is simply an addition to the arrays you already have...right?

If so, then the problem is this (and correct me if I'm wrong):
You want to compare the elements of a number of lists and create a "result" list that includes all the elements that are common to all of them.

Note that I used the word 'list', not 'array'. When you're describing a problem it's best to stick to English. How you translate it to Java is a separate issue - and there are MANY ways to hold lists of values.

Question: What if one or more of your lists contain duplicate elements?

Winston

Ian Mcloud
Ranch Hand
Posts: 76
You are correct.

Ian Mcloud
Ranch Hand
Posts: 76
I want multiple copies if duplicates.

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:I want multiple copies if duplicates.

Right. That makes things slightly more complicated (but only slightly).

Now: You want an algorithm that is "linear". What do you mean by that? You said O(n), but is that n the number of elements or the number of lists? The fact is that you're going to have to check every element in order to include/exclude it, so the best you're going to get is n = #elements; And I suspect it will actually be O(#arrays + #elements) (which is still O(n) by that notation). And your requirement for duplicates will slow things down a bit more.

Question: Are you only allowed to use arrays for your solution, or can you use any class available?

If the latter, I suggest that you look at the LinkedHashMap class, and consider creating a LinkedHashMap of elements as the key, and a "list of counts" (1 count per input array) as the value.

Winston

Ian Mcloud
Ranch Hand
Posts: 76
I can use any object of type Comparable. Big O(n) is worst case scenario. If there are nested loops I might have n^2 or even n^3. So of there are 5 sets with 10 elements then I would have 10(5-1). If its squared, I would have (5-1) * 10^2 That's 50 comparisons for linear and 500 for quadratic.

Ian Mcloud
Ranch Hand
Posts: 76
Sorry I'm typing from my phone.

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:I can use any object of type Comparable. Big O(n) is worst case scenario. If there are nested loops I might have n^2 or even n^3. So of there are 5 sets with 10 elements then I would have 10(5-1). If its squared, I would have (5-1) * 10^2 That's 50 comparisons for linear and 500 for quadratic.

I don't see any reason why it should be quadratic, but you need to count each element, so the absolute minimum it's going to be is
O(#elements)
The only wrinkle is that because you want to keep track of duplicates you need to keep a count for each list/array, so processing that is going to take
O(#counts)
and the fastest way to do that is to keep a count for each array even if it never gets used, so your overall best time is likely to be
O(#elements + (#distinct elements * #lists))
which is still O(n) if n == #elements, because there is no exponent involved.

The main issue you have is that without a "hashed" collection of some description, your "count this element" action will take at least O(log(n)) time (because you have to find it), so without one, the best you can hope for is O(n * log(n)).

Winston

Ian Mcloud
Ranch Hand
Posts: 76
That makes sense actually

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:That makes sense actually

Blimey. I obviously haven't had enough beer...

Winston

Ian Mcloud
Ranch Hand
Posts: 76
Someone else mentioned using a hash table, but I've been told this can be done using a linear search. Unfortunately, I cannot use a hash table as that is outside the scope of the assignment.

Ian Mcloud
Ranch Hand
Posts: 76
What if I sort them first?

Mike Simmons
Ranch Hand
Posts: 3076
14
Then you could do a linear comparison. However, the sort itself would not be linear, so I don't think it's fair to call the overall algorithm linear.

Ian Mcloud
Ranch Hand
Posts: 76
Well if I ensure the static lists are already ordered and I prompt the user to enter values in ascending order, and surround the code with try catch blocks in case they enter it incorrectly, maybe it will work

Ian Mcloud
Ranch Hand
Posts: 76
The pre generated lists that is

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:The pre generated lists that is...

Yeah, but that's kinda cheating (though possibly a reasonable solution ) because all you're doing is offsetting some of the cost of the overall process.

Winston

Ian Mcloud
Ranch Hand
Posts: 76
I know. Lmao, but it seems like there are few alternatives

Ian Mcloud
Ranch Hand
Posts: 76
Nevermind, I don't need to take into account multiple occurrences of the same value(s).

Ian Mcloud
Ranch Hand
Posts: 76
I made some modifications. Now all I'm concerned with is getting the actual results. Frustrated beyond all belief right now.

Output:

Please enter the number of collections you would like to compare: 2

The elements in the collections were compared 0 times with [Ljava.lang.Comparable;@1256ea2 as common elements:

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:I made some modifications. Now all I'm concerned with is getting the actual results. Frustrated beyond all belief right now.

Right, well the reason for that is that you're trying to code your way out of a jam, and that will almost never work; so the first thing to do is to StopCoding (←click).

You're also making thing very difficult for yourself by
1. Mixing and matching object types.
2. Using only arrays.
3. Storing each comparison array separately.
4. Making your "result" array Comparable - indeed, based on what you've got so far, that may not even be possible.

The Comparable interface is meant for comparing like objects (or at least objects from the same hierarchy), but you're mixing numbers in with Strings and furthermore using '==' to do the comparison, which will never work (see the AvoidTheEqualityOperator page).

Furthermore:
• 2 will never compare equal to "2" unless you write a Comparator that forces it to be so - and if you do so you'd better be darn sure you get it right.
• Once Comparators are involved, HashMaps are pretty much out of the window, unless you fancy rolling your own; and that will have a major effect on speed.

• My advice: Stop Coding (see the link above), and write out exactly what you want to do in English and in detail. I'd also suggest that you start out with sets of objects of the same type (eg, Strings), and get that working first.

Winston

Ian Mcloud
Ranch Hand
Posts: 76

Ian Mcloud
Ranch Hand
Posts: 76
I'm getting a return for the number of comparisons, but it's always the same value. I'm also still having trouble getting the common elements among the sets. Getting closer though.

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:I'm getting a return for the number of comparisons, but it's always the same value. I'm also still having trouble getting the common elements among the sets. Getting closer though.

OK,

1. Please DontWriteLongLines. I probably forgot to mention this last time, but it makes your thread very hard to read. I've broken yours up (again).
2. You're still using '==' to compare. AvoidTheEqualityOperator.
3. You've still got individually defined collections, which makes your code very brittle. Suppose you want to add another one some day?

Are you familiar with the Collections API? Because there are a ton of structures that would be a lot better than arrays for doing this. And if you don't need to count duplicates, the best one of all for your "result" would be a HashSet (or LinkedHashSet). That way you don't need to do any comparisons at all.

And BTW, you still have a Comparable[]. Why have you defined it that way?

Winston

Ian Mcloud
Ranch Hand
Posts: 76
Alright, I went to the proverbial drawing board and redid the entire thing.
I realized there are two things I was trying to do, but I was simply not implementing them at all.
For starters, I want to give the user the option to choose how many comparisons they want
to make. They should also be able to enter their own elements.

Also, with long lines are you referring to my text or the code?

Ian Mcloud
Ranch Hand
Posts: 76
By the way, I cannot use Hash tables. More specifically: Each collection will be represented as an array of objects of type Comparable.

Ian Mcloud
Ranch Hand
Posts: 76
This is actually fairly close, because I'm getting the common elements and the number of comparisons, but only for each index. i.e. index 0 in array1 = index 0 in array2.
It is not, however, checking to see if index 1 in array 1 = index 0 in array2. So it's not iterating through each and every index.

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:By the way, I cannot use Hash tables. More specifically: Each collection will be represented as an array of objects of type Comparable.

Fair enough, but I hope you've abandoned any idea of trying to make it linear then, for the reasons that Mike gave you. I think you're still concentrating far too much on the code rather than the problem.

Try this: Switch off your computer and get a deck of playing cards. Shuffle it, and arrange them in sets of lines face down. You don't have to use all 52 cards; in fact, it's much better if you don't (I'd suggest not using more than about half the deck). These are your "input arrays".

Now go through the steps required to process the cards from each of those lines so that you end up with a single line of face up cards containing one card of each value (important: suit doesn't matter, so 7♠ == 7♥).
"Processing" a card involves turning it over, looking at its value, and either moving it to the "result" line or leaving it where it is.
That's basically pretty close to the problem you're trying to solve.

Try it several times, with different numbers of cards and input lines, and see if you can see a pattern emerging. When you do, write it down in English.

My other tip: Get rid of all that input and println() logic from your findCommonElements() method.
That method should be concerned with one thing, and one thing only:
• Take a bunch of input arrays, and return a "result" array containing a single copy of each distinct element in your input ones.
• How those arrays are created, or what happens to the output are none of its business, and are simply cluttering things up at the moment.

That method is the "guts" of your program, so you need to make sure that it's as simple and clear as possible.

HIH

Winston

Mike Simmons
Ranch Hand
Posts: 3076
14
Ian Burres wrote:By the way, I cannot use Hash tables. More specifically: Each collection will be represented as an array of objects of type Comparable.

To be fair, nothing in that red sentence prevents you from using hash tables. If there's a separate requirement that you can't use hash tables, fine - but the sentence you quoted doesn't appear to have anything to do with that.

Ian Mcloud
Ranch Hand
Posts: 76
True Mike. I've gone back to the drawing board. I now have two classes; a driver and the CommonElements class. I'm using predeclared arrays and the findCommonElements method does one thing and thing only. I'll post it shortly. Thanks guys

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:True Mike. I've gone back to the drawing board. I now have two classes; a driver and the CommonElements class. I'm using predeclared arrays and the findCommonElements method does one thing and thing only. I'll post it shortly. Thanks guys

Sounds to me like you're not listening, because you're still preoccupied with how you're going to do this; and you're still banging out code as though it's going to solve your problem - IT ISN'T.

For the last time: To solve this problem you need to STOP CODING (←click) and think.

and the findCommonElements method does one thing and thing only

OK, I suggest you show us that. I expect it not to have a single statement that includes "System.out" or a Scanner method call.

Winston

Campbell Ritchie
Sheriff
Posts: 48917
58
I am moving this discussion as too difficult for “Beginning Java”.

Ian Mcloud
Ranch Hand
Posts: 76
Lol. Not a problem. I guess it is a little more of an intermediate level problem.

Alright, here is what I have designed. The algorithm in findCommonElements is correct, regardless of BigO runtime; however, I am still stuck on the Comparable common, as I cannot figure out how to assign to it the common elements in object1 and object 2.

First class is my Driver/test harness...

And this is the CommonElements class:

My problem exists where I have input my comments, or so I believe.

Ian Mcloud
Ranch Hand
Posts: 76
I took out the static and changed the setter to this.comparisons = comparisons;

Ian Mcloud
Ranch Hand
Posts: 76
Alright, I'm almost there. Now why in the world am I getting this:

The common elements in the arrays are:
[[1, 2, 3, 4, 16]]
[Ljava.lang.Comparable;@a32b
at CommonElements.getIndexOfLowest(CommonElements.java:30)
at CommonElements.findCommonElements(CommonElements.java:46)
at Test.main(Test.java:16)

Ian Mcloud
Ranch Hand
Posts: 76
Nevermind, figured it out... So here is the final bit. I have the proper elements in the array, but I am not getting the correct value for comparisons. Here is the new driver:

Ian Mcloud
Ranch Hand
Posts: 76
Still using the same CommonElements class from above.

Winston Gutkowski
Bartender
Posts: 10417
63
Ian Burres wrote:I took out the static and changed the setter to this.comparisons = comparisons;

<sigh>
</sigh>

However, since you refuse to stop beating this horse:
public Comparable[] findCommonElements (Object[] collections) {
is just plain wrong.
1. Assuming that your method actually does something with collections (which it doesn't), you can't create a Comparable[] from an Object[] without a lot of unsafe casts.
2. The signature doesn't gel with what you want to do, unless you've thrown all your elements to be compared into a single array.

What you've done is to write a method that doesn't jive with its signature; and that's because you're processing a bunch of stuff that is defined outside the method. Now, there's nothing actually illegal with that, but it does increase coupling; and it's very confusing for anyone reading your program.

Questions:
1. What is the parameter 'collections' for? You never use it - indeed, as far as I can see, it is never set to anything in the entire program.
2. What is the findCommonElements() method supposed to do? And don't just re-iterate what you've told us you want it to do; try and explain how it's supposed to work, step by step, in English. For example: what is the maximum possible size of the array that it returns?

Winston