• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

ArrayList

 
Greenhorn
Posts: 9
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Dear Sir,

How to count duplicate entries in an ArrayList or Vector.

Regards joby
 
Ranch Hand
Posts: 961
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Use the java.util.Collections.frequency() method.
 
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Edwin Dalorzo:
Use the java.util.Collections.frequency() method.



Edwin,

I haven't found this method in this class..

Is this is not is JDK 1.4

You must be talking about JDK 1.5
 
Juhi Gupta
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by francis joby:
Dear Sir,

How to count duplicate entries in an ArrayList or Vector.

Regards joby



Infact you can check by this simple program if there is any duplicate entries in Vector etc..



Hope it helps.....

Or if you want to count duplicate entries, then you may use of Iterator.
 
Ranch Hand
Posts: 1252
Spring Java Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Good Method Juhi,

But it's not upto expectations.

Dear francis joby,

You may use following code.



Hope it helps.
[ October 04, 2006: Message edited by: Ankur Sharma ]
 
Edwin Dalorzo
Ranch Hand
Posts: 961
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You overlooked the possibitly of null items.

A JDK 1.4 implementation of the frequency() method developed by Sun in JDK 1.5 would be

 
Shaan Shar
Ranch Hand
Posts: 1252
Spring Java Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well Good point, I just forgot,

But I just want to bring one more thing to notice,

We may also need to override the equals method for a custom object. Otherwise it may leads wrong solution.

Am I correct.??
 
Shaan Shar
Ranch Hand
Posts: 1252
Spring Java Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Let me show one example to prove my earlier thread.



Now this will give output 0..

So that mean you also need to override the equals method.
 
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well Juhi and Ankur,

Your code looks good but why you are concentrating only on String1. If list will have duplicate values other than String1 then your program will not count those..

Hope it is clear.
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Try this program:

 
Shaan Shar
Ranch Hand
Posts: 1252
Spring Java Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well Dear Rathi,

What you expect output of your program, could you please let us know the explanations.

I think the output should be 6 but this program is giving 4 as output.
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Ankur Sharma:
Well Dear Rathi,

What you expect output of your program, could you please let us know the explanations.

I think the output should be 6 but this program is giving 4 as output.



Yup. It will count the number of duplicate elements (as OP specified).

Why it should be 6??? There are only 4 duplicate elements: String1, String2, String3, String4.
 
Edwin Dalorzo
Ranch Hand
Posts: 961
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Do you realize that you just wrote 3 methods and around 60 lines of code just to determine if there is a duplicate item in a List?

I would still stick to the Sun implementation of the frequency method. It is just 18 lines of code and it does the work.

I also liked the use of indexOf() and lastIndexOf() of a previous post. It may even be faster to determine duplicates on large lists.

It is well known that items used in Collection should implement equals() and hashCode(). This should not be issue. And on fulfilling this requirement the methods should work on any object, not just String as seem to suggest a previous post.
[ October 05, 2006: Message edited by: Edwin Dalorzo ]
 
Ranch Hand
Posts: 94
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Edwin Dalorzo:
[QB]Do you realize that you just wrote 3 methods and around 60 lines of code just to determine if there is a duplicate item in a List?

I would still stick to the Sun implementation of the frequency method. It is just 18 lines of code and it does the work.



It is called learning how to program.
 
Shaan Shar
Ranch Hand
Posts: 1252
Spring Java Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Robert Hill:


It is called learning how to program.



Originally posted by Edwin Dalorzo:
Do you realize that you just wrote 3 methods and around 60 lines of code just to determine if there is a duplicate item in a List?

I would still stick to the Sun implementation of the frequency method. It is just 18 lines of code and it does the work.



Well Edwin,

As we already mentioned we are not aware of JDk1.5 so I wrote my own method and same done by Rathi, I would also like to use Sun's method Frequency, but what if We don't have it. ???
 
Edwin Dalorzo
Ranch Hand
Posts: 961
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator


As we already mentioned we are not aware of JDk1.5 so I wrote my own method and same done by Rathi, I would also like to use Sun's method Frequency, but what if We don't have it. ???



Go up in the posts, you will see that there is already one post offering a JDK1.4 implementation of the frequency algorithm. After all it is not a state secret. It is in the src.zip file that comes with the JDK.

Look, I know we programmers take a lot of pride of our code, and I apologize to all of you if somehow I offended anybody when I criticized one of the previously posted algorithms.

I am fairly new at Java, although I have accumulated good time as a programmer. Recently I read Effective Java (by Joshua Bloch) for the first time and I realized of all those algorithms I used to take pride of that were not so good at all compare to all those nifty ideas printed in the book. I and felt happy that somebody wrote that book to teach me how to do things better.

I probably couldn't tie Mr. Bloch shoes, and I may know less Java than many of you, but I do know we programmers like to complicate things more than necessary and we do love reinventing the wheel. We tend to be creative people after all. That's good. But we also have to develop the skill of simplicity and the ability to recognize that simple is better, is less cost, less money, less bugs, less maintenance, it is easy to read and easy to understand.

It simply appears, to my naked eye, that the posted algorithm was much more complicated than the ones previously posted. That�s all. My comments never had the intention to hurt any body's pride. I just want to say: why don't we try to make this simpler? What's the point in posting an algorithm that does the same than several others previously posted, but in a far more complicated way? Are we not misleading the person who posted the question?

That's something our bosses ask us as technologists every day? Why do you want to migrate to a more complicated more expensive database? Why do you want to remake a legacy system that is already working pretty well after years of IT evolution? And when they ask, we better have a good answer or we will get nothing of their money to buy those nice 21st century toys that everybody is deploying on their companies.

Again, I apologize if you found my words offensive.

[ October 05, 2006: Message edited by: Edwin Dalorzo ]
[ October 05, 2006: Message edited by: Edwin Dalorzo ]
 
author and iconoclast
Posts: 24207
46
Mac OS X Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Also, note that all the algorithms so far discussed have O(n^2) performance, which is really unacceptable for large lists. Here's a vastly faster (and vastly simpler and vastly shorter) O(N) algorithm:



The folks who provided longer, slower versions might want to read this.
 
(instanceof Sidekick)
Posts: 8791
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I very much like Ernest's use of the features already in the Collections. The OP wanted a count of duplicates ... I wondered if Set.addAll(someCollection) and checking the size difference wouldn't do it.
 
Edwin Dalorzo
Ranch Hand
Posts: 961
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I did the following implementation of the Ernest's algorithm:



I run it on BeanShell and with a List of 10.000 items all of them, but two, containing new String("Obi-wan") and the other two, randomly located containing the new String("Anakin") and it took an average of:

120 ms with the duplicates check
170 ms without the duplicates check.

The same algorithm using the JDK frequency method and run on BeanShell several times takes an avergage of

341 ms with the duplicates check.

Like this:



I think Ernest has a good point.
 
Ranch Hand
Posts: 1078
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Stan James:
I very much like Ernest's use of the features already in the Collections. The OP wanted a count of duplicates ... I wondered if Set.addAll(someCollection) and checking the size difference wouldn't do it.



Unfortunately what "count" means seems ambiguous. Is it the number of duplicate entries for one item? The number of items that occur one or more times? The sum of all the entries that have a duplicate? Am I missing something here or are we tossing around solutions without any concrete requirements?
 
Ernest Friedman-Hill
author and iconoclast
Posts: 24207
46
Mac OS X Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Ken Blair:


Unfortunately what "count" means seems ambiguous. Is it the number of duplicate entries for one item? The number of items that occur one or more times? The sum of all the entries that have a duplicate? Am I missing something here or are we tossing around solutions without any concrete requirements?

 
Ernest Friedman-Hill
author and iconoclast
Posts: 24207
46
Mac OS X Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Ken Blair:


Unfortunately what "count" means seems ambiguous. Is it the number of duplicate entries for one item? The number of items that occur one or more times? The sum of all the entries that have a duplicate? Am I missing something here or are we tossing around solutions without any concrete requirements?



Yeah, since there was already some argument going on, I didn't want to muddy the waters even more; but that's why the body of my loop is a comment. Depending on what you were looking for, you could do different things.

As Stan alludes to, if the requirement is actually "if you discarded duplicates, how many items would you discard?" you could just write

int duplicateCount = theList.size() - new HashSet(theList).size();
 
Stan James
(instanceof Sidekick)
Posts: 8791
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Requirements area PITA, no? Life would be so much easier if our customers would just accept what we give them. Or learn to disambiguate. And never change their minds.
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Edwin Dalorzo:


Go up in the posts, you will see that there is already one post offering a JDK1.4 implementation of the frequency algorithm. After all it is not a state secret. It is in the src.zip file that comes with the JDK.

Look, I know we programmers take a lot of pride of our code, and I apologize to all of you if somehow I offended anybody when I criticized one of the previously posted algorithms.

I am fairly new at Java, although I have accumulated good time as a programmer. Recently I read Effective Java (by Joshua Bloch) for the first time and I realized of all those algorithms I used to take pride of that were not so good at all compare to all those nifty ideas printed in the book. I and felt happy that somebody wrote that book to teach me how to do things better.

I probably couldn't tie Mr. Bloch shoes, and I may know less Java than many of you, but I do know we programmers like to complicate things more than necessary and we do love reinventing the wheel. We tend to be creative people after all. That's good. But we also have to develop the skill of simplicity and the ability to recognize that simple is better, is less cost, less money, less bugs, less maintenance, it is easy to read and easy to understand.

It simply appears, to my naked eye, that the posted algorithm was much more complicated than the ones previously posted. That�s all. My comments never had the intention to hurt any body's pride. I just want to say: why don't we try to make this simpler? What's the point in posting an algorithm that does the same than several others previously posted, but in a far more complicated way? Are we not misleading the person who posted the question?

That's something our bosses ask us as technologists every day? Why do you want to migrate to a more complicated more expensive database? Why do you want to remake a legacy system that is already working pretty well after years of IT evolution? And when they ask, we better have a good answer or we will get nothing of their money to buy those nice 21st century toys that everybody is deploying on their companies.

Again, I apologize if you found my words offensive.

[ October 05, 2006: Message edited by: Edwin Dalorzo ]

[ October 05, 2006: Message edited by: Edwin Dalorzo ]




Not at all any hurt Edwin. In fact, I am thankful to you for pointing this.

I knew that my code is not optimized as I was not using (because not knowing) already built efficient APIs but let me check others� code�s output and efficiency (time).
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Edwin Dalorzo:
Do you realize that you just wrote 3 methods and around 60 lines of code just to determine if there is a duplicate item in a List?



Well, not exactly. My code is not just for determining 'is there any duplicate item present or not'. It actually counts the number of items which are appearing more than once.
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Edwin Dalorzo:
I did the following implementation of the Ernest's algorithm:



I run it on BeanShell and with a List of 10.000 items all of them, but two, containing new String("Obi-wan") and the other two, randomly located containing the new String("Anakin") and it took an average of:

120 ms with the duplicates check
170 ms without the duplicates check.

The same algorithm using the JDK frequency method and run on BeanShell several times takes an avergage of

341 ms with the duplicates check.

Like this:



I think Ernest has a good point.



Hi Edwin,

Did you run my code? How much time it is taking?

Anyway, the following program finds the time taken for all three algorithms for the same list of 100000 items.



And the sample outputs are:



Please let me know if I am doing something wrong.

It shows that my algorithm is taking far less time than other two.

Please comments.
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
One more thing is, other two algorithms are not giving the correct (?) output.

I think, OP wants to count the number of duplicate items or the number of items which are appearing more than once.

For example, list is having following items: a, b, c, a, a, b. So the output should be 2 as only two items are duplicate (a and b). But in case of other two algorithms, the output comes 3 (I don�t know by what logic).

But I might be wrong in understanding the requirement.
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Ernest Friedman-Hill:
Also, note that all the algorithms so far discussed have O(n^2) performance, which is really unacceptable for large lists. Here's a vastly faster (and vastly simpler and vastly shorter) O(N) algorithm:



EFH, can you please post any good link which explain how to find performance (O(n^2) and O(n) etc) for any algorithm, in a simple language.

 
Shaan Shar
Ranch Hand
Posts: 1252
Spring Java Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by rathi ji:


EFH, can you please post any good link which explain how to find performance (O(n^2) and O(n) etc) for any algorithm, in a simple language.



Rathi,

You must look into this Algorithm Complexity.

Hope you may find it good.
 
Shaan Shar
Ranch Hand
Posts: 1252
Spring Java Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Ernest Friedman-Hill:
Also, note that all the algorithms so far discussed have O(n^2) performance, which is really unacceptable for large lists. Here's a vastly faster (and vastly simpler and vastly shorter) O(N) algorithm:



The folks who provided longer, slower versions might want to read this.



Well I like this code ...Thankx EFH,

Now I can implement this one in JDK 1.4. very easily..
 
You ought to ventilate your mind and let the cobwebs out of it. Use this cup to catch the tiny ads:
a bit of art, as a gift, that will fit in a stocking
https://gardener-gift.com
reply
    Bookmark Topic Watch Topic
  • New Topic