best performance ?
Joined: Dec 23, 2003
Nov 03, 2010 12:08:22
Assume I have a text file containing ten million phone numbers. The records are unsorted and contain duplicates. Now I want to:
1. list the top 20 most-duplicated phone numbers
2. sort it
3. list the duplicate frequency, e.g. that one phone number occurs 200 times.
Which approach has the best performance? A database is not an option.
Author and all-around good cowpoke
Joined: Mar 22, 2000
Nov 03, 2010 12:48:09
1. Devise a way to turn the text of a phone number into a Java primitive, probably a long.
2. Scan the list, adding the derived longs to a long array.
3. Sort the array.
The remainder should be obvious.
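
For anyone following along, here is a minimal sketch of those steps. It assumes one phone number per line and that every number fits in a long once the non-digit characters are dropped. The class and method names (PhoneDuplicates, toLong, countRuns, top) are made up for illustration and are not from this thread. After sorting, duplicates sit next to each other, so one pass over the array gives the frequency of each number.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PhoneDuplicates {

    // Hypothetical conversion: keep only the digits and read them as one long.
    // Assumption: at most 18 digits, and leading zeros do not matter.
    static long toLong(String text) {
        long value = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '0' && c <= '9') {
                value = value * 10 + (c - '0');
            }
        }
        return value;
    }

    // Sorts the array in place, then collapses each run of equal values
    // into a pair {number, count}. The result is ordered by number.
    static List<long[]> countRuns(long[] numbers) {
        Arrays.sort(numbers);
        List<long[]> runs = new ArrayList<>();
        int i = 0;
        while (i < numbers.length) {
            int j = i;
            while (j < numbers.length && numbers[j] == numbers[i]) {
                j++;
            }
            runs.add(new long[] {numbers[i], j - i});
            i = j;
        }
        return runs;
    }

    // Returns up to 'limit' runs with the highest counts, most frequent first.
    static List<long[]> top(List<long[]> runs, int limit) {
        List<long[]> sorted = new ArrayList<>(runs);
        sorted.sort((a, b) -> Long.compare(b[1], a[1]));
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    public static void main(String[] args) throws IOException {
        long[] numbers = new long[1 << 20];
        int size = 0;
        // args[0] is the path to the text file, one phone number per line.
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                if (size == numbers.length) {
                    numbers = Arrays.copyOf(numbers, size * 2); // grow the array
                }
                numbers[size++] = toLong(line);
            }
        }
        List<long[]> runs = countRuns(Arrays.copyOf(numbers, size));
        for (long[] run : top(runs, 20)) {
            System.out.println(run[0] + " occurs " + run[1] + " times");
        }
    }
}
```

Ten million longs take roughly 80 MB as a primitive array, which is why this sketch avoids boxing each number into a collection. If leading zeros or non-numeric extensions matter in your data, the conversion in toLong would need a different scheme.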
Joined: Jun 26, 2002
Nov 04, 2010 10:00:33
If it is in a file already you may not need to use
. You could also consider using unix utilities to sort and check for dupes.
All times are in JavaRanch time: GMT-6 in summer, GMT-7 in winter