Win a copy of Re-engineering Legacy Software this week in the Refactoring forum
or Docker in Action in the Cloud/Virtualization forum!

# Fast clustering of number pairs

Salman Jamali
Greenhorn
Posts: 5
Hi everyone.

Foa, this is my first javaranch post.. my apologies for not following any message formats.

I need a little help with writing a simple but efficient in terms of speed (not neccesarily space) algorithm that takes in a list of pairs (Point objects), and returns an ArrayList of ArrayLists of Integers.

Sample Input:
2,4
2,5
3,1
1,6
5,9
39,56

Output:
2,4,5,9
1,3,6
39,56

My problem is a little more complicated (order is important, etc.), but I'll try to extend the solution I am assisted in the core problem here. thanks.

Campbell Ritchie
Sheriff
Posts: 48453
56
Welcome to JavaRanch

Sounds more like a beginner's question.
It is quite easy to make a List of Lists.
You will need to go through the Java Tutorials about collections, and a look at the Scanner class for reading would help. You might need to go through the regular expressions section to find out whether the comma is a meta-character; I can't remember that.

Salman Jamali
Greenhorn
Posts: 5
Actually, i don't need a Scanner.. or regular expressions; i've already computed Point objects, as I wrote earlier. List of List of List of.. so true it's quiet easy, and frankly speaking, it ain't wat I asked for. thanks a lot anyways

fred rosenberger
lowercase baba
Bartender
Posts: 12087
29
so what exactly DO you need help with? I don't see how you get your output from your input. Maybe that's what 'fast clustering' means, but even so, it would help folks greatly if you explained better exactly what you need.

Paul Clapham
Sheriff
Posts: 20771
30
It sort of looks to me like there's a graph whose nodes are integers and whose edges are what Salman calls "Points", namely pairs of integers. And he wants to get all the connected components of that graph expressed in a certain form. There might be more to it than that, though, there's only one example.

Ernest Friedman-Hill
author and iconoclast
Marshal
Posts: 24208
35
• 1
2,4,5,9 are a cluster because all these numbers are connected by pairs. 1, 3, 6 likewise (the pairs 3, 1 and 1, 6 connect all these numbers. 39,56 are alone because they're each just mentioned in that one pair together.

Here's one algorithm, which runs in n ln n time (the time to sort the pairs in order):

After all this, each array element will contain the lowest numbered point in the cluster it belongs to.

Salman Jamali
Greenhorn
Posts: 5
• 1
Originally posted by Ernest Friedman-Hill:
2,4,5,9 are a cluster because all these numbers are connected by pairs. 1, 3, 6 likewise (the pairs 3, 1 and 1, 6 connect all these numbers. 39,56 are alone because they're each just mentioned in that one pair together.

Exactly .. sorry if I wasn't clear earlier. Ernest, I'll try your code in a few hours, thanks a lot!

Ernest Friedman-Hill
author and iconoclast
Marshal
Posts: 24208
35
Originally posted by Salman Jamali:

Ernest, I'll try your code in a few hours, thanks a lot!

I've used this algorithm to find contiguous connected surfaces on a 3D polygonal model (my "pairs" are edges, and my "points" are faces.)

priyanka kusuma
Greenhorn
Posts: 22
hey guys now i am also facing the same problem can you please tell me how to cluster the number pairs here my input should be taken from a text file

Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15208
36
Welcome to the Ranch!

Ernest has explained very clearly above how to do it. Now you just have to write some Java code to implement the algorithm. We're not going to give you a ready-made solution; the Ranch is a place where we like to help you to learn to program in Java, and if we would just give you the solution you would not learn anything.

Please try writing code for this algorithm yourself. If you get stuck somewhere, show us your code an explain where you're stuck, then we'll help you get further with it.

priyanka kusuma
Greenhorn
Posts: 22
hey this is the code i have tried so far

my problem is i am not able to insert the number pair in to another array list if it is not present in the arraylist that is already there

Campbell Ritchie
Sheriff
Posts: 48453
56
priyanka kusuma wrote: . . . my problem is i am not able to insert the number pair in to another array list if it is not present in the arraylist that is already there
What does that mean?
I suggest you go through the code you posted, and work out under which circumstances each of those if blocks will be entered.

I went back to your post add added code tags, which you should always use. It usually makes the code look much better, but here it highlights your inconsistent indentation.

priyanka kusuma
Greenhorn
Posts: 22
hey guys i am sorry actually i am new to java and also new to these forums so i am extremely sorry
and about the code i have posted here is initially i am declaring an array list and have to place the 1st number pair in to the array list from then i have to check whether any of the other number pairs are there or not ,but my problem is when a new number pair comes which is not in the list then i have to store that number pair in another array list which i dont know how to do and also can any one tell me how to take input from a text file which contains number pairs in the form (1,2) because when i am splitting the text it says it only accepts string and my array list is declared as integer.please help me guys its a kind of urgent

Campbell Ritchie
Sheriff
Posts: 48453
56
The only part of that I could understand was “urgent”, a word we never use.

priyanka kusuma
Greenhorn
Posts: 22
hey i dont know how to ask you people my problem it is not about urgent i am just stuck with this code since 1 week and i dont know where i am doing wrong .
you guys just tell me i have the input file
number pairs .txt:
4,29
34,4
1,8
now my output has to be
4,29,34
1,8
and my problem is i am able to store 4,29,34 in one arrray list but i dont know how to insert 1,8 in to another array list with my code 1,8 is also inserted in to the previous array list resulting my output as single cluster with all the number pairs in one array list i.
i am not asking you people to give me the complete code but just tell how to insert the new number pair in another array list

Campbell Ritchie
Sheriff
Posts: 48453
56
How are you supposed to turn 4,29  34,4 into 4,29,34? You can’t do anything until you understand that.

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:hey i dont know how to ask you people my problem it is not about urgent i am just stuck with this code since 1 week and i dont know where i am doing wrong...

Almost certainly because you're coding instead of thinking.

What is it exactly that causes you to decide whether the numbers in a pair belong in a particular List? (or indeed, a new one). Write it out in English (or your native language) and try to think of all the possible scenarios that might arise. If you need to, go out and buy:
• lots of paper
• lots of pencils (and a sharpener)
• an eraser
• because these are the programmer's friend.

And don't write another line of Java code until you completely understand the problem.

Winston

priyanka kusuma
Greenhorn
Posts: 22
Campbell Ritchie wrote:How are you supposed to turn 4,29  34,4 into 4,29,34? You can’t do anything until you understand that.

hey what i mean is if any pair has common numbers then all those pairs must form in to a group and this is the code i have tried so far

but my problem is if my input is:
1 2
2 4
7 8
the expected output is:
1 2 4
7 8
but my output is
1 2 4 7 8
i am unable to store 7,8 in another arraylist

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:hey what i mean is if any pair has common numbers then all those pairs must form in to a group and this is the code i have tried so far...

We understand what you've tried, so there's no point in simply repeating it.

What we've been trying to show you is where it's wrong (or at least where your approach is wrong), and I don't see any evidence that you've taken any advice.

The fact is that if you're given a Point, you need to check every list that you've created to see whether it belongs. It also seems highly likely to me that the Lists you create will be dependent on the order in which you receive your Points, unless you sort them first. Either that, or you will need to make several passes through your list of Points.

So, once again, why don't you back up and explain exactly what it is you want to do, including why you want to do it - is it, for example, what Ernest suggested: a method for finding connected surfaces?

Winston

priyanka kusuma
Greenhorn
Posts: 22
Hey Winston as you said i am explaining my problem from first
Actually what i am doing is i have 110 text files now i have to construct a similarity matrix which shows the number of common elements between 2 files after constructing the similarity matrix i have to consider the maximum matrix value and the respective (row,column) is my number pair similarly i have to find all the (row,col) pairs until all the values in the matrix are 0.Now i have to cluster these pairs i.e, as i said earlier i should group all the pairs having atleast one common element and i have completed my coding till finding out (row,col) pairs and what i need is to form clusters.
I hope now you have got clear idea of my code

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:Hey Winston as you said i am explaining my problem from first
Actually what i am doing is i have 110 text files now i have to construct a similarity matrix which shows the number of common elements between 2 files

No you didn't, because this is the first time you've mentioned a "similarity matrix" (and I have absolutely no idea whether this is related to the sort of similarity matrix described here).

You also haven't explained how you deal with ambiguous results. For example, given:
5,3
1,2
2,3
where does the 3rd pair go?
You will presumably have two lists at the point where you get the 3rd pair, so the result could be either
5,3,2
1,2
or
5,3
1,2,3

One last time: You must explain ALL the rules - precisely - for yourself as well as us.

I hope now you have got clear idea of my code

No. See above.

Winston

priyanka kusuma
Greenhorn
Posts: 22
Hey Winston if my input pairs are
5,3
1,2
2,3
output will be
5,3
1,2 because as it comes to the 3rd pair as both the elements are present it does nothing
and here is the algorithm:
1.Take the 1st pair as input from text file and store it in an arraylist
2.now take the next pair from file and check for these conditions:
1. if both the numbers are not present in the arraylist then create a new arraylist and store these two elements in the new array list
2. if any one of these numbers is present in any of the array list then add the other number to that array list.here we have to check in all the arraylists that have been created
3.Continue step 2 until it reaches the end of the file
and i have 2 issues to get resolved:
1.how to take input two numbers at a time from the text file
2.as i have posted the code earlier i am unable to create new arraylist when both the elements in the pair does not exist in the current arrylist
Now i think this is completely about my work i am doing. I hope you understand now

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:Now i think this is completely about my work i am doing. I hope you understand now

Much better. Now, look at what you just wrote and see how that compares with with your code.

I can tell you for starters: the two are NOT the same.

And this is why you should ALWAYS write out what you want to do in English first.

One thing I will tell you is that you need to check ALL the lists in your list of lists before you decide to create a new one.

Winston

priyanka kusuma
Greenhorn
Posts: 22
hey winston , that is what i am not getting as i posted the code earlier initially i am creating an arrylist and i am able to check that for all the elements but as you have seen my code whenever for loop is taking new pair of elements if it is not present in the array list then both the elemnts are added to the same array.
once you run my code and see then you will understand if this is done then 90% of my work is completed please help me to sort this problem and also i need the few lines of code to take two numbers as input from my text file and for that part this the code i have written :

FileInputStream file = new FileInputStream("E:/clusters.txt");
Scanner inputFile = new Scanner(file);
String line = inputFile.nextLine( );
final String[] array = line.split(" ");
and i am stuck here because in a while loop we can take one number at a time so please help me out here

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:String line = inputFile.nextLine( );
final String[] array = line.split(" ");
and i am stuck here because in a while loop we can take one number at a time so please help me out here

Why would you think that line.split(" ") is going to work if the input uses commas? (at least that's what you've shown us up to now)

And once you have your number String(s), how do you think you convert them to the values you want? I presume you're familiar with the java.lang.Integer and java.lang.Double APIs - and if you're not, you should read them.

You need to ShowSomeEffort priyanka. Nobody here is simply going to hand you code.

Winston

priyanka kusuma
Greenhorn
Posts: 22
Winston Gutkowski wrote:
priyanka kusuma wrote:String line = inputFile.nextLine( );
final String[] array = line.split(" ");
and i am stuck here because in a while loop we can take one number at a time so please help me out here

Why would you think that line.split(" ") is going to work if the input uses commas? (at least that's what you've shown us up to now)

And once you have your number String(s), how do you think you convert them to the values you want? I presume you're familiar with the java.lang.Integer and java.lang.Double APIs - and if you're not, you should read them.

You need to ShowSomeEffort priyanka. Nobody here is simply going to hand you code.

Winston

Hey it is because in my input file i dont have "," for you guys to understand i am using ",".
and i am not asking anything without my effort as i have told you earlier i am in the last stage of my project and i need this part and also i have written the code and what i want is you to correct the code where i have written wrong so that i can do it

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:Hey it is because in my input file i dont have "," for you guys to understand i am using ",".
and i am not asking anything without my effort as i have told you earlier i am in the last stage of my project and i need this part and also i have written the code and what i want is you to correct the code where i have written wrong so that i can do it

Well how are we supposed to know what's wrong if you don't show us exactly how your input is formatted? We're not mind-readers.

My suggestion: read the HowToAskQuestionsOnJavaRanch page, and also this one and this one. A bit of punctuation wouldn't go amiss either; your posts read like SMS messages.

Winston

priyanka kusuma
Greenhorn
Posts: 22
Hey I am sorry for my mistakes and thanks for your suggestions .Now I am asking you clearly could you please tell me how to read two numbers at a time as input from a text file which are separated by a space in each line so that i can compare them with my existing array list and i am having 1200 pairs

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:Hey I am sorry for my mistakes and thanks for your suggestions .Now I am asking you clearly could you please tell me how to read two numbers at a time as input from a text file which are separated by a space in each line so that i can compare them with my existing array list and i am having 1200 pairs

final String[] array = line.split(" ");
which looks reasonable to me, (based on what you've now told us - although line.trim().split("\\s+") is better).

So: how many Strings do you think array will contain when it finishes? Furthermore, why are you using a while loop to go through it? Surely you know the proper way to go through the elements of an array?
If you don't, this problem is beyond your current capabilities, and you'll need to read the tutorials (or whatever book you're studying) until you do.

After that, you'll need to convert each String into an integer (or whatever). Do you know how to do that?

You are NOT going to goad me into doing your work for you. Come up with a solution, test it, and come back if/when you have problems.

Winston

priyanka kusuma
Greenhorn
Posts: 22
Hey I am using while loop because it has to stop when it reaches the end of the file.I know about arrays.
int i = Integer.parseInt( s ); is the way to convert string in to int.I am not asking you to give the complete code ,if the code i have written has any mistakes then correct them.That's what i am asking you

Winston Gutkowski
Bartender
Posts: 10111
56
priyanka kusuma wrote:Hey I am using while loop because it has to stop when it reaches the end of the file...

If not, re-phrase your question properly (and you've been given plenty of links on how to do that).

But before that, read the Java tutorials and your class APIs; because if end-of-file IS what you're worried about, there ARE examples.

Winston

Brian Rupasinghe
Greenhorn
Posts: 7
Ernest Friedman-Hill wrote:
Originally posted by Salman Jamali:

Ernest, I'll try your code in a few hours, thanks a lot!

I've used this algorithm to find contiguous connected surfaces on a 3D polygonal model (my "pairs" are edges, and my "points" are faces.)

I am not very much clear about the algorithm, Could you please explain a bit?

Suppose I have number pairs, (5,6) (2,7) (3,7) (1,2) (3,4) (4,5), you mean adding
each element into an int array. Then sorting pairs in a separate ArrayList something in this order
first, (1,2) (2,7) (3,7) (3,4) (4,5) (5,6) and then in another ArrayList with higher number in
ascending order: (1,2) (3,4) (4,5) (5,6) (2,7) (4,7)??

Campbell Ritchie
Sheriff
Posts: 48453
56
Welcome to the Ranch

Even after three weeks, the original poster may not come back to give you an answer.