
Threads behaviour on a real cluster

Sebastian B. Ecker
Greenhorn

Joined: Nov 22, 2006
Posts: 9
Hi guys,

I am wondering whether Java can handle several threads on a real cluster (LSF) in such a way that it gives every node in the cluster a thread as a job to do. Is that right? Or does such an attempt fail and end up behaving as on a single local machine, with all threads executed on one processor?

I have to work with a lot of data, and therefore I need a way to trigger different tools and then wait until every single tool has finished.
Are there any suggestions on how I can use a cluster together with Java in an effective way?
Thank you in advance.
bas-t


Anyone who can't laugh at himself is not taking life seriously enough. - Larry Wall
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Hi, welcome to the ranch! I'd never heard of LSF before but a quick Google gives me a hint of what it does. When you ask it to spread a set of tasks across a cluster, does it start a process on each host machine in the cluster? To run Java I guess each one of those would be a new JVM on a new CPU. They'd compete with anything else running on the hosts, but not with each other. Does that seem to answer the right question?


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Sebastian B. Ecker
Greenhorn

Joined: Nov 22, 2006
Posts: 9
Thank you for your reply.
The way I see it, I submit a job which is an execution of a Java program. This program has the task of executing, e.g., the same Perl program 10 times in 10 threads (but with different files as parameters), and then it exits.
My question now is: would every single thread be placed on a free node in the cluster, or does everything happen on one node? If the Java program executes all tools on one node, then I have to think about reconfiguring everything so that the tools are executed on different nodes (to increase speed, of course).

Hope you understand what I mean. :-)
bas-T
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24184
    

This would really depend on the hypothetical implementation details of this platform we're talking about. There are Java implementations of, e.g., MPI, for cluster computing. But these questions can't be answered without reference to some specific implementation.

The standard JVM is a program that runs on one node -- like most programs do.
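To make that concrete, here is a minimal sketch of the "10 threads, 10 external tools" idea using only the standard library. The tool name `tool.pl` and the input file names are hypothetical placeholders, not anything from this thread. Each pool thread starts one OS process and waits for it, and `invokeAll` returns only once every tool has finished. All of these processes are children of the one JVM, so they run on the node where that JVM was started; nothing here spreads them across a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LocalToolRunner {

    /**
     * Starts every command as a separate OS process, running at most
     * poolSize of them at a time, and blocks until all have exited.
     * Returns the exit codes in the same order as the commands.
     */
    static List<Integer> runAll(List<List<String>> commands, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (List<String> command : commands) {
                tasks.add(() -> {
                    // The child process is started on the same machine as this JVM.
                    Process process = new ProcessBuilder(command)
                            .inheritIO()
                            .start();
                    return process.waitFor();
                });
            }
            // invokeAll returns only after every task has completed.
            List<Integer> exitCodes = new ArrayList<>();
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                exitCodes.add(result.get());
            }
            return exitCodes;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical tool and input names, for illustration only.
        List<List<String>> commands = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            commands.add(List.of("perl", "tool.pl", "input" + i + ".txt"));
        }
        System.out.println("Exit codes: " + runAll(commands, 10));
    }
}
```

On a multi-CPU node the operating system can schedule the ten child processes on different CPUs, but getting them onto different nodes would need something outside the JVM, such as the cluster's own job scheduler.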


[Jess in Action][AskingGoodQuestions]
Ben Ethridge
Ranch Hand

Joined: Jul 28, 2003
Posts: 108
Hi, Sebastian.

Assuming you mean what I think you mean, do you really need your app to be multi-node (i.e. a cluster)? ...or would multi-cpu (on a single "node") perform for you just as well:

http://www.coderanch.com/t/233607/threads/java/Multi-threading

If not, can you tell us a bit about what you mean by "node" and "cluster", (Do you mean something like a WebSphere node?)...and why you think you need a cluster of nodes as opposed to a "cluster" of cpus?

Ben
Sebastian B. Ecker
Greenhorn

Joined: Nov 22, 2006
Posts: 9
Hi Ben,

Thank you for your reply and your link. I am thinking about running my Java program on a multi-computer cluster. One computer is "a node" of it. I am using "bsub" to submit jobs to a queue. I am wondering whether using threads can optimize a run of my program (which executes different external tools in several cycles) with this kind of submission. But probably all the threads would be executed on one single computer, because the VM cannot distribute the execution to other computers in the cluster. I don't know how this works or whether RMI would be the best choice.
Do I have to install a special kind of VM?
Have I answered your question well?
Regards
Sebastian
Ben Ethridge
Ranch Hand

Joined: Jul 28, 2003
Posts: 108
Yes, your questions/answers are clear enough, but I personally have no experience with a multi-node (multi-computer) JVM setup, that is, multi-node AND multi-thread.

Perhaps someone else on this forum?

RMI sounds like it would also solve the problem, but it comes with its own set of baggage (learning curve, advantages/disadvantages).

However, based on what you say, I don't see why the multi-cpu (single computer) would not handle your concurrent threads as you desire. Am I missing something on this? Are the servers in distant locations? Why the need for the multi-computer? And if you go multi-computer, why do you then also need multi-thread? Why not just one thread per computer (per JVM)?

Kind of to Ernest's point above, we would need more details to help with a good solution.


Ben
Chris Hurst
Ranch Hand

Joined: Oct 26, 2003
Posts: 416
    

The quick answer is that it sounds like you've bought a system that should do the work for you (I only read a one-paragraph description). Normally, when tackling a problem like the one you describe, you buy in some server software and it does the work of doling out work to processes and creating threads. For example, a J2EE solution (I know this isn't one, but the same applies to other flavours of server/controller software) might read from a queue (MQ/JMS) and dole out work to distributed servers as EJBs that could each run in their own threads. Basically, if they've done the job correctly, you shouldn't have to worry about threads, RPC/RMI calls, and so on. That's why I sometimes find that J2EE programmers have weaker threading skills: they don't have to know threading (which is a good thing). Normally some form of server software doles out thread work to distributed JVMs for you. You could write it yourself, and it would be a very interesting job, but no one would thank you for it.

It's possible there might be some advantage in multithreading at each node and using RPC/RMI-type calls between nodes to load balance, but this should have been done for you, i.e. you'd be reinventing the wheel. I'd expect them to do this job for you and, if not, I'd be talking to other vendors. I like writing multithreaded programs, but this problem looks like you need a server/framework that does the threads for you. I suspect you may already have one; if not, they are out there.


"Eagles may soar but weasels don't get sucked into jet engines" SCJP 1.6, SCWCD 1.4, SCJD 1.5,SCBCD 5
Ben Ethridge
Ranch Hand

Joined: Jul 28, 2003
Posts: 108
Good points, Chris.

I agree that J2EE programmers tend to have weaker threading skills (myself included, but I am getting better at threading now that I'm not relying on J2EE for everything).

Sebastian, why wouldn't J2EE solve your problem, kind of to Chris's point?

Ben
Sebastian B. Ecker
Greenhorn

Joined: Nov 22, 2006
Posts: 9
Thanks all,

I solved my problem. I just wrote a small script which works as a workflow management system. It triggers one workflow, which consists of several tools that are executed one after another. In this way I am able to start hundreds of workflows on the LSF (Load Sharing Facility) cluster without worrying about the proper distribution of the individual workloads to the different cluster nodes: each execution of a tool runs on one node. Before, I thought it was possible to share ONE execution across several nodes (which is not possible). OK, thanks to all of you I now understand how the LSF software works (or at least the principle).
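For anyone who would rather drive the submissions from Java than from a script, a rough sketch of the same idea follows: one `bsub` call per input file, so that LSF decides which node runs each workflow. This is not the script described above. The queue name `normal` and the script name `./workflow.sh` are hypothetical, and the sketch assumes `bsub` is on the PATH and that your site accepts the `-q` (queue) and `-o` (output file, with `%J` standing for the job ID) options.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class BsubSubmitter {

    // Hypothetical names: adjust the queue and the workflow script to your site.
    static final String QUEUE = "normal";
    static final String WORKFLOW_SCRIPT = "./workflow.sh";

    /** Builds the bsub command line that submits one workflow for one input file. */
    static List<String> buildBsubCommand(String queue, String script, Path input) {
        List<String> command = new ArrayList<>();
        command.add("bsub");
        command.add("-q");
        command.add(queue);
        command.add("-o");
        // %J is replaced by LSF with the job ID, giving one log file per job.
        command.add(input.getFileName() + ".%J.log");
        command.add(script);
        command.add(input.toString());
        return command;
    }

    public static void main(String[] args) throws Exception {
        // Each argument is one input file; each gets its own LSF job.
        for (String arg : args) {
            List<String> command =
                    buildBsubCommand(QUEUE, WORKFLOW_SCRIPT, Path.of(arg));
            System.out.println("Submitting: " + String.join(" ", command));
            int exit = new ProcessBuilder(command).inheritIO().start().waitFor();
            if (exit != 0) {
                System.err.println("bsub failed for " + arg + " (exit " + exit + ")");
            }
        }
    }
}
```

The Java program here only queues the jobs and returns as soon as `bsub` has accepted them; it does not wait for the workflows themselves to finish.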

See you around
Sebe
 