Threads behaviour on a real cluster

 
Greenhorn
Posts: 9
Hi guys,

I am wondering whether Java can handle several threads on a real cluster (LSF) in such a way that it gives every node in the cluster a thread as a job to do. Is that right? Or does such an attempt fail and end up behaving like on a single local machine, with all threads executed on one processor?

I have to work with a lot of data, so I need a way to trigger different tools and then wait until every single tool has finished.
Are there any suggestions on how I can use a cluster together with Java in an effective way?
Thank you in advance.
bas-t
 
(instanceof Sidekick)
Posts: 8791
Hi, welcome to the ranch! I'd never heard of LSF before but a quick Google gives me a hint of what it does. When you ask it to spread a set of tasks across a cluster, does it start a process on each host machine in the cluster? To run Java I guess each one of those would be a new JVM on a new CPU. They'd compete with anything else running on the hosts, but not with each other. Does that seem to answer the right question?
 
Sebastian B. Ecker
Greenhorn
Posts: 9
Thank you for your reply.
The way I think of it, I submit a job which is an execution of a Java program. This program has the task of executing, e.g., the same Perl program 10 times in 10 threads (but with different files as parameters), and then it exits.
My question now is: would every single thread be placed on a free node in the cluster, or does everything happen on one node? If the Java program executes all tools on one node, then I have to think about reconfiguring it all so that the tools are executed on different nodes (for a speed increase, of course).

Hope you understand what I mean. :-)
bas-T
 
author and iconoclast
Posts: 24207
This would really depend on the hypothetical implementation details of this platform we're talking about. There are Java implementations of, e.g., MPI, for cluster computing. But these questions can't be answered without reference to some specific implementation.

The standard JVM is a program that runs on one node -- like most programs do.
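
To make the single-node point concrete, here is a minimal sketch of the setup described in the question: one thread per external tool, then wait for all of them. It assumes a standard JVM and uses only `java.util.concurrent` and `ProcessBuilder`; the class name `LocalToolRunner` and the script name `tool.pl` are hypothetical. Every child process started this way runs on the same host as the JVM, so threads alone do not spread work across cluster nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LocalToolRunner {

    // Starts one external command per list entry on a thread pool and
    // returns the exit codes in submission order.
    // All child processes run on the host where this JVM runs.
    static List<Integer> runAll(List<List<String>> commands, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<String> command : commands) {
                pending.add(pool.submit(() -> {
                    // inheritIO() forwards the tool's output to this console
                    Process p = new ProcessBuilder(command).inheritIO().start();
                    return p.waitFor();
                }));
            }
            List<Integer> exitCodes = new ArrayList<>();
            for (Future<Integer> f : pending) {
                exitCodes.add(f.get()); // blocks until that tool has finished
            }
            return exitCodes;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical tool: "perl tool.pl <file>", one run per input file.
        List<List<String>> commands = new ArrayList<>();
        for (String file : args) {
            commands.add(Arrays.asList("perl", "tool.pl", file));
        }
        // One worker per local CPU core; more threads would not add hosts.
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println("Exit codes: " + runAll(commands, threads));
    }
}
```

On a multi-core machine this does run the tools in parallel, but only on the cores of that one machine.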
 
Ranch Hand
Posts: 108
Hi, Sebastian.

Assuming you mean what I think you mean, do you really need your app to be multi-node (i.e. a cluster)? ...or would multi-cpu (on a single "node") perform for you just as well:

https://coderanch.com/t/233607/threads/java/Multi-threading

If not, can you tell us a bit about what you mean by "node" and "cluster", (Do you mean something like a WebSphere node?)...and why you think you need a cluster of nodes as opposed to a "cluster" of cpus?

Ben
 
Sebastian B. Ecker
Greenhorn
Posts: 9
Hi Ben,

Thank you for your reply and your link. I am thinking about running my Java program on a multi-computer cluster, where one computer is "a node". I am using "bsub" to submit jobs to a queue. I am wondering whether using threads can speed up a run of my program (which executes different external tools in several cycles) with this kind of submission. But probably all the threads would be executed on one single computer, because the VM cannot distribute the execution to other computers in the cluster. I don't know how this works or whether RMI would be the best choice.
Do I have to install a special kind of VM?
Have I answered your question well?
Regards
Sebastian
 
Ben Ethridge
Ranch Hand
Posts: 108
Yes, your questions/answers are clear enough, but I personally have no experience with multi-node (multi-computer) JVM, as far as multi-node AND multi-thread.

Perhaps someone else on this forum?

RMI sounds like it would also solve the problem, but it comes with its own set of baggage (learning curve, advantages/disadvantages).

However, based on what you say, I don't see why the multi-cpu (single computer) would not handle your concurrent threads as you desire. Am I missing something on this? Are the servers in distant locations? Why the need for the multi-computer? And if you go multi-computer, why do you then also need multi-thread? Why not just one thread per computer (per JVM)?

Kind of to Ernest's point above, we would need more details to help with a good solution.


Ben
 
Ranch Hand
Posts: 443
The quick answer is that it sounds like you've bought a system that should do the work for you (I only read a one-paragraph description). Normally, when tackling a problem like the one you describe, you buy in some server software and it does the work of doling out work to processes and creating threads. For example, a J2EE solution (I know this isn't one, but the same applies to the other flavours of server/controller software) might read from a queue (MQ/JMS) and dole out work to distributed servers as EJBs that could each run in their own threads. Basically, if they've done the job correctly you shouldn't have to worry about threads, RPC/RMI calls and so on. That's why I sometimes find that J2EE programmers have weaker threading skills: they don't have to know threading (which is a good thing). Normally some form of server software doles out thread work to distributed JVMs for you. You could write it yourself, and it would be a very interesting job, but no one would thank you for it.

It's possible there might be some advantage in multithreading at each node and using RPC/RMI-type calls between nodes to load balance, but this should have been done for you, i.e. you'd be reinventing the wheel. I'd expect them to do this job for you and, if not, I'd be talking to other vendors. I like writing multithreaded programs, but this problem looks like you need a server/framework that does the threads for you, and I suspect you may have one. If not, they are out there.
 
Ben Ethridge
Ranch Hand
Posts: 108
Good points, Chris.

I agree that J2EE programmers tend to have weaker threading skills (myself included, but I am getting better at threading now that I'm not relying on J2EE for everything).

Sebastian, why wouldn't J2EE solve your problem, kind of to Chris's point?

Ben
 
Sebastian B. Ecker
Greenhorn
Posts: 9
Thanks all,

I solved my problem. I just wrote a small script which works as a workflow management system. It triggers one workflow, which consists of several tools that are executed one after another. In this way I am able to start hundreds of workflows on the LSF (Load Sharing Facility) cluster without caring about the proper distribution of the individual workloads to the different cluster nodes: one execution of a tool runs on one node. Before, I thought it was possible to share ONE execution across several nodes (which is not possible). OK, thanks to you all I now understand how the LSF software works (or at least the principle).

See you around
Sebe
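
For anyone landing here later, the approach in the last post (one LSF job per workflow, with the scheduler choosing the node) could be sketched from Java roughly as follows. This is an illustration under assumptions, not the script that was actually written: the class name `BsubSubmitter`, the job names and the script `./run_workflow.sh` are hypothetical. It relies only on the `bsub` options `-J` (job name) and `-o` (output file, with `%J` expanding to the job ID); check your site's LSF documentation before using them.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class BsubSubmitter {

    // Builds: bsub -J <jobName> -o <jobName>.%J.out <workflowScript> <inputFile>
    // LSF, not this program, decides which cluster node runs the job.
    static List<String> bsubCommand(String jobName, String workflowScript,
                                    String inputFile) {
        return Arrays.asList(
                "bsub",
                "-J", jobName,               // job name shown in the queue
                "-o", jobName + ".%J.out",   // %J is replaced by the job ID
                workflowScript,
                inputFile);
    }

    // Runs the bsub command and returns the exit status of bsub itself
    // (the submission), not the exit status of the submitted job.
    static int submit(List<String> command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // One independent job per input file given on the command line.
        for (int i = 0; i < args.length; i++) {
            List<String> command =
                    bsubCommand("workflow_" + i, "./run_workflow.sh", args[i]);
            int status = submit(command);
            System.out.println(String.join(" ", command) + " -> " + status);
        }
    }
}
```

Each submitted job is an ordinary single-node process, so no threads or RMI are needed in the submitting program; parallelism comes from the number of jobs in the queue.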
 