Java in the world of Parallel Programming

Vijitha Kumara
Bartender

Joined: Mar 24, 2008
Posts: 3827

Hi Sergey,

What are the limitations/drawbacks (if any) you see in Java (Concurrency package and any thread related classes) when it comes to Parallel programming? Since it's a higher level API, can the application developers *really* utilize the power of multiple cores as with C++?

Thanks,


SCJP 5 | SCWCD 5
Sergey Babkin
author
Ranch Hand

Joined: Apr 05, 2010
Posts: 50
Sure, why not? Creating the threads might be more expensive, but once they start running, they get just as much benefit. Java has some very well-developed data structures that make using threads easier. A downside might be that making your own custom structures might not be that easy, because they might be slower than the built-in ones written with some native code.
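As a rough illustration of both points, here is a minimal sketch that pays the thread-creation cost once by reusing a fixed pool, and leans on a built-in concurrent structure instead of a hand-written one. The class and method names (`PooledWordCount`, `countWords`) are made up for this example; only the `java.util.concurrent` calls are real library API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; not taken from the thread or from any book.
final class PooledWordCount {

    // Counts words across all lines, one task per line, on a reusable pool.
    static Map<String, Integer> countWords(List<String> lines, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        // The pool creates its threads once and reuses them for every task.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String line : lines) {
            pool.execute(() -> {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        // merge() is atomic on ConcurrentHashMap.
                        counts.merge(word, 1, Integer::sum);
                    }
                }
            });
        }
        pool.shutdown(); // no new tasks; already-submitted ones still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(List.of("a b a", "b c", "a"), 4));
    }
}
```

Whether a custom structure would beat `ConcurrentHashMap` here is exactly the kind of thing that would have to be measured rather than assumed.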
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4650
    

Not one to argue a lot with our author, but I think (all IMHO) both Java and C++ are way too complicated to use for parallel applications by some huge percentage of commercial professional developers.

The efficiency argument has been used for every step forward since von Neumann started programming. Are assembly languages too far from the hardware? Can you write a real operating system in a higher level language (as defined at the time) or must you use machine code/assembly?

The "high level languages" of the time were Bliss, C, and PL/1. While PL/1 was used for Multics, which brought many critical concepts into the world, Multics was considered too big and too resource intensive (speed, memory, etc.) to be useful in the real world. Bliss and C were used in their time-frames for DEC's VMS and AT&T's Unix. Both of these systems were extremely successful.

I have been using Java for 12 or 13 years, and bought Henry's threads book first edition. With the newer libraries, and better books and documentation, it's more approachable, but it's not easy.

Which is why my biases say that a language like Scala, that is inherently parallel, built on the JVM engine and libraries, will be the future. I have no idea if Scala will be the one, but I think that long term, we will look back at programming applications in Java for 64 processor CPUs and think "how quaint" as we do about programming operating systems in assembly.


Sergey Babkin
author
Ranch Hand

Joined: Apr 05, 2010
Posts: 50
Pat Farrell wrote:Not one to argue a lot with our author, but I think (all IMHO) both Java and C++ are way too complicated to use for parallel applications by some huge percentage of commercial professional developers.


Why not argue? All the forum fun is in the arguing! :-)


The efficiency argument has been used for every step forward since von Neumann started programming. Are assembly languages too far from the hardware? Can you write a real operating system in a higher level language (as defined at the time) or must you use machine code/assembly?

The "high level languages" of the time were Bliss, C, and PL/1. While PL/1 was used for Multics, which brought many critical concepts into the world, Multics was considered too big and too resource intensive (speed, memory, etc.) to be useful in the real world. Bliss and C were used in their time-frames for DEC's VMS and AT&T's Unix. Both of these systems were extremely successful.

I have been using Java for 12 or 13 years, and bought Henry's threads book first edition. With the newer libraries, and better books and documentation, it's more approachable, but it's not easy.

Which is why my biases say that a language like Scala, that is inherently parallel, built on the JVM engine and libraries, will be the future. I have no idea if Scala will be the one, but I think that long term, we will look back at programming applications in Java for 64 processor CPUs and think "how quaint" as we do about programming operating systems in assembly.


I haven't looked at Scala, so I can't say for sure. Isn't it something from the functional programming department? Personally I find functional programming very difficult. The logic of functional programs is very hard for me to track: they're hard to read, hard to write, and horribly difficult to debug. Worse yet, when I read the materials written by the people who do like functional programming, I get the feeling that it's just as hard for them too, only they don't know any better. It looks like they're not trying to get the logic of the program right; instead they keep juggling operators until the result becomes somewhat close to the right one, simply ignoring the things that go "slightly wrong".

I've also found the systems based on message passing the hardest ones to write, read and debug. They beat even functional programming by a wide margin. They're sort of OK for a small number of threads with limited functionality, but even those are difficult to get working right. Probably the most glaring example of this I've seen is the Reliant HA clustering software. There really aren't many communicating objects in it, and the communications are pretty simple, but still everything goes haywire. The result is a hugely buggy piece of software. I've included some examples of how things become difficult in my book, in the chapter on the other ways of synchronization. The only exception is for the simple topologies without loops (and without paying too much attention to the fork/join problems). Without loops, message passing works great. It's available, for example, in the Complex Event Processing engines.
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4650
    

Sergey Babkin wrote:Why not argue? All the forum fun is in the arguing! :-)


We try hard to be very friendly here on the Ranch, so, OK, I'll discuss it a bit, but only if you are nice :-)

Sergey Babkin wrote:I haven't looked at Scala, so I can't say for sure. Isn't it something from the department of the functional programming? Personally I find the functional programming very difficult. The logic of the functional programs is very hard for me to track: they're hard to read, hard to write, and horribly difficult to debug.


It's more functional than most, in that flavor of programming.

But the reality is that solid parallel programming is hard to (read/write/debug) no matter how you do it. It's just a complex topic, and as I'm sure your book points out, sloppy code that works fine single threaded or at modest levels of parallelism breaks in really ugly and subtle ways in a massively parallel world. Which is why Dijkstra, Hoare, et al. spent so much of the 60s building a solid set of tools and theory.
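To make the "breaks in ugly and subtle ways" point concrete, here is a small sketch of the classic lost-update problem. The class name `LostUpdateDemo` is hypothetical. The plain counter is deliberately left unsynchronized, so its result with several threads is not predictable; the `AtomicInteger` one is.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not taken from the thread or the book.
final class LostUpdateDemo {

    // Returns {plainResult, atomicResult} after every thread has finished.
    static int[] run(int threads, int incrementsPerThread) {
        int[] plain = new int[1];                   // unsynchronized shared counter
        AtomicInteger atomic = new AtomicInteger(); // thread-safe shared counter

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    plain[0]++;               // read-modify-write, can lose updates
                    atomic.incrementAndGet(); // atomic, never loses updates
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { plain[0], atomic.get() };
    }

    public static void main(String[] args) {
        int[] result = run(8, 100_000);
        System.out.println("plain  = " + result[0] + " (may fall short of 800000)");
        System.out.println("atomic = " + result[1]);
    }
}
```

With a single thread both counters agree, which is why this kind of bug tends to pass every single-threaded test.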

I'm not too worried about the next year or two. There are six-core CPUs for sale for $300 now, and I don't expect we will exceed 16 for a while, at least a year. But it's inevitable that the numbers will get huge. And I don't think any sequential programming language is going to work.

Scala is pretty foreign looking. They didn't fall into the same trap that the C folks did going to C++, where they tried too hard to go from traditional systems language to OO, and ended up with something that was not very OO and not very traditional.

The strength of Scala is that it removes the sequential steps: you make the algorithm and apply it to a wad of data. The whole functional space reminds me a lot of Iverson's APL (A Programming Language) of the late 60s/early 70s. That was simply too much of a leap at the time, but with it you could say "take this matrix M and invert it, take the eigenvectors and do this with them". It works at a much higher level of abstraction, just as well-written Java handles internationalization in ways that a C programmer with strcpy simply never can.
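For readers who want to see the "make the algorithm and apply it to a wad of data" style without leaving Java, here is a tiny sketch using the streams API available in Java 8 and later. It is not Scala and not what Pat had in front of him; the class name `WadOfData` is invented for the example.

```java
import java.util.List;

// Hypothetical example class illustrating a declarative, data-parallel style.
final class WadOfData {

    // Total length of all words. The pipeline states what to compute;
    // the library decides how to split the work across cores.
    static int totalLength(List<String> words) {
        return words.parallelStream()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("moose", "fly", "saloon")));
    }
}
```

There is no loop counter and no explicit thread anywhere in the code, which is the higher level of abstraction being argued for.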


As a transition device, any Java library works with Scala, but I see this as temporary and expedient, not fundamental. What is fundamental is to build on the JVM and its performance, and then bring folks to think at a higher level. When I was a young programmer, way before the whole WIMP (windows, icons, menus, pointer) idea was invented, we did command line services and thought that was very user friendly.

The whole WIMP world was considered too slow and inefficient to use for real work well into the 1990s for PCs. While Windows 3.0 allowed folks to think about windows, corporate America was using DOS and WordPerfect well past Windows 95.

I don't think that writing in functional languages is going to be any harder a transition than it was going to the classic C code for Windows in the 1988 to 92 time frame. Visual C++ barely made it possible and that was 93 or so. But programmers are smarter than the average bear, and we'll adapt.

Scala really wants you to "think different" and really needs a less procedural approach to it all, from HTML posts to servlets and JSPs. It's very new, but one framework that is catching on in massive systems (e.g. foursquare) is the Lift web framework.

I don't know if Scala is the answer, but I do know that I sure hope to retire before Java++ darkens our windows.

New technology needs better tools. Programmers no longer worry about branch delay slots, we should be addressing the domain issues, not the plumbing.

All IMHO, you may disagree with some or 100% of this.

Pat

Richard Golebiowski
Ranch Hand

Joined: May 05, 2010
Posts: 213

But the reality is that solid parallel programming is hard to (read/write/debug) no matter how you do it. It's just a complex topic, and as I'm sure your book points out, sloppy code that works fine single threaded or at modest levels of parallelism breaks in really ugly and subtle ways in a massively parallel world. Which is why Dijkstra, Hoare, et al. spent so much of the 60s building a solid set of tools and theory.



I don't get why people say it's difficult. I did a very complicated web app that has to do thousands of calculations, reading, calculating, and then writing data back out to a database without any problems.
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4650
    

Richard Golebiowski wrote:I don't get why people say it's difficult. I did a very complicated web app that has to do thousands of calculations, reading, calculating, and then writing data back out to a database without any problems.


What does the "it" in "it's difficult" refer to, specifically?

Complicated sequential programming is what we do as programmers. Making sequential algorithms work properly and efficiently in parallel implementations is hard. Specifically, you can not have any (or much, or a lot...) state that is kept in sync between multiple threads, or you spend all your time checking and passing the state and none doing your thousands of calculations.
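One common way to avoid state kept in sync between threads is to give each worker its own private accumulator and combine the results once at the end. The sketch below does that for a sum of squares; `PartialSums` and `sumOfSquares` are hypothetical names chosen for the example, and the arithmetic is only a stand-in for "thousands of calculations".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; a sketch, not a tuned implementation.
final class PartialSums {

    // Computes 1*1 + 2*2 + ... + n*n with no state shared between workers.
    static long sumOfSquares(int n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int first = w + 1; // worker w takes first, first+workers, ...
                Callable<Long> task = () -> {
                    long local = 0; // private to this task, never shared
                    for (long i = first; i <= n; i += workers) {
                        local += i * i;
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // the only point where results meet
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, 4));
    }
}
```

This only works because the partial results are independent; the hard cases are the ones where they are not.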

The normal client/webserver/application/RDBMS stack is actually four tiers that can do parts of their work in parallel. A request flows through them like this:
  • The client sends a request
  • The webserver delivers the images, CSS, etc. and dispatches to the application server (Tomcat, JBoss, etc.)
  • Tomcat does the business logic, and calls out to SQL
  • The RDBMS does the SQL and sends the results back to the domain code
  • The business logic uses that data, calculates some, and either returns the result (JSP/servlet response) to Tomcat, and Tomcat to the webserver,
    or, and this happens a lot, asks for more DBMS stuff
  • The webserver does the final merge of the output
  • The client's browser displays all the nice results

This is all essentially sequential. Behind the scenes, some stuff is happening in parallel, but the programmer does not see it.

    Think about problems more generally. PC video games are a good example. To render a frame, you have to understand 3D and figure out what parts of the image are hidden because they are behind things in the foreground. For a 1240x1024 image, this is over a million pixels, each with 24 bits of color. Just a decade ago, this was about all a PC could do, and it could not do it in real time at realistic rates. The breakthrough was to realize that the whole image does not have to be done at once: you can chop up the screen into, say, 128 pieces and have 128 processors each do a part in parallel. Modern gamer-oriented GPUs are amazing; they have many hundreds of vertex shaders, hidden-line calculators, etc. But the algorithms to do that are fairly simple and have been worked on for 30+ years.
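The "chop up the screen" idea can be sketched on a CPU in a few lines. The code below splits the frame into horizontal tiles and lets each tile be filled independently; `TiledRender` and `shade` are hypothetical, and `shade` is only a placeholder for the real per-pixel work (depth tests, shading and so on).

```java
import java.util.stream.IntStream;

// Hypothetical example class; illustrates tiling, not real 3D rendering.
final class TiledRender {

    // Placeholder for the per-pixel work; returns a 24-bit colour value.
    static int shade(int x, int y) {
        return (x * 31 + y * 17) & 0xFFFFFF;
    }

    // Renders a width x height frame, split into horizontal tiles that
    // are processed in parallel. Tiles write to disjoint pixel ranges,
    // so no locking is needed.
    static int[] render(int width, int height, int tiles) {
        int[] pixels = new int[width * height];
        int rowsPerTile = (height + tiles - 1) / tiles; // ceiling division
        IntStream.range(0, tiles).parallel().forEach(tile -> {
            int firstRow = tile * rowsPerTile;
            int lastRow = Math.min(height, firstRow + rowsPerTile);
            for (int y = firstRow; y < lastRow; y++) {
                for (int x = 0; x < width; x++) {
                    pixels[y * width + x] = shade(x, y);
                }
            }
        });
        return pixels;
    }

    public static void main(String[] args) {
        int[] frame = render(1240, 1024, 128);
        System.out.println(frame.length + " pixels rendered");
    }
}
```

It works so neatly because no pixel depends on any other pixel, which is the property that complex business logic usually lacks.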

    The challenge is when the business logic is complex. The simple techniques that work so well in GPU systems can't work.

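    Pretend you are implementing Facebook. You want to report who are the users who have most "friends". The naive approach is to write a bit of code that looks like:

    Find all users
    Set up an array of ten objects/structures with user information and number of friends
    For all users,
        if this user's friend.size() > existing
            put this user in list, bump someone else out
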
    Works great in testing. Fails to work so well when you have 500 million users.

    The only solution is to chop it up and do it on thousands of computers at once. But then how does Core 42 know that Core 52893 has changed the value of the ninth object/structure? By telling all the other thousands of computers every time it changes? Not a chance.
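One way around the "tell everyone about every change" problem, at least for a snapshot of the data, is to let each chunk compute its own private top ten and merge the small candidate lists once at the end. The sketch below does this on a single machine with threads standing in for computers. `TopFriends`, `User`, `topK` and `topKParallel` are hypothetical names, and the sketch does not address friend counts that change while the report is running.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class; a single-machine sketch of partition and merge.
final class TopFriends {

    // Hypothetical record of a user and a friend count.
    static final class User {
        final String name;
        final int friends;

        User(String name, int friends) {
            this.name = name;
            this.friends = friends;
        }
    }

    static final Comparator<User> BY_FRIENDS = Comparator.comparingInt(u -> u.friends);

    // The naive sequential pass: keep the k best seen so far in a min-heap
    // and bump out the smallest whenever the heap grows past k.
    static List<User> topK(List<User> users, int k) {
        PriorityQueue<User> heap = new PriorityQueue<>(BY_FRIENDS);
        for (User u : users) {
            heap.add(u);
            if (heap.size() > k) {
                heap.poll();
            }
        }
        List<User> result = new ArrayList<>(heap);
        result.sort(BY_FRIENDS.reversed());
        return result;
    }

    // Each chunk finds its own top k with no shared state; the per-chunk
    // winners are then merged with one more small top-k pass.
    static List<User> topKParallel(List<User> users, int k, int chunks) {
        int chunkSize = Math.max(1, (users.size() + chunks - 1) / chunks);
        List<User> candidates = IntStream.range(0, chunks).parallel()
                .mapToObj(c -> {
                    int from = Math.min(users.size(), c * chunkSize);
                    int to = Math.min(users.size(), from + chunkSize);
                    return topK(users.subList(from, to), k);
                })
                .flatMap(List::stream)
                .collect(Collectors.toList());
        return topK(candidates, k);
    }

    public static void main(String[] args) {
        List<User> users = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            users.add(new User("user" + i, (i * 37) % 1000));
        }
        for (User u : topKParallel(users, 10, 8)) {
            System.out.println(u.name + " " + u.friends);
        }
    }
}
```

The merge step only ever sees `k` candidates per chunk, so the communication cost stays small no matter how many users each chunk holds.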

    Richard Golebiowski
    Ranch Hand

    Joined: May 05, 2010
    Posts: 213

    Pretend you are implementing Facebook. You want to report who are the users who have most "friends". The naive approach is to write a bit of code that looks like:

    Find all users
    Set up an array of ten objects/structures with user information and number of friends
    For all users,
        if this user's friend.size() > existing
            put this user in list, bump someone else out


    Or, it could just be something along the lines of:

    select top 10 name from user order by numberOfFriends desc


    But I do agree that things could get tricky if you are doing a large application distributed among multiple systems. And I agree that this type of system will be difficult to troubleshoot.

    However, most applications are stand alone apps that run on a single system. With a stand alone application things are simpler and so is using multiple threads to do calculations.
    Pat Farrell
    Rancher

    Joined: Aug 11, 2007
    Posts: 4650
        

    Richard Golebiowski wrote: select top 10 name from user order by numberOfFriends desc

    However, most applications are stand alone apps that run on a single system. With a stand alone application things are simpler and so is using multiple threads to do calculations.

    Your example is simply pushing the sequential problem off to the DBMS engine. It won't scale to a Facebook size. It won't scale to any real commercial size.

    I've been professionally writing Java for 13+ years and have never written a stand alone app for a single system. Everything has been web implementations for large scale commercial websites. I've never been paid to write code using things like AWT or Swing
    Richard Golebiowski
    Ranch Hand

    Joined: May 05, 2010
    Posts: 213

    I've been professionally writing Java for 13+ years and have never written a stand alone app for a single system. Everything has been web implementations for large scale commercial websites. I've never been paid to write code using things like AWT or Swing


    I've been programming for 23+ years and have found that there are other things on the web besides large scale commercial sites.
    Pat Farrell
    Rancher

    Joined: Aug 11, 2007
    Posts: 4650
        

    Richard Golebiowski wrote: there are other things on the web besides large scale commercial sites.

    You are going way OT here. "On the web" usually does not mean "stand alone app".
    Richard Golebiowski
    Ranch Hand

    Joined: May 05, 2010
    Posts: 213

    My point is that not everyone is going to be programming for Facebook or Google or LinkedIn. Some people are developing smaller applications that can possibly benefit from multi-threading, and in these instances doing something that is multi-threaded probably wouldn't be hard to do.
    Pat Farrell
    Rancher

    Joined: Aug 11, 2007
    Posts: 4650
        

    Richard Golebiowski wrote:smaller applications that can possibly benefit from multi-threading and that in these instances doing something that is multi-threaded probably wouldn't be hard to do.


    All modern web-based Java applications use a Servlet container to provide multi-threading. It's easy to do. It takes only a tiny bit of effort to make all your servlets/beans thread-safe.
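The "tiny bit of effort" usually comes down to keeping per-request data out of instance fields. The sketch below uses a plain class as a stand-in for a servlet so that it runs without a container; `GreetingHandler` and `handle` are hypothetical names, and no servlet API is involved.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Stand-in for a servlet: one shared instance, many request threads.
final class GreetingHandler {

    // No mutable instance fields: everything a request needs lives in
    // parameters and local variables, which are private to each thread.
    String handle(String user, int visits) {
        StringBuilder body = new StringBuilder(); // local, so never shared
        body.append("Hello, ").append(user);
        body.append(" (visit ").append(visits).append(')');
        return body.toString();
    }

    public static void main(String[] args) {
        GreetingHandler shared = new GreetingHandler();
        // Simulate many concurrent requests against the single instance.
        List<String> replies = IntStream.range(0, 1000).parallel()
                .mapToObj(i -> shared.handle("user" + i, i))
                .collect(Collectors.toList());
        System.out.println(replies.size() + " replies built");
    }
}
```

Had `body` been an instance field, concurrent requests could interleave their output, which is the usual way a servlet ends up not being thread-safe.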

    However, this thread is about "Parallel programming", which often means things more complex than just letting JBoss/Glassfish handle thread dispatching and talking to the DBMS. I don't think you (Richard) and I are helping anyone by continuing this OT discussion.
    Richard Golebiowski
    Ranch Hand

    Joined: May 05, 2010
    Posts: 213

    I agree.
    Sergey Babkin
    author
    Ranch Hand

    Joined: Apr 05, 2010
    Posts: 50
    Pat Farrell wrote:
    But the reality is that solid parallel programming is hard to (read/write/debug) no matter how you do it. It's just a complex topic, and as I'm sure your book points out, sloppy code that works fine single threaded or at modest levels of parallelism breaks in really ugly and subtle ways in a massively parallel world. Which is why Dijkstra, Hoare, et al. spent so much of the 60s building a solid set of tools and theory.


    Dijkstra, Hoare et al. didn't have massive parallelism; it wasn't affordable then :-)


    I'm not too worried about the next year or two. There are six-core CPUs for sale for $300 now, and I don't expect we will exceed 16 for a while, at least a year. But it's inevitable that the numbers will get huge. And I don't think any sequential programming language is going to work.


    There have been massively parallel systems invented before. The "transputers" are a good example, even with the special language Occam. However, as I understand it, programming them is far from easy.


    Scala is pretty foreign looking. They didn't fall into the same trap that the C folks did going to C++, where they tried too hard to go from traditional systems language to OO, and ended up with something that was not very OO and not very traditional.


    The beauty of C++ is actually that you get both, the OO and the traditional look and high efficiency.


    The strength of Scala is to remove the sequential steps, you make the algorithm and apply it to a wad of data. The whole functional space reminds me a lot of Iverson's APL (A programming language) of the late 60s/early 70s. That was simply too much of a leap at the time, but with it, you could say "take this matrix M and invert it, take the eigenvectors and do this with them", it works at a much higher level of abstraction, just as well written Java handles internationalization in ways that a C programmer with strcpy simply can never do.


    Interesting, APL is also what comes to my mind when I look at the functional programming. Only I think that APL is pretty much the most nightmarish language possible.


    As a transition device, any Java library works with Scala, but I see this as temporary and expedient, not fundamental. What is fundamental is to build on the JVM and its performance, and then bring folks to think at a higher level. When I was a young programmer, way before the whole WIMP (windows, icons, menus, pointer) idea was invented, we did command line services and thought that was very user friendly.

    The whole WIMP world was considered too slow and inefficient to use for real work well into the 1990s for PCs. While Windows 3.0 allowed folks to think about windows, corporate America was using DOS and WordPerfect well past Windows 95.


    I still think that the MS-DOS programs (not the command-line ones but the full-screen ones, and again not WordPerfect nor MS Word, but say Norton Commander, Norton Utilities, Multi-Edit) were the pinnacle of user-friendliness. The Windows stuff has been a huge step back, making the interface less modal, more difficult to use, and impossible to use without a mouse.
    Pat Farrell
    Rancher

    Joined: Aug 11, 2007
    Posts: 4650
        

    Sergey Babkin wrote:Interesting, APL is also what comes to my mind when I look at the functional programming. Only I think that APL is pretty much the most nightmarish language possible.


    I agree that APL was a disaster. They went for too much terseness, and it became Kleenex code, write once, throw away. But it came from a time before "GO TO considered harmful" and a lot has improved since then.

    The transition will be interesting. I (mostly) loved writing Smalltalk, but it was too different from the tools of the 80s. Java stayed very close to C to make the transition easier, and it helped make it popular, but I have always wished it was more OO and less C-like.

    I find that I am a poor judge of languages at first exposure. I've used probably 30 over the years, some just slightly different, say Fortran IV to Fortran 77, and others radically so, Cobol to Perl. I have to write a non-trivial piece of code, which is always painful, before I get with the new style.
     