
Optimizing Java: Question about performance issues when using a lot of memory

 
Ranch Hand
Posts: 106
I have an application that creates triangular meshes consisting of multiple millions of data points, with a correspondingly large number of small Java objects representing vertices and edges.  When the vertex count gets very large and memory use swells, I notice a sharp performance drop-off at around 2.5 gigabytes of memory.  Do you have any suggestions for better managing memory to improve throughput?

Here's a link to my project page if you'd like to get more background: Tinfour on Github

As you can imagine, getting the thing to process so much data with any kind of responsiveness required a lot of attention to performance.  One thing I've got to say is that I am absolutely amazed at just how fast Java can run.  Anyway, I am really looking forward to reading your book because I think that what you have to say is very relevant to my project in particular and a lot of other projects in general...
 
author
Posts: 14
Hi Gary,

Looks like a very interesting project! What kind of performance drop off are you seeing? Is it solely around the throughput?

Given the nature of your application, it may well be that you are seeing the GC subsystem kicking in and stopping the world more often (and for longer) as heap usage increases. That being said, this is a complete guess. To find out what's going on, you may wish to use a tool such as VisualVM to get an overview, and I would also recommend turning on GC logging.

Once it comes to bigger heaps you also get options in terms of which garbage collector you use, which could be worth investigating if you find the cause of your problem to be GC related. There are also some nice tools out there that help with this, such as Censum - although not free there is a free trial.
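For reference, enabling GC logging and trying an alternative collector is just a matter of launch flags; a minimal sketch (`your-app.jar` is a placeholder for the actual application):

```shell
# Java 8: write a detailed GC log to gc.log
java -Xmx4g -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -jar your-app.jar

# Java 9+: the old flags are replaced by unified logging
java -Xmx4g -Xlog:gc*:file=gc.log -jar your-app.jar

# Explicitly selecting a collector, e.g. G1 on Java 8 (it is the default from 9 onward)
java -Xmx4g -XX:+UseG1GC -jar your-app.jar
```

The resulting log is what tools like Censum (or VisualVM with the appropriate plugin) consume.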

I am also amazed by the speed of Java at times. The thing that has always got me is that even the worst Java applications I've ever seen still run surprisingly well!

Jim
 
author
Posts: 58
Hi, slightly broken link in your post - I think it's https://github.com/gwlucastrig/Tinfour instead.

Interesting question - how big a heap size do you typically run with? Which old-generation collector are you using (if not explicitly set, it will be Parallel for 8 and G1 for 9+)?

Are you generating GC logs (see the book for details on which switches to turn on)? Have you analysed them yet?

Have you done a heap snapshot with VisualGC (or jmap -histo from the command line) to get a feel for which objects are eating up your heap?
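For the command-line route, the histogram looks something like this (`<pid>` stands in for the JVM's process id, which you can find with `jps`):

```shell
# Live objects grouped by class, largest consumers first
jmap -histo:live <pid> | head -n 25
```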

In your code, have you checked to make sure that your code is eagerly nulling references (which may increase the likelihood of objects getting swept up in young collections sooner) and looked for obvious opportunities to make the code more friendly to Escape Analysis?
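To illustrate both ideas in one small sketch (the class and method names here are hypothetical, not taken from the Tinfour code base):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: eager nulling of a scratch field, and an
// Escape-Analysis-friendly method whose temporaries never escape.
public class MeshBuilder {
    private List<long[]> scratchEdges = new ArrayList<>();

    public void addEdge(long[] edge) {
        scratchEdges.add(edge);
    }

    // Eagerly null the field once the scratch data has been consumed, so the
    // edge arrays become collectable without waiting for this object to die.
    public long finishAndRelease() {
        long total = 0;
        for (long[] e : scratchEdges) {
            total += e.length;
        }
        scratchEdges = null;
        return total;
    }

    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // The two Points never escape this method, so Escape Analysis may let the
    // JIT scalar-replace them and skip the heap allocations entirely.
    public static double distance(double x1, double y1, double x2, double y2) {
        Point a = new Point(x1, y1);
        Point b = new Point(x2, y2);
        double dx = b.x - a.x;
        double dy = b.y - a.y;
        return Math.sqrt(dx * dx + dy * dy);
    }
}
```

Whether the JIT actually applies scalar replacement depends on inlining and other heuristics, so it is worth verifying with the allocation profile rather than assuming.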

The "Friends of jClarity" mailing list that Martijn Verburg and I run is also a great place to come and discuss exactly these types of tuning issues...
 
Saloon Keeper
Posts: 24501
Actually, the first place I look when I hear "lots of memory" is the OS performance monitor. You may have over-committed virtual memory and kicked into page-thrashing mode.

If the JVM isn't running into performance issues at the OS/hardware level and your app is creating and discarding objects at a high rate, then garbage collection is certainly something to look at. Traditionally, GC issues were apparent in the jerkiness of program execution, since GC would stall processes. However, modern JVMs are designed to make GC more of an ongoing activity than a wait-and-crunch operation.

If the garbage collection is killing you, though, one common solution is to keep caches of pre-created objects and recycle them rather than simply consigning them to the heap.
 
Ben Evans
author
Posts: 58
Good catch - it's always a good idea to check for paging before looking at anything else.

If a modern server-class machine is swapping, then Job 1 is to make it stop - either by upgrading the machine or removing competing processes until it is no longer over-committed on physical memory.

A machine that is actually swapping (aka paging to disk) is, for all practical purposes, impossible to tune. Getting rid of the swapping really has to come first, as it can severely distort the underlying behaviour of the application and cause phantom performance effects.

In terms of GC, the default old-gen collector for 8 and below is Parallel, which is a fully STW collector and can't be run in an incremental mode. Java 9 and above default to G1, which is an incremental collector - but if the allocation rate rises too much and it can't keep up, it will still fall back to a STW collection.

Personally, I would typically advise against using an object pool unless the application has zero-allocation requirements (sometimes seen in finance). They are difficult to get right, are prone to subtle corruption bugs and are rather inflexible, unless the overall working set size of the required objects is known in advance.
 
Tim Holloway
Saloon Keeper
Posts: 24501
I once had to deal with an OS/2 system that was using sparse arrays via random access in an extremely over-committed way. By my estimate, this state-of-the-art (for its time) high-end system ran slower than a Commodore 64 in BASIC. Systems that use large quantities of memory need to pay close attention to their working set - the memory that is being accessed within a narrow time window. As long as you can keep the working-set pages within your available RAM, you're fine, but once you exceed that limit, everything goes off the rails. So you want to minimize the size of your working set and, where possible, keep that data on as few pages as possible. Unfortunately, Java is too abstract to allow optimizing by page.

Keeping a reserve object cache isn't that difficult - there are some very nice general-purpose pool classes in the Apache projects, and they in fact underlie such things as the Apache Database Connection Pool system, which was for a long time the default connection-pool mechanism for Tomcat. These classes are fairly simple to use, can be tuned on the fly, and even contain debugging facilities in case you get sloppy and leak pool objects. This sort of thing works best, however, when you want to manage a homogeneous group of objects such as database connections or worker threads. If you have heterogeneous objects, you're better off with multiple pools, and the more pools you have to ride herd on, the messier it gets.
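The core idea fits in a few lines. This is a deliberately minimal, single-threaded sketch of my own - the Apache pool classes add the thread safety, size caps, eviction, and leak detection you'd want in practice:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Minimal object-pool sketch: recycle idle objects instead of letting
// them become garbage. Not thread-safe; illustration only.
public class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;

    public SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuse an idle object if one exists; otherwise create a new one.
    public T borrow() {
        T obj = idle.pollFirst();
        return (obj != null) ? obj : factory.get();
    }

    // Hand an object back for later reuse rather than discarding it.
    public void release(T obj) {
        idle.addFirst(obj);
    }

    public int idleCount() {
        return idle.size();
    }
}
```

Note the classic pool hazard this sketch ignores: a caller that releases an object and keeps using it corrupts whoever borrows it next - exactly the "subtle corruption bugs" Ben mentioned above.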

Just in case there was confusion, there's a difference between a pool and a hash at the application level. A pool may employ a hash (or a linked list, queue, array, or other collection type that performs well for the pool manager), but the pool itself is simply a factory for obtaining objects to work with. You might then use a hash to track and locate the working objects (of which the working sets are subsets), but that's entirely different.
 