JavaRanch » Java Forums » Java » Servlets
Balancing good design versus performance & memory

Alok Pota
Ranch Hand

Joined: Mar 07, 2001
Posts: 185
A good MVC design mandates that the M (bean), V (JSP) and C (servlet) be as separate as possible. A rule our team lead has asked us to live by is to have no SQL/DB-related objects in the V; the V is strictly for the V (whatever that means). This strict partitioning, although elegant and very *OO*, comes at a cost: we need to transfer the data from the M to the V, and to do that we need an object (HashMap, List, TernaryTree, whatever).
What should the scope of this object be?
A request scope would mean creating and *nulling* this object every time I render the V. That means creating many short-lived objects, something folks in the performance forum would possibly chide you about.
A session scope makes the objects *not so short-lived*. I will be holding the objects in memory for as long as the user is logged in, whether that's 7 seconds or 7 hours. What if all my concurrent users love my application and stay logged on for hours on end? Aren't I holding all that memory? What if more fans log in?

A page scope (I think) translates into defining instance variables on the servlet. That is fine if your data is read-only, but you will have to synchronize any changes you make to that object. Once again we are holding up memory (I guess).
An application scope is too broad, and I am not sure how it would pan out in a web cluster. Wouldn't the application/servlet contexts be different on the different web servers?
Of the different scopes, session scope is, I think, the only one where the container invokes callback methods on the bound object (valueBound/valueUnbound).
I would like input from people who have faced this dilemma before.

Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Originally posted by Alok Pota:
A good MVC design mandates that the M (bean), V (JSP) and C (servlet) be as separate as possible. A rule our team lead has asked us to live by is to have no SQL/DB-related objects in the V. [...] We need to transfer the data from the M to the V, and to do that we need an object (HashMap, List, TernaryTree, whatever).

Not necessarily. For example, the Model could expose data in the form of Iterators which are no more than wrappers around JDBC ResultSets.
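As a minimal sketch of that idea: a read-ahead iterator can wrap any row source, so the JSP only ever sees an Iterator. All class and method names below are invented for illustration; the static over() adapter shows how a standard JDBC ResultSet would plug in.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// The model hands the JSP one of these instead of a ResultSet,
// so the view never touches JDBC directly.
class ResultSetRows implements Iterator<Object[]> {
    /** Anything that can produce the next row, or null when exhausted. */
    public interface RowSource { Object[] nextRow() throws Exception; }

    private final RowSource source;
    private Object[] next;   // one row of read-ahead

    public ResultSetRows(RowSource source) {
        this.source = source;
        advance();
    }

    private void advance() {
        try { next = source.nextRow(); }
        catch (Exception e) { throw new RuntimeException(e); }
    }

    public boolean hasNext() { return next != null; }

    public Object[] next() {
        if (next == null) throw new NoSuchElementException();
        Object[] row = next;
        advance();
        return row;
    }

    public void remove() { throw new UnsupportedOperationException(); }

    /** Adapts a JDBC ResultSet; each call to nextRow() pulls one row. */
    public static ResultSetRows over(final ResultSet rs) {
        return new ResultSetRows(new RowSource() {
            public Object[] nextRow() throws SQLException {
                if (!rs.next()) return null;
                int cols = rs.getMetaData().getColumnCount();
                Object[] row = new Object[cols];
                for (int i = 0; i < cols; i++) row[i] = rs.getObject(i + 1);
                return row;
            }
        });
    }
}
```

The wrapper holds no data of its own; the rows live only as long as the iteration, which is exactly the "no giant intermediate object" point.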
What should the scope of this object be?

The actual scoping of the Model's data is, I think, a model implementation detail which should be abstracted away from the JSP.
Session-scoped objects are very suitable for giving access to the model. There are roughly three ways these objects can implement the data access:

  • Grab a database connection from the pool and access the database when the Model state is accessed. The data is exposed to the JSP as an iterator or a collection. The lifetime of the objects you create would effectively be that of the page/service call. This is most useful for data which is hard to cache or for which you expect only low volume.
  • Keep data in the Model object itself. This is most useful for user-specific information you need rapid access to (the canonical example is login information or a shopping cart). The data would be session-scoped.
  • Delegate the data access to application-scoped objects. The model would act as a wrapper around these. This is most useful for cached global data that can be shared among all users.

A typical model would implement a mix of all three access types. The JSP should never know or care how parts of the model are actually implemented.
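A sketch of such a mixed model, assuming a shopping-cart-style application; every class and method name here is hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Session-scoped facade mixing the three access styles described above.
class UserModel {
    // (2) user-specific state held directly in the session-scoped model
    private final List<String> cartItems = new ArrayList<String>();

    // (3) shared, application-scoped cache this model merely delegates to
    private final Map<String, String> productCache;

    UserModel(Map<String, String> sharedProductCache) {
        this.productCache = sharedProductCache;
    }

    void addToCart(String sku) { cartItems.add(sku); }
    List<String> getCartItems() { return cartItems; }

    String getProductName(String sku) { return productCache.get(sku); }

    // (1) per-request access: a real model would run a query here and
    // wrap the ResultSet; a stub stands in for the database call.
    Iterator<String> getRecentOrders() {
        return queryRecentOrders().iterator();
    }
    protected List<String> queryRecentOrders() {
        return Arrays.asList("order-1", "order-2");   // placeholder data
    }
}
```

The JSP calls getCartItems(), getProductName() and getRecentOrders() without knowing which of the three strategies backs each one.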
    A request scope would mean creating and *nulling* this object every time I render the V. That means creating many short-lived objects, something folks in the performance forum would possibly chide you about.

    With the generational garbage collection in today's JVMs it is no longer as expensive as it used to be. Still, it is something you'd want to avoid if possible. From the above it is hopefully clear that a session-scoped model is free to (effectively) implement request- or page-scoped data access without creating overly many short-lived objects (just a ResultSet, an iterator, and probably a handful of Strings).
    A session scope makes the objects *not so short-lived*. I will be holding the objects in memory for as long as the user is logged in.

    True. Not only that, if you use a clustered container with session fail-over, all your session-bound objects will be replicated.
    Be aware though that it's not the objects which take up the space, it's the data in those objects. You can implement a huge and complicated session-scoped model, it won't affect scalability as long as it contains no more data than necessary.
    A page scope (I think) translates into defining instance variables on the servlet.

    No -- page scoped objects are even shorter-lived than request scoped objects. They are released as soon as you leave the page.
    An application scope is too broad, and I am not sure how it would pan out in a web cluster. Wouldn't the application/servlet contexts be different on the different web servers?

    Yes, in a cluster you have one application scope per JVM, it would not be truly global. Still application scope is excellent for caching etc. If you run into consistency issues, you either cannot cache or you should look into a notification mechanism or a good EJB container or similar.
    Of the different scopes, session scope is, I think, the only one where the container invokes callback methods on the bound object (valueBound/valueUnbound).

    The session scope is the only one where you really need these methods, I think...
    Hope this helped,
    - Peter
Alok Pota
Ranch Hand

Joined: Mar 07, 2001
Posts: 185
Thanks Peter.
The particular scenario I am talking about is an HTML tree. When I click a node I signal the tree to expand/collapse, and the entire tree is redisplayed with all the expanded/collapsed nodes. The nature of the problem is such that the number of nodes in my tree is not trivial (~200 nodes per level, 3 levels deep), so putting the tree model in the request scope would not be optimal. (Any opinions?) I have come up with a compromise: I keep the data in the session scope and use a timer to clear the session. This is ugly, but at least I have control over the *nulling* process. I wonder if Runtime.getRuntime().gc() even gets called when a request-scoped bean goes out of scope!
Maybe I am too worried about optimization at this stage, when I should be concerned about design. But I can see this thing becoming a performance bottleneck once it goes to production.

Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Originally posted by Alok Pota:
The particular scenario I am talking about is an HTML tree. When I click a node I signal the tree to expand/collapse and the entire tree is redisplayed with all the expanded/collapsed nodes.

The tree is global data? If yes:
There are actually a number of components to your problem.

  • The tree model encapsulating the tree data. Assuming this is shared, this is best stored in an application scoped object (Model A).
  • The tree state encapsulating the state (expanded/collapsed nodes) of the tree for a particular user. This should be encapsulated in a session bound model (Model B).
  • The tree view which is the actual display (JSP or whatever) (View).

    Elsewhere, you state that you need intermediate objects to communicate from the Model to the View. I disagree. One possible way to implement these three components without recourse to giant intermediate objects is this.
    Your session-scoped Model B would have a factory method for an iterator. This iterator would walk through the visible nodes of the tree (Model B), giving access to the state of and data behind each node (Model A). For example:
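(The code block itself was lost from the archived post; the sketch below is a reconstruction from the surrounding description, and every name other than hasNextPeer and TreeNode is a guess.)

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// What each visible node exposes to the view.
interface TreeNode {
    String getName();       // node data, delegated to Model A
    boolean isExpanded();   // node state, read from Model B
}

// Walks only the nodes Model B currently considers visible.
class VisibleNodeIterator implements Iterator<TreeNode> {
    private final List<TreeNode> visible;  // pre-filtered by Model B
    private int pos = 0;

    VisibleNodeIterator(List<TreeNode> visible) { this.visible = visible; }

    public boolean hasNext() { return pos < visible.size(); }

    /** True while peer nodes remain, so the view knows how long to
        keep drawing the vertical connector line. (In a real tree this
        would compare depths rather than just check for a next node.) */
    public boolean hasNextPeer() { return pos < visible.size(); }

    public TreeNode next() {
        if (!hasNext()) throw new NoSuchElementException();
        return visible.get(pos++);
    }

    public void remove() { throw new UnsupportedOperationException(); }
}
```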

    The hasNextPeer method will tell you whether there are more peer nodes to come, so you will know how long to continue drawing a vertical line. The TreeNode objects returned should give access to the node type, its state, and whatever data you need. (I'm thinking this up as I go, so there may be gotchas and/or more elegant solutions.)
    Let me emphasise that the TreeNode objects can be very lightweight. Methods that retrieve state could directly access the state (Model B). Methods that retrieve data will directly delegate to the tree data Model (A).
    The nature of the problem is such that the number of nodes in my tree is not trivial (~200 nodes per level, 3 levels deep), so putting the tree model in the request scope would not be optimal. (Any opinions?)

    Not optimal and not necessary.
    I have come up with a compromise: I keep the data in the session scope and use a timer to clear the session. This is ugly, but at least I have control over the *nulling* process. I wonder if Runtime.getRuntime().gc() even gets called when a request-scoped bean goes out of scope!

    That is ugly indeed. If the bean goes out of scope gc() probably does not get called. Nor does it need to get called. Nor is there any guarantee that garbage collection would run even if it were called. Is it important?
    Maybe I am too worried about optimization at this stage when I should be concerned about design. But I can see this thing being a performance bottleneck once it goes production.

    At this stage it is extremely important to come up with an architecture that is well factored and loosely coupled. Not because you should attack real or imagined performance issues at this point, but because you want to be able to attack such issues at a later stage with minimum impact. That matters not just for performance, but also for functionality changes, maintenance, and code re-use.
    HTH
    - Peter
Alok Pota
Ranch Hand

Joined: Mar 07, 2001
Posts: 185
Interestingly, the model you described is exactly what I have. The tree data (TreeNodes and parent-child relations) is global, so I store it in the application context (Model A); the list of visible nodes, and which nodes are open/closed, is stored in the session (Model B); and Model B exposes iterators for the JSP (View) to traverse.
The problem is that the data in Model A is huge, and storing it in any global context (JSP application scope/ServletContext/JNDI) hangs on to a chunk of memory for the life of the server. It does offer one big advantage, though: I don't have to replicate that data per user. All users share the Model A data, and the GC gets a break too, since the shared Model A data is not created per user.
Model B carries lightweight data which gets created per user. Model B data *hopefully* gets GC'd when the user session is invalidated, through an attempt to null the Model B data in valueUnbound(HttpSessionBindingEvent event).
It seems from the above that Model A data never gets a chance to be cleared, and any attempt to clear it would have to be synchronized across the views of several users.
I tried another approach: replicate the Model A data per user (in the session) and then use the java.util.Timer utility to clear the Model A & B data at intervals, in hopes that the GC will take some action. From what I can tell, Runtime.getRuntime().gc() is a request and not a command, so there is no guarantee that all that replicated Model A & Model B data gets GC'd. What's worse, users who close their browsers and log back in multiple times end up with replicated copies that are not cleared until the server kicks off their sessions (hence the Timer thread that clears session data at intervals shorter than the *session timeout*).
Any suggestions would be of great help, because with the HotSpot Client 1.3 VM on Windows 2000 with the Resin app server I am able to sail smoothly, but on Linux with the same JVM and app server I get the dreaded *OutOfMemoryError*. At this point I am not sure if it's my design or Linux's version of HotSpot: the same code on two different OSes, one works fine, the other gives an OutOfMemoryError.
Alok Pota
Ranch Hand

Joined: Mar 07, 2001
Posts: 185
Sun's HotSpot FAQ seems to suggest that the more short-lived objects you have, the better it is at optimizing. Does that mean the old fear of creating too many short-lived objects is uncalled for if you are using HotSpot? In light of my tree, both Model A & B are relatively long-lived objects. How do I make HotSpot of any use to me? The approach I discussed earlier, of having a Timer method and purposely making both Model A & B objects short-lived by *nulling* them at shorter intervals, would not hold up memory and would also allow HotSpot to do its stuff.
Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Originally posted by Alok Pota:
Interestingly, the model you described is exactly what I have

Not sure whether that should comfort you or not
The problem is that the data in Model A is huge, and storing it in any global context (JSP application scope/ServletContext/JNDI) hangs on to a chunk of memory for the life of the server.

It's only one chunk, though. If you session-scope this, you would be hanging on to (a part of) it for every single client! Even though you are quickly expiring it, it will severely compromise the scalability of your application.
I assume the tree is built from information in a database. If the application-scope data model is really too large, what you can do is turn it into a cache. For instance, if the data in the nodes is bulky, you could store the entire tree structure but keep the data in an LRU cache. If model access has been abstracted well, this should be completely transparent to the session-bound state models.
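Such an LRU cache falls out almost for free from java.util.LinkedHashMap's access-order mode (added in JDK 1.4). A sketch, where NodeDataCache and loadNodeData are made-up names and loadNodeData stands in for the real database fetch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Application-scoped cache for bulky node data; the tree structure
// itself stays resident while node data ages out by least recent use.
class NodeDataCache {
    private final Map<String, String> cache;

    NodeDataCache(final int maxEntries) {
        // true = access order, so the least recently used entry ages out
        cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized String getNodeData(String nodeId) {
        String data = cache.get(nodeId);
        if (data == null) {
            data = loadNodeData(nodeId);   // hit the database on a miss
            cache.put(nodeId, data);
        }
        return data;
    }

    synchronized int size() { return cache.size(); }

    protected String loadNodeData(String nodeId) {
        return "data-for-" + nodeId;       // placeholder for a real query
    }
}
```

The synchronized methods matter here because an application-scoped object is shared by every concurrent user.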
Model B carries lightweight data which gets created per user. Model B data *hopefully* gets GC'd when the user session is invalidated, through an attempt to null the Model B data in valueUnbound(HttpSessionBindingEvent event).

Why is this so important? You don't know when garbage collection will run, but it will certainly run before you get an out of memory error. The only thing to watch out for is that you don't hang on to references you no longer need.
From what I can tell, Runtime.getRuntime().gc() is a request and not a command, so there is no guarantee that all that replicated Model A & Model B data gets GC'd.

Correct. That should generally not be a problem though.
What's worse, users who close their browsers and log back in multiple times end up with replicated copies that are not cleared until the server kicks off their sessions

Another good reason why you shouldn't replicate what is essentially common data...
[...] *OutOfMemoryError* [...]

Have you tried playing with the memory settings (-Xmn64M -Xmx256M or whatever) and perhaps garbage collection settings?
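Concretely, that tuning might look like the following; the flag names are the -X options of Sun's HotSpot 1.3 VM, and the launch class is a placeholder for however Resin is actually started on your machine:

```shell
# List the non-standard options your particular JVM supports first:
java -X

# Then fix the initial heap and raise the maximum heap;
# "MyServerMain" is a stand-in for your real server launch command.
java -Xms64m -Xmx256m MyServerMain
```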
Does that mean the old fear of creating too many short lived objects is uncalled for if you are using HotSpot?

HotSpot and other modern JVMs with generational garbage collection have made temporary objects a lot cheaper. There is no reason to be paranoid about temporary objects, but you still have every reason to avoid creating overly many of them.
- Peter
 