Separating JMS Producer/Consumer on JBoss 4.0.2

 
Chris Case
Greenhorn
Posts: 9
Hello,

JMS is an area I'm relatively new to at the moment. I help maintain an application which uses JMS messages to queue up operations for background processing, but it has always "just worked" and I haven't had to change much about it, so I never got the opportunity to learn much beyond the fact that it's a nice way of setting up queues to run processes.

Here lately, I'm starting to suspect that some of the larger processes we have running in our bean are crashing the server. Basically, the web layer of the server becomes unresponsive 1-3 times per week. The frequency of these problems increased after I added several large Excel report-generating classes to our bean, which can kick off some fairly CPU/memory-intensive processing. I don't know this for sure, because the server often leaves no error logs around the time of the crash; it just becomes non-responsive, yet the process is still running. I will often see CPU in use, which I have traced to JMS-related processes still running in the background. Maybe they're unrelated, but I still want to separate the bean processes onto their own application server.

What I'd like to do is start by separating the producer and consumer: one JBoss application server node producing messages and a completely different one consuming them. Once I figure this out and get it working in production, I'd like to set up clustering, so I can have one or more producers and a cluster of consumers, to help with redundancy.

I have been able to spin up 2 nodes and move our bean jar file to the other node, but I can't get it to consume the messages. The only way it will consume them is if I restart the server; somehow it's allowed to consume when the server first starts up, but not afterwards.

I am starting to find that, even with the documentation, this isn't as simple as I had hoped. Compounding the problem is the fact that I am not familiar enough with the technology to know what to look for, or even where to start. That's why I wanted to get some feedback from the smart folks at this forum. Any words of wisdom for a greenhorn who wants to set something like this up?

I'll provide the info I think is necessary to understand what's going on; please let me know if anything else is needed:

Here's how we're sending the message.
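In essence the producer side does something like this (a simplified sketch rather than our exact code; the JNDI names and the payload are placeholders):

import java.io.Serializable;
import javax.jms.ObjectMessage;
import javax.jms.Queue;
import javax.jms.QueueConnection;
import javax.jms.QueueConnectionFactory;
import javax.jms.QueueSender;
import javax.jms.QueueSession;
import javax.jms.Session;
import javax.naming.InitialContext;

public class ReportJobProducer {
    public void queueJob(Serializable jobInfo) throws Exception {
        InitialContext ctx = new InitialContext();
        // In-VM connection factory and queue; the names are placeholders
        QueueConnectionFactory cf = (QueueConnectionFactory) ctx.lookup("java:/JmsXA");
        Queue queue = (Queue) ctx.lookup("queue/reportJobQueue");

        QueueConnection conn = cf.createQueueConnection();
        try {
            QueueSession session = conn.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            QueueSender sender = session.createSender(queue);
            ObjectMessage msg = session.createObjectMessage(jobInfo);
            sender.send(msg);
        } finally {
            conn.close();
        }
    }
}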



Here is our jms-ds.xml file
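For anyone not familiar with JBoss 4, the stock jms-ds.xml defines the in-VM JMS connection factory roughly like this (a reference sketch of the default, not our exact file):

<connection-factories>
  <!-- The JMS XA resource adapter; beans look this up as java:/JmsXA -->
  <tx-connection-factory>
    <jndi-name>JmsXA</jndi-name>
    <xa-transaction/>
    <rar-name>jms-ra.rar</rar-name>
    <connection-definition>org.jboss.resource.adapter.jms.JmsConnectionFactory</connection-definition>
  </tx-connection-factory>
</connection-factories>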
 
massimiliano cattaneo
Greenhorn
Posts: 24
Hello. Consider that the MDB itself could be what crashes the application server. Here is why.
The MDB's onMessage() method runs with a transaction attribute. There are two different transaction attributes that can be used on it: REQUIRED and NOT_SUPPORTED. If none is specified, the default is REQUIRED.
The problem is that if the method throws a RuntimeException, the associated transaction is rolled back and the message is put back on the queue (if a queue is the destination of the message). Once the message is back on the queue, the MDB receives it again (maybe another instance of the MDB in the pool), and if the RuntimeException is thrown again, everything starts over from the beginning.
This could crash the application server.
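In code, the situation looks roughly like this (a sketch only; the queue name and the helper method are placeholders):

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.jms.Message;
import javax.jms.MessageListener;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/reportJobQueue")
})
public class ReportJobMDB implements MessageListener {

    // REQUIRED is the default even without the annotation.
    @TransactionAttribute(TransactionAttributeType.REQUIRED)
    public void onMessage(Message message) {
        // If this throws a RuntimeException, the container rolls back the
        // transaction, the message goes back on the queue, and another MDB
        // instance from the pool receives it again -- potentially forever.
        processReport(message); // placeholder for the real work
    }

    private void processReport(Message message) {
        // ...
    }
}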

I hope this helps you.
 
Valery Lezhebokov
Ranch Hand
Posts: 39

massimiliano cattaneo wrote:
The problem is that if the method throws a RuntimeException, the associated transaction is rolled back and the message is put back on the queue (if a queue is the destination of the message).



According to the spec, all RuntimeExceptions (system exceptions) should be logged, so I think it would be visible in the logs if that were the case.

Chris Case wrote:
I don't know this for sure, because the server often leaves no error logs around the time of the crash; it just becomes non-responsive, yet the process is still running. I will often see CPU in use, which I have traced to JMS-related processes still running in the background. Maybe they're unrelated, but I still want to separate the bean processes onto their own application server.



I don't know the details of the architecture, but I guess that the majority of the work is done on the JMS consumer side. In that case, splitting the job among several consumers does make sense, but before doing that you really need to be sure that it's necessary. Having everything in one JVM is almost always much simpler (especially from a maintenance point of view).
 
Chris Case
Greenhorn
Posts: 9
Not quite sure why, but most of the crashes we've been experiencing leave no error log whatsoever. The only clue I have to go on is the last message logged. I think there are at least 2 different causes. I have a possible solution for the first, which I believe has to do with an occasional Struts mapping.findForward infinite redirect loop. The second probably has to do with the large Excel reports we're generating in the bean, perhaps using up too many resources.

I suppose I should start by adding more log4j debug entries in the areas I suspect are problematic, and perhaps even adding a log file just for that area of the system. As for why the entire web server would be brought down, it seems like it must be a bug in JBoss AS 4.0.2, since I'd imagine the server should prevent such situations. Maybe it would be worth upgrading to at least JBoss AS 4.0.3, since we have other installations running on that successfully already. I'd like to get to the latest and greatest, but that's going to require some figuring out, as I hear it isn't exactly a trivial task.

Regarding our JMS consumers/producers and the maintenance requirements of running multiple application servers: the consumer works a lot harder than the producer. That's why we're using JMS, so we can queue up large jobs such as processing hundreds of PDFs (possibly all at once at times) and generating large Excel reports. Currently it's a bit of a maintenance hassle because it crashes once or more per week, interrupting normal use of the application, so even running more application servers wouldn't be bad in comparison. If it keeps the web server from crashing and interrupting users, it would be worth the extra work until we can figure out exactly what is happening.

Does anyone have suggestions on how to figure out exactly what is happening, aside from extra log entries? This isn't something I can really figure out in the debugger, because it works just fine 99% of the time.

 
Baski Reddy
Greenhorn
Posts: 7
What version of the JDK is in use? Maybe the standard JDK monitoring tools like jstack and jconsole can help. You need two things to troubleshoot further:
- Thread dumps
- GC / heap dumps
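For example, with the standard JDK tools on the path (the PID is a placeholder):

jps -l                                        # find the JBoss server PID
jstack <pid> > threads.txt                    # thread dump
jmap -dump:format=b,file=heap.hprof <pid>     # heap dump (briefly pauses the JVM)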

 
Chris Case
Greenhorn
Posts: 9

Baski Reddy wrote: What version of the JDK is in use? Maybe the standard JDK monitoring tools like jstack and jconsole can help. You need two things to troubleshoot further:
- Thread dumps
- GC / heap dumps



We're currently using OpenJDK. Here are the specifics:



Thanks for the tip. I found a way to take a thread dump using the built-in utility "twiddle".
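The invocation is something like this, run from the JBoss bin directory (the output is HTML, which is why I'm viewing it in links):

./twiddle.sh invoke "jboss.system:type=ServerInfo" listThreadDump > threaddump.html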



I have attached a screenshot of what the thread dump looks like in links (the text-mode browser) during normal operation.

I'm going to take one of these snapshots during the next crash. Also, I'll be sure that garbage collection logging is taking place.

I've got it running with a command line arg similar to:
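For illustration, typical GC-logging flags on a JVM of this vintage look like this (the log path is a placeholder):

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/jboss/gc.log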



What I may do, as a general rule, is have a script which runs when a crash is reported. I could have it take the thread dump, tail certain log files, write various other info, zip it up and email it to me. It will be interesting to see what the thread dump shows next time I have to do a restart.
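A rough sketch of that crash-report script (all paths are placeholders, and the twiddle call assumes the stock ServerInfo MBean):

#!/bin/sh
# Collect diagnostics when a crash is reported.
JBOSS_BIN=/opt/jboss/bin
JBOSS_LOG=/opt/jboss/server/default/log/server.log
OUT=/tmp/crash-report-$(date +%Y%m%d-%H%M%S)
mkdir -p "$OUT"

"$JBOSS_BIN/twiddle.sh" invoke "jboss.system:type=ServerInfo" listThreadDump > "$OUT/threads.html"
tail -n 2000 "$JBOSS_LOG" > "$OUT/server-log-tail.txt"
top -b -n 1 > "$OUT/top.txt"
free -m > "$OUT/memory.txt"

tar czf "$OUT.tar.gz" -C /tmp "$(basename "$OUT")"
# then e-mail or copy the tarball wherever is convenient, e.g.:
# scp "$OUT.tar.gz" backup-host:/var/crash-reports/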
[Attachment: Picture-6.png — what the thread dump looks like in links]
 
Chris Case
Greenhorn
Posts: 9
I did a thread dump during an instance where the system was "locked up" and I think this more or less provides the information I need.

I see about 500 different threads, most of them in thread state BLOCKED, blocked in one way or another on the getHibernateSession() call we use to open a Hibernate session to read from the database.

Here is one of the stack traces from the dump:



My first instinct is to go through some of these Actions and look for situations where we can avoid having to open and use a Hibernate session. I'm sure there are situations where these become nested: say you have a Hibernate session open for general use, then you call a function which opens another Hibernate session, and so on. I can already see, after reviewing the code, that there are places where we'd be better off loading this information from a session variable instead of loading it from the database.
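To illustrate the kind of nesting I mean (a sketch; the class and method names are placeholders, and getHibernateSession() stands in for our real helper):

import org.hibernate.Session;

public class SomeAction {
    public void execute() {
        Session outer = getHibernateSession();   // checks out connection #1
        try {
            loadUserPreferences();                // opens a second session internally,
                                                  // checking out connection #2
            // With a small c3p0 pool, enough nested calls like this across
            // concurrent requests can leave every thread BLOCKED waiting on a
            // connection that is never returned.
        } finally {
            outer.close();
        }
    }

    private void loadUserPreferences() {
        Session inner = getHibernateSession();    // the nested session
        try {
            // ... read preferences from the database ...
        } finally {
            inner.close();
        }
    }

    private Session getHibernateSession() {
        // stands in for our real helper, which opens a Session from the SessionFactory
        throw new UnsupportedOperationException("placeholder");
    }
}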

EDIT: When I look towards the beginning of the "thread group: main", where these stacks first start to appear, I see a WAITING thread (related to a Hibernate session) with an awaitAcquire method near the top of its stack, followed by a deluge of BLOCKED threads. Not sure yet if this is significant, but it is worth noting.

 
Chris Case
Greenhorn
Posts: 9
It looks like the thread dump helped to find the root cause and ultimately solve the problem once and for all.

The problem appears to have been related to our connection pool max_size. The c3p0 connection pool configured for Hibernate (the ORM we're using) was too small: we had c3p0.max_size set to 20, which couldn't keep up with the amount of activity on that server. It has been changed to 150.
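For reference, the relevant c3p0 settings in hibernate.cfg.xml look roughly like this (max_size is the setting that changed; the other values here are illustrative):

<property name="hibernate.connection.provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
<property name="hibernate.c3p0.min_size">5</property>
<property name="hibernate.c3p0.max_size">150</property> <!-- was 20 -->
<property name="hibernate.c3p0.timeout">300</property>
<property name="hibernate.c3p0.max_statements">50</property>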

Without the thread dump, this was an elusive problem, as there were no clear indicators of what was freezing the server. However, once I saw in the thread dump that everything blocked was waiting on a Hibernate session, and then saw the WAITING thread's stack trace showing it was waiting for a connection, a few Google searches later I had an answer.

Here is a snippet of the stack trace that gave the vital clue (the full stack trace is in my previous message):



This is the hibernate.cfg.xml file I was referring to, in case anyone needs it for the context of what I'm talking about:



Anyhow, I very much appreciate the help you guys gave me. This issue has been bugging me for many months and I was wondering how, if ever, I was going to get past it. Like most things, the answer was very simple once I knew where to look. Thread dumps are sure to be of great use in the future.

I've already written a script which I'm going to use to monitor for blocked threads in the future. I'll include it here for feedback, or in case anyone else wants it for their own use. I run it as a cron job every minute.
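The approach is roughly this (a sketch; the threshold, paths, and alert address are placeholders):

#!/bin/sh
# Cron job: count BLOCKED threads in a twiddle thread dump and alert if
# the count is above a threshold.
JBOSS_BIN=/opt/jboss/bin
THRESHOLD=50
DUMP=/tmp/threaddump-$$.html

"$JBOSS_BIN/twiddle.sh" invoke "jboss.system:type=ServerInfo" listThreadDump > "$DUMP"
BLOCKED=$(grep -c BLOCKED "$DUMP")

if [ "$BLOCKED" -gt "$THRESHOLD" ]; then
    echo "$BLOCKED blocked threads detected; dump kept at $DUMP" \
        | mail -s "JBoss blocked-thread alert" admin@example.com
else
    rm -f "$DUMP"
fi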

 