"Right Way" to kick off CPU-intensive task from servlet

Jeremy Dillworth
Greenhorn

Joined: Nov 25, 2003
Posts: 3
Please excuse me if this is not the right forum to post this question.

I have a servlet-based application that generates quite complex CSS/HTML based on the formatting of a spreadsheet. Our customer has requested that we offer PDF versions of this HTML.

I had looked into using OpenOffice/UNO to generate PDFs, but UNO is very complex and we'd have to generate a separate "tidy printing" version of the spreadsheet, which is undesirable.

After some searching we found a tool called webisor (http://www.davisor.com/webisor/) which does a good job turning HTML/CSS into PDF. Trouble is, it takes 2 minutes on a Pentium 4 @ 2.6GHz, which is not something we'd like to have running on our web server and certainly not within Tomcat.

I'm planning on queueing up PDF jobs from within Tomcat and then using a standalone app to do the heavy lifting.

I'm trying to figure out what the "Right Way" is to put a PDF job on the queue and then get the PDF back to Tomcat when it's generated. I figure the UI will probably be a little pop-up window that refreshes every 30 seconds or so saying "please wait your PDF will be ready soon", and then downloads the PDF when it's ready.

I had thought the easiest way to do it would be a shared filesystem. Within Tomcat, write out a file with all the data. Then the standalone app would pick up the file, produce the HTML, convert to PDF, and put the PDF where Tomcat could find it.

This seems pretty inelegant. Also, I have to write the files with some bogus name so that a process watching for a file doesn't see it before it's completely written.
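
Editor's sketch of the "bogus name" approach described above: write the job under a temporary name, then rename it once it is complete. This is only an illustration under stated assumptions. The class name JobDropBox, the ".tmp" and ".job" suffixes, and the idea that the watcher only looks at "*.job" files are all hypothetical. It relies on java.nio.file.Files.move with ATOMIC_MOVE, which throws an exception if the file system cannot rename atomically.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: the class name and file suffixes are illustrative assumptions.
class JobDropBox {

    /**
     * Writes the job data under a temporary name, then renames it.
     * A watcher that only looks at "*.job" files never sees a half-written file.
     */
    static Path submit(Path queueDir, String jobId, String payload) throws IOException {
        Path tmp = queueDir.resolve(jobId + ".tmp");
        Path ready = queueDir.resolve(jobId + ".job");
        Files.write(tmp, payload.getBytes(StandardCharsets.UTF_8));
        // Both names are in the same directory, so the rename stays on one file system.
        // ATOMIC_MOVE throws AtomicMoveNotSupportedException if it cannot be done atomically.
        return Files.move(tmp, ready, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("pdfjobs");
        Path ready = submit(dir, "job-1", "<html><body>report</body></html>");
        System.out.println("queued " + ready);
    }
}
```

The same trick works in the other direction: the standalone app can write "job-1.pdf.tmp" and rename it to "job-1.pdf" when the conversion is finished.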

I could use a database table or two, which would solve the file renaming issue (queue entries would not be seen until the transaction was committed), but I'd still have to poll the database table, which seems rather inelegant as well.

RMI seems like it might work well, but I need to be careful with large PDFs.

Ideas?

Thanks in advance,

Jeremy
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17250
    

What about using FOP/XSLT or iText to do the conversion to PDF?

Or, if the user provides an e-mail address, you can hand the request to a Message-Driven Bean, which processes it asynchronously. That way the user quickly gets back a page stating that the PDF is being processed and will appear in their email inbox, and they can go on from there.

Mark


Jeremy Dillworth
Greenhorn

Joined: Nov 25, 2003
Posts: 3
I like the idea of email. That's much more elegant than making the user wait. Though last time I looked at the JavaMail API, adding attachments didn't look simple.

EJBs are more or less deprecated at this shop. I'm not sure if anyone's moving towards POJOs in a container yet (that is the alternative, isn't it?).

XSLT might work better in the long run. I need a fair amount of control over the HTML produced, since it has to work as part of an AJAX-style interface. For now I think I may need to stick with HTML-2-PDF conversion, since I can get it all working in a couple of days (which is as long as I have).

Thanks
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 60992
    

In our system we have some reports that take a long time to generate. When such a report is requested, we fire off the processing in a background process and return back to a page that tells the user that the report is being processed. When it is finished we send an email that notifies the user that the report is ready, but we do not send the report as an attachment. Rather, we provide an in-app "reports inbox" that they can visit to download the report through the browser. This inbox also shows the list of reports that are "in progress" so that they can keep tabs on what's going on.
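
Editor's sketch of a "reports inbox" along the lines described above: a request returns a job id immediately, the work runs on a background thread, and the UI can ask whether a report is still in progress or ready to download. This is not the poster's actual code. ReportInbox, Status, request, status, and download are hypothetical names, and a real system would persist the jobs and send the notification email, which is left out here.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory inbox; all names here are illustrative assumptions.
class ReportInbox {
    enum Status { IN_PROGRESS, READY, FAILED }

    // A single worker thread, so only one report is generated at a time.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Map<Long, Future<byte[]>> jobs = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    /** Queues the report and returns at once with an id the UI can poll. */
    long request(Callable<byte[]> generator) {
        long id = nextId.getAndIncrement();
        jobs.put(id, worker.submit(generator));
        return id;
    }

    /** Reports whether the job is still running, finished, or failed. */
    Status status(long id) {
        Future<byte[]> job = jobs.get(id);
        if (job == null) throw new IllegalArgumentException("unknown job " + id);
        if (!job.isDone()) return Status.IN_PROGRESS;
        try {
            job.get(); // the job is already done, so this only inspects the outcome
            return Status.READY;
        } catch (ExecutionException e) {
            return Status.FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.IN_PROGRESS;
        }
    }

    /** Blocks until the report is available and returns its bytes. */
    byte[] download(long id) throws Exception {
        return jobs.get(id).get();
    }

    void shutdown() {
        worker.shutdown();
    }
}
```

In a servlet, request(...) would be called from the handler that accepts the report request, and status(...) from the page that lists the inbox.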


Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 60992
    

P.S. This also allows us to "auto-schedule" the generation of such reports. They kick off at a time of the user's choosing and get sent to the "reports inbox" when completed.
Jeremy Dillworth
Greenhorn

Joined: Nov 25, 2003
Posts: 3
Originally posted by Bear Bibeault:
In our system we have some reports that take a long time to generate. When such a report is requested, we fire off the processing in a background process and return back to a page that tells the user that the report is being processed. When it is finished we send an email that notifies the user that the report is ready, but we do not send the report as an attachment. Rather, we provide an in-app "reports inbox" that they can visit to download the report through the browser. This inbox also shows the list of reports that are "in progress" so that they can keep tabs on what's going on.


I guess what I'm looking for is what kind of "background process" is that? Is it a thread within Tomcat? A cronjob written in Java? An EJB? POJO?
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
    
I have done fairly large PDF documents from a servlet using FOP running in the servlet request thread. BUT - this was a situation where we knew for sure that only one request was being processed at a time. FOP - and I assume other PDF creation packages - can be very memory intensive since they have to build the whole thing in memory.

It seems to me you are indeed going to need a system that creates a "Job" object which gets queued up for a worker application so that only one rendering job is running at a time. If you did this with JMS or JavaSpaces the rendering work could be done by another JVM - either on your server machine or elsewhere on the network.
Bill
[ October 26, 2005: Message edited by: William Brogden ]
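
Editor's sketch of the "Job object queued up for a worker" idea above, so that only one rendering job runs at a time. To keep it self-contained it uses an in-process java.util.concurrent.BlockingQueue as a stand-in; with JMS or JavaSpaces the queue would live outside the web application and the worker would run in another JVM. RenderQueue, Job, and Renderer are hypothetical names, and the actual PDF conversion is left to whatever Renderer is passed in.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-process stand-in for a JMS-style queue; all names are illustrative assumptions.
class RenderQueue {

    /** Hypothetical job description; a real one might carry a file path instead of the HTML. */
    static final class Job {
        final String id;
        final String html;

        Job(String id, String html) {
            this.id = id;
            this.html = html;
        }
    }

    /** Whatever does the heavy lifting, e.g. a wrapper around an HTML-to-PDF tool. */
    interface Renderer {
        void render(Job job) throws Exception;
    }

    // Sentinel job that tells the worker to stop after draining earlier jobs.
    private static final Job STOP = new Job("stop", "");

    private final BlockingQueue<Job> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    RenderQueue(Renderer renderer) {
        worker = new Thread(() -> {
            try {
                // One thread takes jobs in submission order, so renders never overlap.
                for (Job job = queue.take(); job != STOP; job = queue.take()) {
                    try {
                        renderer.render(job);
                    } catch (Exception e) {
                        System.err.println("job " + job.id + " failed: " + e);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "pdf-render-worker");
        worker.start();
    }

    /** Called from the servlet; returns immediately. */
    void submit(Job job) {
        queue.add(job);
    }

    /** Lets queued jobs finish, then stops the worker. */
    void shutdownAndWait() throws InterruptedException {
        queue.add(STOP);
        worker.join();
    }
}
```

Because a single thread drains the queue, memory use is bounded by one render at a time, which is the concern raised about FOP above.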
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 60992
    

Originally posted by Jeremy Dillworth:
I guess what I'm looking for is what kind of "background process" is that? Is it a thread within Tomcat?


In this instance, yes. For some other kinds of background work, we run a daemon process that triggers processing based on info in the database.
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Originally posted by Jeremy Dillworth:
I guess what I'm looking for is what kind of "background process" is that? Is it a thread within Tomcat? A cronjob written in Java? An EJB? POJO?


Queuing is pretty well suited to this. Send a JMS message to another JVM or another machine that does all the PDF generation.

I like mailing the user a link to the results, too. We use a commercial product that moves attachments to a secure server and manages passwords for regular or one-time users to come back and get their results.


Alec Lee
Ranch Hand

Joined: Jan 28, 2004
Posts: 569
Would anyone still consider EJB (I mean stateless session beans only) a suitable solution for distributing the load to another machine?
 