Handling Uploaded Files

 
Ranch Hand
Posts: 3271
I'm working on an application in which I want the users to be able to upload a file. That file will then be processed on the server side. I'm planning on using FileUpload to handle the obtaining of the file from the request.

Each file will contain data that needs to go into a number of database tables and, potentially, this process could take quite a while. Therefore, to keep the application from hanging up for a particularly long time while all of these database inserts take place, my plan was for my web application to kick off a new thread and that new thread would be responsible for performing the file parsing/database inserts. That way, the user would be able to continue his/her work as soon as the file was fully uploaded. Certainly, the user would have to be warned that the data may not be available in the database immediately.

First of all, does that sound like a good idea? I can't say I've ever spawned a new thread from the server side of a web app before, but I don't see why that wouldn't work.

Anyway, now I come to my real question. It's possible that the file may have errors or may be missing data. I had already accounted for that, and any error records will be put into an "Exceptions" table in the database for later review/correction. But what happens if I get some sort of IOException just trying to read the file? In that case, I'm not even going to be able to get the file contents to put into that Exceptions table. And, as the servlet has already passed execution back to the user, I can't tell the user that an error occurred and ask for the file again.

So, what do I do in that situation? One thought I've had is to force FileUpload to always write the files to the file system right away. Then, once I have the file, I can pass that (rather than an InputStream to the file) to my parser. At least, that way, if I run across some sort of IOException, I could set the entire file aside for later review. The downside, of course, is that I now end up with these extra files on my hard drive that I need to read from, which may slow down my processing. Also, I've never used FileUpload before, so I really don't know what the heck I'm talking about.

Anyone have any experience with such a situation? Any advice you might give?

Thanks,
Corey
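
A minimal sketch of the write-to-disk-first idea from the post above, using only java.nio rather than the FileUpload API. The class name UploadStaging, the FileParser callback, and the two directories are hypothetical names chosen for illustration; the sketch also assumes the file name is generated on the server rather than taken from the client.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: stage an upload on disk, then parse it from the file. */
class UploadStaging {

    /** Hypothetical callback; the real parsing and database inserts would go here. */
    interface FileParser {
        void parse(Path file) throws IOException;
    }

    private final Path incomingDir;
    private final Path failedDir;

    UploadStaging(Path incomingDir, Path failedDir) throws IOException {
        this.incomingDir = Files.createDirectories(incomingDir);
        this.failedDir = Files.createDirectories(failedDir);
    }

    /** Copy the request stream to disk while the user is still waiting. */
    Path stage(InputStream upload, String serverGeneratedName) throws IOException {
        Path target = incomingDir.resolve(serverGeneratedName);
        Files.copy(upload, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    /**
     * Parse the staged file. If parsing throws an IOException, the whole file
     * is moved to the failed directory for later review and false is returned.
     * On success the staged copy is deleted and true is returned.
     */
    boolean process(Path staged, FileParser parser) throws IOException {
        try {
            parser.parse(staged);
        } catch (IOException e) {
            Files.move(staged, failedDir.resolve(staged.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
            return false;
        }
        Files.delete(staged);
        return true;
    }
}
```

Because stage() runs before the response is sent, a failure while receiving the upload can still be reported to the user; only failures during the later process() step need the set-aside directory.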
 
author & internet detective
Posts: 41860
Corey,
I remember reading that you aren't supposed to explicitly use threads in a J2EE application. The preferred approach is MDBs/queues. (You may decide this is overkill, but at least it's a conscious choice.)

We don't do file uploads, but we do have some big event processing. A way to handle that is to return the user to a temporary screen and do a client-side refresh every X seconds to check on the status. The temporary screen could be a waiting/in-progress type screen or display some sort of information.
 
Corey McGlone
Ranch Hand
Posts: 3271
Jeanne,

I'm a little new to the whole web app world - any chance you could point me to some documentation on how to do that?

I know the types of screens you're talking about, that constantly check the status on the server to see how things are going, but I don't know how to implement that. When a new request comes in, how do you know which process it should be checking on? And, with that, how does the server continue to process without starting a new thread?

Perhaps all of my questions will be answered if I can find some information on MDBs/queues (or whatever those are). Any reason I shouldn't be starting a new thread on the server? Is there a problem with that? I didn't really like the idea, in the first place, but I sure don't know why I didn't.

Thanks.
 
Ranch Hand
Posts: 1211

Originally posted by Jeanne Boyarsky:

I remember reading that you aren't supposed to explicitly use threads in a J2EE application.



I may be totally out of my depth here, but I think that restriction only applies when using EJBs. With servlets, it may be OK to spawn your own threads. Please correct me if I am wrong.
 
Sonny Gill
Ranch Hand
Posts: 1211
AFAIK, the easiest way of implementing self-refreshing pages is the HTML META tag with HTTP-EQUIV set to "refresh": http://www.htmlhelp.com/reference/html40/head/meta.html

or by using JavaScript to call window.location.reload() in a 'timer' function.
 
Corey McGlone
Ranch Hand
Posts: 3271
Sonny,

Thanks for the response. I know about the META tag so I know that I can make the browser resubmit automatically in a few seconds, but where does the request go? If I submit to a servlet, how does that servlet know how the other process is doing (it would have to know how far along that process is). Also, if that other process isn't a thread, what is this servlet monitoring?
 
Sonny Gill
Ranch Hand
Posts: 1211
Yeah...I kinda felt silly posting about the META tag; there was no way you wouldn't know it.

I guess it makes sense only if you are using threads.
The servlet assigns an id to the request, and after that the 'status' page sends that id each time it refreshes. The servlet spawns a thread (or uses a helper class) to do the task, and that thread notifies the servlet when it is done.

I would not dare comment on it any further, since I haven't implemented anything like it myself.

cheers.
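
A rough sketch of the id-plus-status idea from the post above, assuming the thread-based variant. The JobTracker class, its Status values, and its method names are hypothetical; in a web application one instance would have to be shared by the servlet that accepts the upload and the servlet that answers the status page.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical registry shared by the upload servlet and the status servlet. */
class JobTracker {

    enum Status { IN_PROGRESS, DONE, FAILED }

    private final Map<String, Status> jobs = new ConcurrentHashMap<>();
    // One background thread, so jobs run one at a time in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Called by the upload request: register the job, start it, return its id. */
    String submit(Runnable task) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, Status.IN_PROGRESS);
        worker.execute(() -> {
            try {
                task.run();
                jobs.put(id, Status.DONE);
            } catch (RuntimeException e) {
                jobs.put(id, Status.FAILED);
            }
        });
        return id;
    }

    /** Called on each refresh of the status page; null means the id is unknown. */
    Status statusOf(String id) {
        return jobs.get(id);
    }

    /** Stop accepting work, for example when the web application shuts down. */
    void shutdown() {
        worker.shutdown();
    }
}
```

The status page only needs to carry the id back on each refresh (as a request parameter, say), so the polling servlet never has to hold a reference to the worker thread itself.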
 
Jeanne Boyarsky
author & internet detective
Posts: 41860
Corey,
Regardless of whether you use threads or MDBs, the last thing your request should do is write something to the HttpSession. (or you could use an in progress indicator that you remove when done - same id.) That way the polling servlet can check whether the request is finished.

I don't have any documentation on how to do it. I read it somewhere, at some point, but that doesn't help you. From your other posts, I recall you are using Struts, so here is a high-level example in Struts terminology:
1) User does query
2) Have action place request for processing (with file) on queue
3) Have action write attribute to session for "in progress"
4) Forward user to just waiting jsp
5) Using meta tag or javascript, poll server
6) If "in progress" attribute is still in session, return to step 4
7) If "in progress" attribute is not in session, forward to completed page with done or error message

At some point during this process, your MDB is called with the file so it can actually store the data in the database.
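
The steps above can be sketched in plain Java, with loud caveats: a BlockingQueue stands in for the JMS queue, consumeOne() stands in for the MDB's message handler, and a Map stands in for the HttpSession attributes. None of these are the real Struts, JMS, or servlet APIs, all names (UploadFlow, the attribute keys, the view names) are hypothetical, and how a real MDB would reach the session or some other shared status store is left open.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical in-process stand-in for the action, queue, and MDB flow. */
class UploadFlow {

    static final String IN_PROGRESS = "uploadInProgress"; // hypothetical attribute name
    static final String ERROR = "uploadError";            // hypothetical attribute name

    /** One queued request: the session attributes to update plus the work to do. */
    static final class Request {
        final Map<String, Object> session;
        final Runnable work;

        Request(Map<String, Object> session, Runnable work) {
            this.session = session;
            this.work = work;
        }
    }

    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();

    /** Steps 2 and 3: flag the session first, then place the request on the queue. */
    void enqueue(Map<String, Object> session, Runnable work) {
        session.put(IN_PROGRESS, Boolean.TRUE);
        queue.add(new Request(session, work));
    }

    /** Steps 6 and 7: decide which view the polling request is forwarded to. */
    String nextView(Map<String, Object> session) {
        return session.containsKey(IN_PROGRESS) ? "waiting" : "completed";
    }

    /** Stand-in for the MDB being called: handle exactly one queued request. */
    void consumeOne() throws InterruptedException {
        Request r = queue.take();
        try {
            r.work.run();
        } catch (RuntimeException e) {
            // Record the error before clearing the flag so the poller sees both.
            r.session.put(ERROR, e.toString());
        } finally {
            r.session.remove(IN_PROGRESS);
        }
    }
}
```

Setting the flag before queuing matters: if the order were reversed, a fast consumer could clear the flag before it was ever set, and the waiting page would then poll forever.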

Sonny,
The cause section of this IBM report clarifies a bit:

When application has spun its own threads from an EJB, accessing a database is not supported (per the J2EE specification). If a Servlet is spinning its own threads and accessing a database, the J2EE specification is not clear on this, so WebSphere Application Server 5.0 will allow it at this time. IBM is working with Sun to clarify this in the specification, so eventually (i.e. J2EE 1.4) spun threads from a Servlet accessing a database outside of a transaction will not be supported either.


So while it may not be illegal yet, it isn't a good idea. And some application servers may not support it. In any event, I wouldn't recommend spawning the thread.
 
Sonny Gill
Ranch Hand
Posts: 1211
Thanks for the link, Jeanne.

That sounds like a real limitation at first glance, though. What if there is a reasonable need to create a thread, and you are not using an EJB environment?
And since each request to a servlet is executed in its own thread anyway, what is so wrong about creating your own thread to do the processing, and letting the servlet thread finish?

Only if you were using EJB CMP, then it makes sense not to create your own thread since it might interfere with however the container is connecting to the database.

I googled for nearly an hour to find more information on this, but no luck.
 
Ranch Hand
Posts: 2874

Originally posted by Sonny Gill:
That sounds like a real limitation at first glance, though. What if there is a reasonable need to create a thread, and you are not using an EJB environment?



What Jeanne has mentioned above is:

- if a servlet is spinning its own threads and accessing a DB ... blah blah

- an application has spun its own thread from an EJB ... blah blah


Does that mean that if some other regular (plain) Java class spins its own thread, it would be OK? Is that right, Jeanne?

Correct me if I'm wrong.
 
Sonny Gill
Ranch Hand
Posts: 1211

Originally posted by adeel ansari:

Does that mean that if some other regular (plain) Java class spins its own thread, it would be OK? Is that right, Jeanne?



I suppose you mean that the servlet creates an instance of that class which starts a new thread, or the servlet calls a method on that class which starts a new thread. That's what I meant as well, though. It's common practice for a servlet to delegate processing to a helper class.

It is the same for all practical purposes, as far as I understand.

Sonny
 
Corey McGlone
Ranch Hand
Posts: 3271

Originally posted by Jeanne Boyarsky:
Regardless of whether you use threads or MDBs, the last thing your request should do is write something to the HttpSession. (or you could use an in progress indicator that you remove when done - same id.) That way the polling servlet can check whether the request is finished.



Jeanne,

Thanks for all the help - I really appreciate it. I'm starting to come up with a plan on how I want to do this, but I'm still a bit fuzzy on how this is implemented on the back end. I think my biggest problem is that, conceptually, I don't know what a queue or an MDB is (with respect to a web app).

I know that a queue is a first-in, first-out type of structure, and my guess is that, with multiple requests for processing, the requests are processed in the order in which they are received. With MDBs, I'm really lost. As far as I know, an MDB is an Access database file. Certainly, this acronym means something else to you, but I've never run across it before. What is an MDB?

Thanks,
Corey
 
Corey McGlone
Ranch Hand
Posts: 3271
Wouldn't you know it - I did a search on Google for queues, MDBs, and J2EE and guess where I ended up - right back here on the Ranch.

I found this article in the JavaRanch Journal that seems to lay out an elegant solution for this exact problem. The downside is that the solution, however elegant, seems remarkably complex (at least it does to me). I have very little experience with EJBs and no experience with MDBs so, if you have any advice on information I might use to try to implement this, I'm all ears.

Thanks,
Corey
 
Sheriff
Posts: 9109
Actually I used that very article to implement a solution for a similar situation here at work. Then I passed it on to a couple of other people who also found it useful.
 
Jeanne Boyarsky
author & internet detective
Posts: 41860
Adeel,
I agree with Sonny that it's not ok to circularly use threads. If a servlet calls a class which creates a thread, for all intents and purposes the servlet is creating a thread.

Having said that: if you are just using a servlet container such as Tomcat, EJBs aren't supported and are probably overkill. Just be aware that you are using something that may not be supported in the future. You want it to be a conscious decision to do something like this because it could result in problems. Tomcat probably would be ok, but you never know.
 
Jeanne Boyarsky
author & internet detective
Posts: 41860
Corey,
Obviously, you know by now that an MDB is a message-driven bean. WebSphere has excellent support for MDBs. Since you are using WebSphere, you will also want to search for WebSphere MQ.
 
Jeanne Boyarsky
author & internet detective
Posts: 41860
Corey,
One more thing: I don't remember if you are using WSAD. But if so, I found this IBM article to be very useful.
 
Corey McGlone
Ranch Hand
Posts: 3271

Originally posted by Jeanne Boyarsky:
Corey,
One more thing: I don't remember if you are using WSAD. But if so, I found this IBM article to be very useful.



Jeanne - I love you. That article looks awesome. I'm sure I'll be going through it tomorrow.
 
Adeel Ansari
Ranch Hand
Posts: 2874

Originally posted by Jeanne Boyarsky:
I agree with Sonny that it's not ok to circularly use threads. If a servlet calls a class which creates a thread, for all intents and purposes the servlet is creating a thread.



Hm, OK. But I am a bit confused here. Say we want to do something repeatedly at a fixed interval, like scheduled work, and we use the Timer class in one of our POJOs behind the servlet.

Is it not right to do this? The Timer class spins a thread.
 
Adeel Ansari
Ranch Hand
Posts: 2874
Jeanne,
I would also like to appreciate the article.
thanks.
 
Jeanne Boyarsky
author & internet detective
Posts: 41860

Originally posted by adeel ansari:


Hm, OK. But I am a bit confused here. Say we want to do something repeatedly at a fixed interval, like scheduled work, and we use the Timer class in one of our POJOs behind the servlet.

Is it not right to do this? The Timer class spins a thread.


It depends on what you want to do. If you aren't accessing a database, threads are fine. If you are, keep in mind that what you are doing is inconsistent with the spec. (I would use a cron job that calls your Java program for that instead of a thread, though.)
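
For anyone who wants to see the point about Timer for themselves, here is a small sketch showing that a java.util.Timer runs its tasks on its own background thread rather than on the caller's thread. The class name TimerThreadDemo and the thread name "demo-timer" are made up for the demonstration.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo: which thread does a TimerTask actually run on? */
class TimerThreadDemo {

    /** Runs one TimerTask and returns the name of the thread that executed it. */
    static String threadThatRanTask() throws InterruptedException {
        AtomicReference<String> ranOn = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        // The Timer starts its own thread; here it is named and marked as a daemon.
        Timer timer = new Timer("demo-timer", true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                ranOn.set(Thread.currentThread().getName());
                done.countDown();
            }
        }, 0L);

        // Wait (with a timeout) for the task, then stop the timer thread.
        done.await(5, TimeUnit.SECONDS);
        timer.cancel();
        return ranOn.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("caller thread: " + Thread.currentThread().getName());
        System.out.println("task thread:   " + threadThatRanTask());
    }
}
```

So a POJO that creates a Timer behind a servlet is spawning a thread just as surely as one that calls new Thread() directly, which is why the same caveats about database access apply.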
 
Corey McGlone
Ranch Hand
Posts: 3271
Well, I gave it a try...and I have failed.

As this was my first attempt to do anything at all with EJBs, I thought the article Jeanne pointed out would be an excellent place to start. I went through the entire article, blindly clicking the buttons that it told me to. My version of WSAD is slightly different from the one used in the article, but most screens were identical or awfully close. However, when I get to the final part and need to start the app server to test my MDB, I get a stack dump.

Here's part of that dump. Anyone know what these errors mean? Having no background in this type of work, I'm a bit lost.


[ November 03, 2004: Message edited by: Corey McGlone ]
 
Greenhorn
Posts: 1
Hi Corey,
I am in the same boat as you, developing my first MDB. I followed the same steps as specified in the article but ended up with the same error as yours when starting WSAD's test environment server. Did you ever find a solution for this issue?
Your help is appreciated.
Thanks
Ravi
 
Sonny Gill
Ranch Hand
Posts: 1211
Ravi,

I kinda remember that Corey followed up on this problem in the IBM/WebSphere forum. Try searching for this topic in that forum.
 