Handling Uploaded Files

Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
I'm working on an application in which I want users to be able to upload a file. That file will then be processed on the server side. I'm planning on using FileUpload to pull the file out of the request.

Each file will contain data that needs to go into a number of database tables and, potentially, this process could take quite a while. Therefore, to keep the application from hanging up for a particularly long time while all of these database inserts take place, my plan was for my web application to kick off a new thread and that new thread would be responsible for performing the file parsing/database inserts. That way, the user would be able to continue his/her work as soon as the file was fully uploaded. Certainly, the user would have to be warned that the data may not be available in the database immediately.

First of all, does that sound like a good idea? I can't say I've ever spawned a new thread from the server side of a web app before, but I don't see why that wouldn't work.

Anyway, now I come to my real question. It's possible that the file may have errors or may be missing data. I had already accounted for that, and any error records will be put into an "Exceptions" table in the database for later review/correction. But what happens if I get some sort of IOException just trying to read the file? In that case, I'm not even going to be able to get the file contents to put into that Exceptions table. And, as the servlet has already passed execution back to the user, I can't tell the user that an error occurred and ask for the file again.

So, what do I do in that situation? One thought I've had is to force FileUpload to always write the files to the file system right away. Then, once I have the file, I can pass that (rather than an InputStream to the file) to my parser. At least, that way, if I run across some sort of IOException, I could set the entire file aside for later review. The downside, of course, is that I now end up with these extra files on my hard drive that I need to read from, which may slow down my processing. Also, I've never used FileUpload before, so I really don't know what the heck I'm talking about.

Anyone have any experience with such a situation? Any advice you might give?

Thanks,
Corey
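
A minimal sketch of the spool-to-disk idea described above, using only the JDK rather than FileUpload itself and assuming Java 8 or later. The class `UploadSpooler`, the `RecordParser` callback, and the temp-directory layout are hypothetical names for illustration; in a real application the parser would do the database inserts. The upload is copied to a file while the request is still active, so a failure at that stage can still be reported to the user. A later IOException during parsing moves the whole file to a quarantine directory for review.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: spools an upload to disk, then parses it from the file. */
class UploadSpooler {
    enum Outcome { PARSED, QUARANTINED }

    /** Hypothetical parser callback; a real one would perform the database inserts. */
    interface RecordParser {
        void parseLine(String line) throws IOException;
    }

    private final Path incomingDir;
    private final Path quarantineDir;

    UploadSpooler(Path incomingDir, Path quarantineDir) {
        this.incomingDir = incomingDir;
        this.quarantineDir = quarantineDir;
    }

    /** Convenience factory for experiments: uses two fresh temp directories. */
    static UploadSpooler usingTempDirs() {
        try {
            return new UploadSpooler(Files.createTempDirectory("incoming"),
                                     Files.createTempDirectory("quarantine"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path quarantineDir() {
        return quarantineDir;
    }

    /** Copies the request stream to a file while the user is still connected. */
    Path spool(InputStream upload) {
        try {
            Path target = Files.createTempFile(incomingDir, "upload-", ".dat");
            Files.copy(upload, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            // Still inside the request, so the caller can ask the user to retry.
            throw new UncheckedIOException(e);
        }
    }

    /** Parses the spooled file; on an I/O failure the whole file is set aside. */
    Outcome process(Path spooled, RecordParser parser) {
        try (BufferedReader reader = Files.newBufferedReader(spooled, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                parser.parseLine(line);
            }
        } catch (IOException e) {
            // The reader is already closed here, so the file can be moved safely.
            quarantine(spooled);
            return Outcome.QUARANTINED;
        }
        // The parsed file is left in place; the caller decides when to delete it.
        return Outcome.PARSED;
    }

    private void quarantine(Path spooled) {
        try {
            Files.move(spooled, quarantineDir.resolve(spooled.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is the one raised in the post: an extra disk write and read per upload, in exchange for always having the original bytes available when something goes wrong later.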


Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30116
    

Corey,
I remember reading that you aren't supposed to explicitly use threads in a J2EE application. The preferred approach is MDBs/queues. (You may decide this is overkill, but at least it's a conscious choice.)

We don't do file uploads, but we do have some big event processing. A way to handle that is to return the user to a temporary screen and do a client-side refresh every X seconds to check on the status. The temporary screen could be a waiting/in-progress type screen or display some sort of information.


Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Jeanne,

I'm a little new to the whole web app world - any chance you could point me to some documentation on how to do that?

I know the types of screens you're talking about, that constantly check the status on the server to see how things are going, but I don't know how to implement that. When a new request comes in, how do you know which process it should be checking on? And, with that, how does the server continue to process without starting a new thread?

Perhaps all of my questions will be answered if I can find some information on MDBs/queues (or whatever those are). Any reason I shouldn't be starting a new thread on the server? Is there a problem with that? I didn't really like the idea, in the first place, but I sure don't know why I didn't.

Thanks.
Sonny Gill
Ranch Hand

Joined: Feb 02, 2002
Posts: 1211

Originally posted by Jeanne Boyarsky:

I remember reading that you aren't supposed to explicitly use threads in a J2EE application.


I may be totally out of my depth here, but I think that restriction is only applicable when using EJB's. With servlets, it may be ok to spawn your own threads. Please correct me if I am wrong.


Sonny Gill
Ranch Hand

Joined: Feb 02, 2002
Posts: 1211

AFAIK, the easiest way of implementing self-refreshing pages is the
HTML META tag with HTTP-EQUIV="refresh": http://www.htmlhelp.com/reference/html40/head/meta.html

or by using JavaScript to call window.location.reload() from a timer function.
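
For illustration, here is a rough sketch of how a servlet might build the self-refreshing waiting page as a string. The class name `WaitingPage`, the `status` URL, and the `id` parameter are assumptions made for this example, not something from the thread. The job id is restricted to letters, digits, and hyphens so it can be placed in the URL without further escaping.

```java
/** Hypothetical helper that renders a waiting page with a META refresh tag. */
class WaitingPage {

    /**
     * Builds a page that asks the browser to request "status?id=..." again
     * after the given number of seconds. The URL is an assumed example.
     */
    static String render(String jobId, int refreshSeconds) {
        // Reject anything that would need escaping inside the URL or the HTML.
        if (!jobId.matches("[A-Za-z0-9-]+")) {
            throw new IllegalArgumentException("unexpected job id");
        }
        return "<html><head>"
             + "<meta http-equiv=\"refresh\" content=\"" + refreshSeconds
             + "; url=status?id=" + jobId + "\">"
             + "<title>Processing</title></head>"
             + "<body><p>Your file is still being processed.</p></body></html>";
    }
}
```

A servlet would write the returned string to the response; each refresh then reaches whatever handler is mapped to the status URL, carrying the id with it.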
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Sonny,

Thanks for the response. I know about the META tag so I know that I can make the browser resubmit automatically in a few seconds, but where does the request go? If I submit to a servlet, how does that servlet know how the other process is doing (it would have to know how far along that process is). Also, if that other process isn't a thread, what is this servlet monitoring?
Sonny Gill
Ranch Hand

Joined: Feb 02, 2002
Posts: 1211

Yeah... I kinda felt silly posting about the META tag; there was no way you wouldn't know it.

I guess it makes sense only if you are using threads.
The servlet assigns an id to the request, and after that the 'status' page sends that id each time it refreshes. And the servlet spawns a thread (or uses a helper class) to do the task, which notifies the servlet when it is done.

I would not dare comment on it any further, since I haven't implemented anything like it myself.

cheers.
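
The id-plus-status idea above might look roughly like this. It is only a sketch: `JobTracker` and its method names are made up for the example, it assumes Java 8 or later, and it uses a plain single-thread executor, which is exactly the kind of self-managed thread whose legitimacy inside a J2EE container is debated in this thread.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical registry: hands out job ids and records each job's status. */
class JobTracker {
    enum Status { IN_PROGRESS, DONE, FAILED, UNKNOWN }

    private final Map<String, Status> statuses = new ConcurrentHashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Called by the upload request: queues the task and returns its id at once. */
    String submit(Runnable task) {
        String id = UUID.randomUUID().toString();
        statuses.put(id, Status.IN_PROGRESS);
        worker.execute(() -> {
            try {
                task.run();
                statuses.put(id, Status.DONE);
            } catch (RuntimeException e) {
                // Record the failure so the status page can report it.
                statuses.put(id, Status.FAILED);
            }
        });
        return id;
    }

    /** Called on each status-page refresh with the id the page was given. */
    Status statusOf(String id) {
        return statuses.getOrDefault(id, Status.UNKNOWN);
    }

    /** Stops accepting jobs and waits for queued ones; true if all finished in time. */
    boolean shutdown(long timeoutSeconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Here the worker does not notify the servlet directly; it updates a shared map, and the status request simply looks the id up. That avoids holding a reference to a request that has already completed.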
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30116
    

Corey,
Regardless of whether you use threads or MDBs, the last thing your request should do is write something to the HttpSession. (Or you could use an in-progress indicator that you remove when done; same idea.) That way the polling servlet can check whether the request is finished.

I don't have any documentation on how to do it. I read it somewhere, at some point - but that doesn't help you. From your other posts, I recall you are using Struts, so here is a high-level example in Struts terminology:
1) User does query
2) Have action place request for processing (with file) on queue
3) Have action write attribute to session for "in progress"
4) Forward user to a simple waiting JSP
5) Using meta tag or JavaScript, poll server
6) If "in progress" attribute is still in session, return to step 4
7) If "in progress" attribute is not in session, forward to completed page with done or error message

At some point during this process, your MDB is called with the file so it can actually store the data in the database.
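
The polling part of the steps above (3 through 7) can be sketched without Struts or a container. In this sketch a plain `Map` stands in for the HttpSession, and the attribute keys and page names (`waiting.jsp`, `completed.jsp`) are invented for illustration. How the background processor gets hold of the session-like object is deliberately left open.

```java
import java.util.Map;

/** Hypothetical sketch of the "in progress" marker and the polling decision. */
class UploadFlow {
    static final String IN_PROGRESS = "upload.inProgress";
    static final String RESULT = "upload.result";

    /** Steps 3-4: mark the session, then send the user to the waiting page. */
    static String startUpload(Map<String, Object> session) {
        session.remove(RESULT);
        session.put(IN_PROGRESS, Boolean.TRUE);
        return "waiting.jsp";
    }

    /** Called by whatever does the processing, as its very last action. */
    static void finish(Map<String, Object> session, String message) {
        // Store the outcome before clearing the marker, so a poll that sees
        // the marker gone always finds a message to show.
        session.put(RESULT, message);
        session.remove(IN_PROGRESS);
    }

    /** Steps 5-7: each poll picks the next page from the marker alone. */
    static String poll(Map<String, Object> session) {
        return session.containsKey(IN_PROGRESS) ? "waiting.jsp" : "completed.jsp";
    }
}
```

With a thread-safe map such as `ConcurrentHashMap`, calling `startUpload`, then `poll`, then `finish`, then `poll` again walks through the waiting page and on to the completed page.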

Sonny,
The cause section of this IBM report clarifies a bit:
When application has spun its own threads from an EJB, accessing a database is not supported (per the J2EE specification). If a Servlet is spinning its own threads and accessing a database, the J2EE specification is not clear on this, so WebSphere Application Server 5.0 will allow it at this time. IBM is working with Sun to clarify this in the specification, so eventually (i.e. J2EE 1.4) spun threads from a Servlet accessing a database outside of a transaction will not be supported either.

So while it may not be illegal yet, it isn't a good idea. And some application servers may not support it. In any event, I wouldn't recommend spawning the thread.
Sonny Gill
Ranch Hand

Joined: Feb 02, 2002
Posts: 1211

Thanks for the link, Jeanne.

That sounds like a real limitation at first glance, though. What if there is a reasonable need to create a thread, and you are not using an EJB environment?
And since each request to a servlet is executed in its own thread anyway, what is so wrong about creating your own thread to do the processing, and letting the servlet thread finish?

It only makes sense to avoid creating your own thread if you are using EJB CMP, since it might interfere with however the container is connecting to the database.

I googled for nearly an hour to find more information on this, but no luck.
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
Originally posted by Sonny Gill:
That sounds like a real limitation at first glance, though. What if there is a reasonable need to create a thread, and you are not using an EJB environment?


What Jeanne has mentioned above is:

- if a servlet is spinning its own threads and accessing a DB ... blah blah

- an application has spun its own thread from an EJB ... blah blah


Does that mean that if some other regular (plain) Java class spins its own thread, it would be OK? Is that right, Jeanne?

Correct me if I'm wrong.
Sonny Gill
Ranch Hand

Joined: Feb 02, 2002
Posts: 1211

Originally posted by adeel ansari:

Does that mean that if some other regular (plain) Java class spins its own thread, it would be OK? Is that right, Jeanne?


I suppose you mean that the servlet creates an instance of that class which starts a new thread, or the servlet calls a method on that class which starts a new thread. That's what I meant as well, though; it's common practice for a servlet to delegate processing to a helper class.

It is the same for all practical purposes, as far as I understand.

Sonny
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Originally posted by Jeanne Boyarsky:
Regardless of whether you use threads or MDBs, the last thing your request should do is write something to the HttpSession. (Or you could use an in-progress indicator that you remove when done; same idea.) That way the polling servlet can check whether the request is finished.


Jeanne,

Thanks for all the help - I really appreciate it. I'm starting to come up with a plan on how I want to do this, but I'm still a bit fuzzy on how this is implemented on the back end. I think my biggest problem is that, conceptually, I don't know what a queue or an MDB is (with respect to a web app).

I know that a queue is a first-in, first-out type structure, and my guess is that, with multiple requests for processing, each request is processed in the order in which it was received. With MDBs, I'm really lost - as far as I know, an MDB is an Access database file. Certainly, this acronym means something else to you, but I've never run across it before. What is an MDB?

Thanks,
Corey
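
The first-in, first-out guess above can be illustrated with an in-process queue. To be clear, this is not JMS and not a message-driven bean: a `BlockingQueue` stands in for the message queue and a single consumer thread stands in for the bean that would receive each message. `UploadQueueDemo` and the sentinel value are names invented for this sketch, which assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical demo: one consumer drains a FIFO queue of upload names. */
class UploadQueueDemo {
    // Sentinel telling the consumer that no more work will arrive.
    private static final String STOP = "__stop__";

    /** Enqueues the uploads, lets one consumer handle them, returns the handling order. */
    static List<String> processInOrder(List<String> uploads) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String name = queue.take();   // blocks until work arrives
                    if (name.equals(STOP)) {
                        return;
                    }
                    processed.add(name);          // real code would parse and insert here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producers (the upload requests) only enqueue and return immediately.
        for (String upload : uploads) {
            queue.add(upload);
        }
        queue.add(STOP);

        try {
            consumer.join();                      // wait so the result is complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }
}
```

Because there is a single consumer, items are handled strictly in arrival order. In the container-managed version the queue and the consumer would be supplied by the application server rather than created by the application.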
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Wouldn't you know it - I did a search on Google for queues, MDB's, and J2EE and guess where I ended up - right back here on the Ranch. :roll:

I found this article in the JavaRanch Journal that seems to lay out an elegant solution for this exact problem. The downside is that the solution, however elegant, seems remarkably complex (at least it does to me). I have very little experience with EJBs and no experience with MDBs so, if you have any advice on information I might use to try to implement this, I'm all ears.

Thanks,
Corey
Marilyn de Queiroz
Sheriff

Joined: Jul 22, 2000
Posts: 9044
    
Actually I used that very article to implement a solution for a similar situation here at work. Then I passed it on to a couple of other people who also found it useful.


Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30116
    

Adeel,
I agree with Sonny that it's not OK to use threads indirectly. If a servlet calls a class which creates a thread, for all intents and purposes the servlet is creating a thread.

Having said that: if you are just using a servlet container such as Tomcat, EJBs aren't supported and are probably overkill. Just be aware that you are using something that may not be supported in the future. You want it to be a conscious decision to do something like this because it could result in problems. Tomcat probably would be ok, but you never know.
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30116
    

Corey,
Obviously, you know by now that an MDB is a message-driven bean. WebSphere has excellent support for MDBs. Since you are using WebSphere, you will also want to search for WebSphere MQ.
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30116
    

Corey,
One more thing: I don't remember if you are using WSAD. But if so, I found this IBM article to be very useful.
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Originally posted by Jeanne Boyarsky:
Corey,
One more thing: I don't remember if you are using WSAD. But if so, I found this IBM article to be very useful.


Jeanne - I love you. That article looks awesome. I'm sure I'll be going through it tomorrow.
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
Originally posted by Jeanne Boyarsky:
I agree with Sonny that it's not OK to use threads indirectly. If a servlet calls a class which creates a thread, for all intents and purposes the servlet is creating a thread.


Hm, OK. But I am a bit confused here. Say we want to do something repeatedly at a fixed interval, like scheduled work, and we use the Timer class in one of our POJOs behind the servlet.

Is it not right to do this? After all, the Timer class spins a thread.
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
Jeanne,
I appreciate the article as well.
Thanks.
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30116
    

Originally posted by adeel ansari:


Hm, OK. But I am a bit confused here. Say we want to do something repeatedly at a fixed interval, like scheduled work, and we use the Timer class in one of our POJOs behind the servlet.

Is it not right to do this? After all, the Timer class spins a thread.

It depends on what you want to do. If you aren't accessing a database, threads are fine. If you are, keep in mind that what you are doing is inconsistent with the spec. (I would use a cron job that calls your Java program for that instead of a thread, though.)
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
Well, I gave it a try...and I have failed.

As this was my first attempt to do anything at all with EJBs, I thought the article Jeanne pointed out would be an excellent place to start. I went through the entire article, blindly clicking the buttons that it told me to. My version of WSAD is slightly different from the one used in the article, but most screens were identical or awfully close. However, when I get to the final part and need to start the app server to test my MDB, I get a stack dump.

Here's part of that dump. Anyone know what these errors mean? Having no background in this type of work, I'm a bit lost.


[ November 03, 2004: Message edited by: Corey McGlone ]
Ravi Kandas
Greenhorn

Joined: Dec 07, 2004
Posts: 1
Hi Corey,
I am in the same boat as you, developing my first MDB. I followed the same steps as specified in the article but ended up with the same error as yours when starting WSAD's test environment server. Did you ever find a solution for this issue?
Your help is appreciated.
Thanks
Ravi
Sonny Gill
Ranch Hand

Joined: Feb 02, 2002
Posts: 1211

Ravi,

I kinda remember that Corey followed up on this problem in the IBM/WebSphere forum. Try searching for this topic in that forum.
 