GeeCON Prague 2014*

Large File Upload

Harry Anan
Greenhorn

Joined: Sep 03, 2009
Posts: 3
Dear Experts,

I have a requirement where a large file (100-200 MB) is uploaded from the client to a content management system. I am using a servlet with the Apache Commons FileUpload API. Apache FileUpload has two ways of handling files:

1) Non-Streaming
2) Streaming

Currently I use the non-streaming approach, where the servlet stores the file in a temp location and then uploads it into the content management system. This takes a lot of time, so I am trying to implement the Streaming API.

The content management API supports streaming through two methods:
a) SetContent - Takes the file's ByteArrayOutputStream as input -> This gives an OutOfMemoryError because the file is so large
b) AppendContent - Takes the file's ByteArrayOutputStream as input -> This method can be called multiple times to upload the large file, but I don't know how to do this. Apache FileUpload gives me an InputStream of the file, and I need to split that into chunks and append them to the content management system.

Can someone guide me on how to convert an InputStream into 4 KB ByteArrayOutputStream chunks so that I can use the AppendContent method of the content management API?

Thanks in advance
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570

You mean, like, writing a simple loop which reads bytes from the InputStream and writes them to the ByteArrayOutputStream? Or is there something else besides just copying the data which you are asking about?
Harry Anan
Greenhorn

Joined: Sep 03, 2009
Posts: 3
Yes. I read 4096 bytes from the InputStream in a loop, create a ByteArrayOutputStream out of those 4096 bytes, and append that to the content management system. I am using read(byte[] b, int off, int len), but it looks like an infinite loop: no errors in the servlet even after 20 minutes. This is the snippet from my servlet; please let me know if I missed something.





Also, is this going to take a lot of CPU time because of reading 100+ MB in 4 KB chunks?

Thanks a lot
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570

Harry Anan wrote:... but it looks like infinite loop no errors in servlet even after 20 mins. This is the snippet from my servlet -please let me know if I miss something.

Also is this going to take lot of CPU time because of reading 100+MB in 4KB chunks?


It "looks like" an infinite loop? You haven't put debugging statements in there to see what's really going on? Better still, you should be debugging this code in a standalone Java class rather than using a servlet container as your test base.

And yes, it's going to take a lot of CPU time to read 100 MB of data. And more to the point, it's going to take a lot of elapsed time in your case because the data is coming slowly over the network. You should expect your code which processes the upload to run faster than the data arrives, so don't waste your time worrying about CPU time.

As for the code, I don't understand why you create a new ByteArrayOutputStream for each chunk of data you read. You could be creating a new ByteArrayOutputStream for each byte in the worst case. That code should be outside the loop which fills the buffer. And you're always writing 4096 bytes from the buffer even if you didn't read 4096 bytes into the buffer. This will give you extra junk at the end of the file in most cases. (In your code you break out of the loop, so you throw away the last buffer, but that should be fixed if you put the writing outside the loop.)
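
For anyone reading along, here is a minimal sketch of the kind of loop described above. It is not the original poster's code, which is not shown in this thread. ContentSink and its appendContent method are hypothetical stand-ins for the content management API's AppendContent call, whose real signature does not appear here. Only the java.io calls are real. The sketch fills a 4 KB buffer, passes on only the bytes actually read, and reuses one ByteArrayOutputStream for every chunk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ChunkedAppendSketch {

    /** Hypothetical stand-in for the content management API's AppendContent method. */
    interface ContentSink {
        void appendContent(ByteArrayOutputStream chunk) throws IOException;
    }

    /** Sends the stream to the sink in chunks of chunkSize bytes (the last one may be shorter). */
    static long upload(InputStream in, ContentSink sink, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize];
        // Created once, outside the loop, and reused for every chunk.
        ByteArrayOutputStream chunk = new ByteArrayOutputStream(chunkSize);
        long total = 0;
        int filled = 0;
        int n;
        // read() may return fewer bytes than requested, so keep filling the buffer.
        while ((n = in.read(buffer, filled, buffer.length - filled)) != -1) {
            filled += n;
            if (filled == buffer.length) {
                total += flush(buffer, filled, chunk, sink);
                filled = 0;
            }
        }
        // End of stream: send the partial last buffer, and only the bytes actually read.
        if (filled > 0) {
            total += flush(buffer, filled, chunk, sink);
        }
        return total;
    }

    private static int flush(byte[] buffer, int length, ByteArrayOutputStream chunk,
                             ContentSink sink) throws IOException {
        chunk.reset();                   // discard the previous chunk's bytes
        chunk.write(buffer, 0, length);  // copy only the valid part of the buffer
        sink.appendContent(chunk);
        return length;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        // The lambda plays the role of the content management system here.
        long sent = upload(new ByteArrayInputStream(data), c -> c.writeTo(received), 4096);
        System.out.println(sent + " bytes sent, " + received.size() + " bytes received");
    }
}
```

In a servlet, the InputStream would come from the FileUpload streaming API instead of a ByteArrayInputStream, and the lambda would call the real AppendContent method. Whether that method copies the chunk before returning is an assumption. If it keeps a reference to the stream, reusing one instance with reset() would not be safe.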
karthikeyan Chockalingam
Ranch Hand

Joined: Sep 06, 2003
Posts: 259
Hope this working example with Apache Commons FileUpload helps.
http://www.skillassert.com/tutorial/9_Simple_file_upload_tutorial_using_Apache_Commons_fileupload_with_build_by_Maven




Harry Anan
Greenhorn

Joined: Sep 03, 2009
Posts: 3
Paul Clapham wrote: I don't understand why you create a new ByteArrayOutputStream for each chunk of data you read. You could be creating a new ByteArrayOutputStream for each byte in the worst case. That code should be outside the loop which fills the buffer.


BullsEye!!

Thanks a lot for the correction, Paul. After I reused the same stream instance, I am able to upload a 150 MB file in 3 minutes, which is simply great.
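
To put those numbers next to the earlier CPU-time question, here is a rough back-of-the-envelope calculation. It assumes 1 MB = 1024 * 1024 bytes and that the upload still used the 4 KB chunk size mentioned earlier in the thread. Neither is confirmed above. The class and method names are made up for illustration.

```java
class UploadArithmeticSketch {

    /** Number of chunks needed for a file, rounding up for a partial last chunk. */
    static long chunkCount(long fileBytes, int chunkBytes) {
        return (fileBytes + chunkBytes - 1) / chunkBytes;
    }

    /** Average throughput in MB per second, with 1 MB taken as 1024 * 1024 bytes. */
    static double megabytesPerSecond(long fileBytes, double seconds) {
        return fileBytes / (1024.0 * 1024.0) / seconds;
    }

    public static void main(String[] args) {
        long fileBytes = 150L * 1024 * 1024; // 150 MB, as reported above
        double seconds = 3 * 60;             // 3 minutes
        long chunks = chunkCount(fileBytes, 4096);
        System.out.println("chunks: " + chunks); // prints "chunks: 38400"
        System.out.println("MB per second: " + megabytesPerSecond(fileBytes, seconds));
        System.out.println("append calls per second: " + chunks / seconds);
    }
}
```

Under those assumptions the upload averages about 0.83 MB per second and roughly 213 append calls per second. That is consistent with the earlier point that elapsed time matters more here than the CPU cost of the copy loop.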





Nguyen Ninh
Greenhorn

Joined: Jan 15, 2013
Posts: 4
Harry Anan, I ran into the same problem, but I don't know how to resolve it (using the FileUpload API, Content Management API...).
Could you share a code snippet?
Thanks.
 