File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes Other Open Source Projects and the fly likes OutOfMemoryError in processing large xlsx file (167 MB) using Apache POI Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of EJB 3 in Action this week in the EJB and other Java EE Technologies forum!
JavaRanch » Java Forums » Products » Other Open Source Projects
Bookmark "OutOfMemoryError in processing large xlsx file (167 MB) using Apache POI" Watch "OutOfMemoryError in processing large xlsx file (167 MB) using Apache POI" New topic
Author

OutOfMemoryError in processing large xlsx file (167 MB) using Apache POI

Harinath Br
Greenhorn

Joined: Aug 28, 2012
Posts: 4
Hi,

I am using Apache POI version3.8 to process xlsx file (size-167 MB with around 7 lakh records). I want to upload this data to database.

I am getting java.lang.OutOfMemoryError at the below line.

Workbook workbook = WorkbookFactory.create(new File(fileDirectory, fileName));

Even if I use XSSFWorkbook also same issue...

XSSFWorkbook xssfWb = new XSSFWorkbook(opcPackage);

any one please help me in resolving this issue.

Thanks in advance.
Hari...
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18125
    
    8

The XLSX file takes up 167 megabytes of disk space? Then that's not surprising. Assign more memory to the application, as much as possible.
Harinath Br
Greenhorn

Joined: Aug 28, 2012
Posts: 4
Thanks Paul for your quick reply.
I tried that but it didn't help. Currently I assigned 1024M, even I doubled it but no luck.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39547
    
  27
POI is notorious for needing lots of memory; it's quite possible that a 167MB file needs more than 2 GB.


Ping & DNS - updated with new look and Ping home screen widget
Harinath Br
Greenhorn

Joined: Aug 28, 2012
Posts: 4
Dittmer,

I tried that but JDeveloper is not at all responding even for just java standalone program with System.out.println("");

I have 4 GB RAM.

Please suggest is there any way to stream the data partly instead of loading entire file using Workbook workbook = WorkbookFactory.create(new File(fileDirectory, fileName)); or XSSFWorkbook xssfWb = new XSSFWorkbook(opcPackage); bcz I am getting error at this line only.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39547
    
  27
I'm fairly certain that POI can only deal with complete files; there's no way to make it read partial files (and thus lower the memory consumption). That line is probably where POI tries to read the entire file into memory.

(My first name is Ulf, by the way. It's fine to address me with my last name if you prefer to do so, but then the polite way to do so is by prefixing it with "Mr.".)
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18125
    
    8

Harinath Br wrote:Please suggest is there any way to stream the data partly instead of loading entire file using Workbook workbook = WorkbookFactory.create(new File(fileDirectory, fileName)); or XSSFWorkbook xssfWb = new XSSFWorkbook(opcPackage); bcz I am getting error at this line only.


No, that's not how the WorkbookFactory class works.

It might be possible to use other features of the POI project to provide a "streamed" version of the data, but I don't know that for sure. Somehow I doubt it, because I'm not sure what exactly would be streamed. It's easy for somebody who just thinks of Excel as a grid with cells which might contain data to ask for that, but for somebody who has to write code which can deal with all of the thousands of different kinds of objects which might be present in an Excel document, it isn't really all that simple.
Harinath Br
Greenhorn

Joined: Aug 28, 2012
Posts: 4

Thanks everyone for the reply.

I have done in this way... I have split the file with 20k records per file (around 5MB) as workaround in this case.
Meer Nasirudeen
Greenhorn

Joined: Sep 30, 2007
Posts: 4
Hello everyone... I too faced the same issue of OOM while parsing xlsx file... I somehow feel that below code helped me in a simpler way to handle..

This code is based on sjxlsx. It reads the xlsx and stores in a HSSF sheet.



Olivier Rihoux
Greenhorn

Joined: Jun 03, 2013
Posts: 8
Just try replacing :




by :




SXSSFWorkbook (Streaming version of XSSFWorkbook) will act the same way as XSSFWorkbook but keeps only 100 (in this case) rows in memory

Meer Nasirudeen
Greenhorn

Joined: Sep 30, 2007
Posts: 4
Olivier Rihoux wrote:Just try replacing :




by :




SXSSFWorkbook (Streaming version of XSSFWorkbook) will act the same way as XSSFWorkbook but keeps only 100 (in this case) rows in memory



Yup. You are absolutely correct. However, the solution I have suggested is to get the entire worksheet from the high-size file. I already tried using SXSSF, but as you said, beyond last 100 records are getting retrieved while reading back the sheet
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: OutOfMemoryError in processing large xlsx file (167 MB) using Apache POI
 
Similar Threads
sheet.getLastRowNum() returns -1
Working with Big Excel Sheet
file Writing problem
Opening an Excel file
Read a .xls,.xlsx file format using XSSF