Reading Word Documents Using Java

sushil grover

Joined: Nov 28, 2008
Posts: 12
Hi All,

I have to read Word documents (97-2003 format) using Java. The data is in tabular form, as name-value pairs. I have already implemented this using Apache POI, but I am facing an issue: when I modify a Word 2003 document in Office 2007, the POI API throws a NullPointerException related to LittleEndian. According to the Apache site, this is a bug that will be resolved in a coming release.

My question is: has anybody used OpenOffice for reading Word documents? If yes, please share some sample code.

Dear friends, please provide your opinions.
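
One thing worth ruling out when a document has been through Office 2007 is that it was re-saved in the newer .docx format. POI's HWPF classes read only the binary (OLE2) .doc format, while .docx files are ZIP archives handled by XWPF. The sketch below checks the leading bytes of a file; the class and method names (`WordFormatSniffer`, `sniff`) are hypothetical and not part of POI.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI.
class WordFormatSniffer {
    // First 8 bytes of an OLE2 compound file (the binary .doc container).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // First 4 bytes of a ZIP archive (the .docx container).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Classifies a file by its leading bytes.
    static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return "OLE2";
        if (startsWith(header, ZIP_MAGIC)) return "ZIP";
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFormatSniffer <file>");
            return;
        }
        byte[] header = new byte[8];
        int total = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // A single read() may return fewer bytes than requested, so loop.
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) break;
                total += n;
            }
        }
        System.out.println(sniff(Arrays.copyOf(header, total)));
    }
}
```

If this reports OLE2, the file is still a binary .doc and the format itself is not the issue; a ZIP result would suggest the file is really a .docx regardless of its extension.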

Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42965
The discussion is from 2005; whatever it describes as being in the future has most likely happened by now. Why do you think your problem is related to that, and why do you think endianness has anything to do with it? Can you post an SSCCE?

sushil grover

Joined: Nov 28, 2008
Posts: 12
Hi Ulf,

Thanks for your reply. I have tested it with the latest POI version, 3.7, as well as the 3.8 beta release.

Below is the code snippet:

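import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.*;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

InputStream fis = new FileInputStream(fileName);
POIFSFileSystem fs = new POIFSFileSystem(fis);
HWPFDocument doc = new HWPFDocument(fs);

Range range = doc.getRange();
for (int i = 0; i < range.numParagraphs(); i++) {
    Paragraph tablePar = range.getParagraph(i); // Here I am getting the exception
    if (tablePar.isInTable()) {
        Table table;
        try {
            table = range.getTable(tablePar);
        } catch (Exception e) {
            continue; // catch body was not shown in the post; skipping is an assumption
        }
        for (int rowIdx = 0; rowIdx < table.numRows(); rowIdx++) {
            TableRow row = table.getRow(rowIdx);
            for (int colIdx = 0; colIdx < row.numCells(); colIdx++) {
                TableCell cell = row.getCell(colIdx);
                // ... (the posted snippet ends here)
            }
        }
    }
}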

I am getting the following exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 16
at org.apache.poi.util.LittleEndian.getShort(
at org.apache.poi.hwpf.sprm.SprmOperation.getOperand(
at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.unCompressPAPOperation(
at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(
at org.apache.poi.hwpf.model.PAPX.getParagraphProperties(
at org.apache.poi.hwpf.usermodel.Range.getParagraph(

Sometimes it is a NullPointerException while uncompressing.
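
On the question of whether endianness is involved: one plausible reading of the trace is that it is not a byte-order problem as such. A little-endian short read needs two consecutive bytes, so asking for one at the last index of a buffer fails on the second byte. The sketch below illustrates that with a hypothetical stand-in method; it is not POI's actual `LittleEndian.getShort` implementation, and the 16-byte buffer size is assumed only to mirror the index in the trace.

```java
// Hypothetical stand-in for illustration; not POI's actual code.
class LittleEndianDemo {
    // Reads a 16-bit value stored low byte first, starting at offset.
    static short getShort(byte[] data, int offset) {
        int low = data[offset] & 0xFF;
        int high = data[offset + 1] & 0xFF; // fails if offset is the last index
        return (short) ((high << 8) | low);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[16]; // assumed size, chosen to match index 16 in the trace
        buffer[14] = 0x34;
        buffer[15] = 0x12;

        // Two bytes available at offsets 14 and 15: the read succeeds.
        System.out.println(Integer.toHexString(getShort(buffer, 14))); // prints 1234

        try {
            // Only one byte is left at offset 15, so the second access is index 16.
            getShort(buffer, 15);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```

If that reading is right, the paragraph property data in the modified document is shorter than the parser expects for that operation, which would point to the file contents or the parser's handling of them rather than to byte order.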
