Is it possible to read a word document using FileReader or FileInputStream?

parag Chatterjee
Greenhorn

Joined: Aug 02, 2002
Posts: 28
Is it possible to read a word document using FileReader or FileInputStream?
I have been trying in vain. The read returns unreadable characters.
This is my snippet of code:
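// Declarations were not shown in the post; the buffer size is an assumed value.
char[] b = new char[1024];
FileReader finp = new FileReader("d:\\HelloWorld.doc");
// FileWriter fout = new FileWriter("d:\\HelloWorld11.doc");

// Reads up to b.length chars; returns the count read, or -1 at end of file.
int intRead = finp.read(b);
// fout.write(b);
// Convert only the chars actually read, not the whole buffer.
String input = new String(b, 0, Math.max(intRead, 0));
System.out.println(input);
finp.close();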
David Weitzman
Ranch Hand

Joined: Jul 27, 2001
Posts: 1365
No.
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Yes, it is possible. The only thing you need to change in the code below is the MIME type. Keep in mind that the code below was written to stream PDF files, and I didn't go through and change it to say it is for Word files, so you'll see PDF references throughout. Below you will need to change one line. The new line of code will be...

The great thing about file streaming is that it comes up fast even with larger files. I have 35-page documents come up very quickly. If you wanted to read Excel spreadsheets you would change it to be....

Remember, if you do this, the plug-in must already be installed on the system for the browser to respond. I put some comments in the code showing you how to have the document come up in the browser window or in a separate window.

This code will probably sit in its own servlet. This method will be called by doGet or doPost, depending on how you call it. From the signature you can see that I pass in the file to be dispatched. The PDF is looked up relative to the root of the web directory.
I call my PDFServlet dispatcher with the following line...
http://localhost:8080/PDFServletController?pdfFileToDispatch=az_v1_va_simple_packet.pdf
In this code, the PDF files are referenced from the web root. So when you refer to the PDF files, or in your case your Word files, you will need to start at /, which will be server/web/. I put all my files under web/pdf/. Inside my code I hide the fact that the files are one directory deeper, inside the pdf directory. Anyway, give that a shot. Any questions, email me at ddemott@bigfoot.com
Dale
[ August 22, 2002: Message edited by: Dale DeMott ]

By failing to prepare, you are preparing to fail.
Benjamin Franklin (1706 - 1790)
Ron Newman
Ranch Hand

Joined: Jun 06, 2002
Posts: 1056
I don't see what browsers or MIME types or servlets have to do with this.
It seems to me like you'd need some kind of filter stream to extract the text from the Word document. Does standard Java provide one, or is there a third-party filter stream class that fills the bill?
[ August 22, 2002: Message edited by: Ron Newman ]

Ron Newman - SCJP 1.2 (100%, 7 August 2002)
Dale DeMott
Ranch Hand

Joined: Nov 02, 2000
Posts: 515
Actually you're right.. I was assuming a web environment..
Sorry...
David Weitzman
Ranch Hand

Joined: Jul 27, 2001
Posts: 1365
There are a lot of people anxiously waiting for the POI project at Jakarta to get a nice interface to MS Office formats, but unfortunately they haven't got much for the Word interface yet.
parag Chatterjee
Greenhorn

Joined: Aug 02, 2002
Posts: 28
Yes, I believe the POI project is still in beta.
So right now there is no way to read the contents of a Word document using pure Java (I mean java.io).
Thanks, everybody, for your responses.
Parag
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Putting it another way: you can certainly read the bytes using a FileInputStream. If you want to copy them to another location, for example, this is fine. But if you want to interpret them in any meaningful way, there's not really any good way to do that yet.
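
For the copy case, here is a minimal sketch, assuming Java 7 or later for try-with-resources. The class name ByteCopy and the target file name are only illustrative. It moves raw bytes and never decodes them as characters, which is why the copy stays intact, whereas a FileReader would run the binary content through a character decoder and produce the unknown characters described in the question.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ByteCopy {

    /** Copies a file byte for byte and returns the number of bytes copied. */
    public static long copy(String source, String target) throws IOException {
        try (InputStream in = new FileInputStream(source);
             OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            // read() returns the number of bytes placed in the buffer, or -1 at end of file
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // write only the bytes actually read
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        // Paths follow the original question; adjust them for your machine.
        long copied = copy("d:\\HelloWorld.doc", "d:\\HelloWorld11.doc");
        System.out.println("Copied " + copied + " bytes");
    }
}
```

The copy should open in Word just like the original, but nothing here tells you what the bytes mean.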


"I'm not back." - Bill Harding, Twister
Dmitry Shekhter
Greenhorn

Joined: Feb 21, 2001
Posts: 26
I'm not sure where the Word doc file is coming from, but if whoever is supplying the file can save it in Microsoft Word as "plain text", then you don't have this problem.
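
If the file does arrive as plain text, reading it is ordinary java.io work. A rough sketch, assuming Java 7 or later: the class name PlainTextRead and the file name d:\\HelloWorld.txt are made up for illustration, and the charset is a guess you would need to match to whatever encoding Word was told to use when saving.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class PlainTextRead {

    /** Reads a whole text file, decoding its bytes with the given charset. */
    public static String readAll(String path, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), charset))) {
            String line;
            // readLine() strips the line terminator and returns null at end of file
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file name; the platform default charset is an assumption.
        System.out.print(readAll("d:\\HelloWorld.txt", Charset.defaultCharset()));
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which is what a plain FileReader would use.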
 