
Reading from .doc or .docx file

Sawan Mishra
Ranch Hand

Joined: Oct 24, 2013
Posts: 45

I understand the above program, but my problem is that reading from an
MS Word file (.doc or .docx) and writing the result to the console gives
unexpected output.
How can I read from a .doc file and write its content to the console correctly?

Thanks in advance,
with regards
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14435
    

Microsoft Word .doc and .docx files are not simple text files that you can read this way with a FileReader.

You'll need a library that understands the specific MS Word file formats, such as Apache POI.
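
To see why a FileReader prints garbage here, it can help to look at the first few bytes of the file. The sketch below is not from the thread. It assumes the usual container signatures (a legacy .doc is an OLE2 compound file, a .docx is a ZIP archive), and the class name WordFormatSniffer is made up for illustration. It only identifies the container; it does not extract any text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: guesses the container format from the leading bytes.
public class WordFormatSniffer {
    // Signature of an OLE2 compound file, the container used by legacy .doc
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // Signature of a ZIP local file header; .docx is a ZIP archive
    // (so are .xlsx, .jar and others, so this is only a hint)
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    public static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return "OLE2 container (legacy .doc?)";
        if (startsWith(header, ZIP_MAGIC)) return "ZIP container (.docx?)";
        return "unknown (possibly plain text)";
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            // Mask to compare the byte as an unsigned value
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFormatSniffer <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        byte[] header = new byte[8];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int r;
            // Keep reading until 8 bytes are available or the file ends
            while (filled < header.length
                    && (r = in.read(header, filled, header.length - filled)) > 0) {
                filled += r;
            }
        }
        System.out.println(file + ": " + sniff(Arrays.copyOf(header, filled)));
    }
}
```

If the tool reports an OLE2 or ZIP container, the file is binary and reading it character by character will not give the document text.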


Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42935
    
Those are structured file formats which contain much else besides the plain text. You need to use a library like Apache POI (which can extract the plain text, and also provides an API to get at the structured content).
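
As a rough illustration of "much else besides the plain text": a .docx is a ZIP archive, and the body text normally sits in the entry word/document.xml, inside w:t elements grouped into w:p paragraphs. The sketch below uses only the JDK and is not how POI does it; the class name DocxPeek is made up, and the namespace constant assumes the common (transitional) Office Open XML flavour. It ignores tabs, line breaks, headers, footnotes and text boxes, and it does nothing for legacy .doc files, which is why a library such as POI is the better route for real work.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: pulls paragraph text out of a .docx with JDK classes only.
public class DocxPeek {
    // WordprocessingML namespace (assumed: transitional Office Open XML)
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // The main document body lives in this entry
                if (entry.getName().equals("word/document.xml")) {
                    return textOf(readAll(zip));
                }
            }
        }
        throw new IOException("no word/document.xml entry; not a .docx file?");
    }

    private static String textOf(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        StringBuilder out = new StringBuilder();
        NodeList paragraphs = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            // Each paragraph holds runs; the visible text is in their w:t children
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                out.append(texts.item(j).getTextContent());
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Reads the current ZIP entry to its end (read returns -1 at end of entry)
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.print(extractText(in));
        }
    }
}
```

Run it as `java DocxPeek some.docx` to print one line per paragraph. Everything else in the archive (styles, relationships, embedded media) is skipped, which is exactly the extra structure that a plain FileReader trips over.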
Paweł Baczyński
Bartender

Joined: Apr 18, 2013
Posts: 1048
    

Don't read doc as a regular text file!
http://stackoverflow.com/questions/7102511/how-read-doc-or-docx-file-in-java


Formerly Pawel Pawlowicz
Tony Docherty
Bartender

Joined: Aug 07, 2007
Posts: 2413
    
You need to use a library that understands the format that .doc and .docx files are saved in. Fortunately, there are free libraries available, such as Apache POI, which can be found at http://poi.apache.org/
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 40052
    
Too difficult for “beginning”: moving.
 