Reading from .doc or .docx file

Sawan Mishra
Ranch Hand

Joined: Oct 24, 2013
Posts: 42

I understand the above program, but when I read from an MS Word file
(.doc or .docx) and write the result to the console, I get
unexpected output.
How can I read a .doc file and write its content to the console correctly?

thanks in advance
with regards
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 13875
    

Microsoft Word .doc and .docx files are not simple text files that you can read this way with a FileReader.

You'll need a library that understands the specific MS Word file formats, such as Apache POI.
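
To illustrate why a FileReader produces garbage here, the sketch below looks at the first bytes of a file. It relies on two facts: a .docx file is a ZIP archive and starts with the bytes "PK" 03 04, while a binary .doc file is an OLE2 compound file and starts with D0 CF 11 E0 A1 B1 1A E1. The class name WordFormatSniffer and the Kind enum are made up for this example; they are not part of Apache POI or any other library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: guesses the container format from the leading bytes.
final class WordFormatSniffer {
    enum Kind { OLE2, ZIP, UNKNOWN }

    // Signature of an OLE2 compound file (used by binary .doc).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header signature of a ZIP archive (used by .docx).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 3, 4 };

    static Kind sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Kind.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Kind.ZIP;
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFormatSniffer <file>");
            return;
        }
        byte[] header = new byte[8];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // A single read is enough for a sketch; it may return fewer bytes.
            n = in.read(header);
        }
        System.out.println(sniff(Arrays.copyOf(header, Math.max(n, 0))));
    }
}
```

Neither result is plain text, so decoding the bytes as characters cannot give readable output. A ZIP result only means "possibly .docx", since many other formats are ZIP archives too.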


Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39549
    
Those are structured file formats which contain much else besides the plain text. You need to use a library like Apache POI (which can extract the plain text, and also provides an API to get at the structured content).
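
As a rough illustration of what "structured" means for .docx, here is a sketch that uses only the JDK (Java 9 or later for readAllBytes). It assumes the main document part is stored at word/document.xml, which is the usual location but is strictly determined by the package relationships, and it collects the text of the w:t elements paragraph by paragraph. The class name DocxTextSketch is made up for this example. It ignores tabs, line breaks, headers, footnotes and text boxes, so it is not a substitute for Apache POI, and it does not handle the binary .doc format at all.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: pulls plain text out of a .docx without any library.
final class DocxTextSketch {
    // Namespace of the WordprocessingML elements (w:p, w:r, w:t).
    private static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static String extractText(InputStream docx) {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                // Assumption: the main part lives at the conventional path.
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(zip.readAllBytes()); // reads to the end of this entry
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        throw new IllegalArgumentException("no word/document.xml entry; not a .docx file?");
    }

    private static String textOf(byte[] xml) {
        Document dom;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations to avoid external entity tricks.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse word/document.xml", ex);
        }
        StringBuilder out = new StringBuilder();
        NodeList paragraphs = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            // One paragraph may be split across many runs; join their text.
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                out.append(texts.item(j).getTextContent());
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DocxTextSketch <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.print(extractText(in));
        }
    }
}
```

For real documents a library is still the better choice, because it also covers .doc and the parts of the format this sketch skips.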


Pawel Pawlowicz
Ranch Hand

Joined: Apr 18, 2013
Posts: 596
    

Don't read a .doc file as a regular text file!
http://stackoverflow.com/questions/7102511/how-read-doc-or-docx-file-in-java
Tony Docherty
Bartender

Joined: Aug 07, 2007
Posts: 1945
    
You need to use a library that understands the formats that .doc and .docx files are saved in. Fortunately, there are free libraries available, such as Apache POI, which can be found at http://poi.apache.org/
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 36513
    
Too difficult for “beginning”: moving.
 
 