• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Reading from .doc or .docx file

 
Ranch Hand
Posts: 47
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

I understand the above program but my problem is reading from
Ms-word(.doc file or .docx file) and writing result to console gives
unexpected output.
How can I read from .doc file and write content to console correctly??

thanks in advance
with regards
 
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Microsoft Word .doc and .docx files are not simple text files that you can read this way with a FileReader.

You'll need a library that understands the specific MS Word file formats, such as Apache POI.
 
Rancher
Posts: 43081
77
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Those are structured file formats which contain much else besides the plain text. You need to use a library like Apache POI (which can extract the plain text, and also provides an API to get at the structured content).
 
Bartender
Posts: 2236
63
IntelliJ IDE Firefox Browser Spring Java Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Don't read doc as a regular text file!
http://stackoverflow.com/questions/7102511/how-read-doc-or-docx-file-in-java
 
Bartender
Posts: 3323
86
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You need to use a library that understands the format that doc and docx files are saved in. Fortunately there are free libraries available such as POI which can be found at http://poi.apache.org/
 
Marshal
Posts: 79239
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Too difficult for “beginnign”: moving.
 
reply
    Bookmark Topic Watch Topic
  • New Topic