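You can't treat a structured file format as if it were plain text. For DOC that may work by chance, since the older binary format often stores runs of text uncompressed, but for DOCX it won't: a DOCX file is a ZIP archive of XML parts, so its contents are compressed. You need to use a library like Apache POI to get at the content.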
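To see why a plain text search fails on DOCX, here is a minimal sketch using only the JDK (Java 9 or later). It does not read a real Word file: `buildFakeDocx` is a hypothetical helper that creates a ZIP with a single `word/document.xml` entry, which only mimics the layout of a DOCX. The class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class DocxIsAZip {

    // Hypothetical stand-in for a DOCX: a ZIP holding one word/document.xml entry.
    // A real DOCX contains more parts; this only mimics the container layout.
    static byte[] buildFakeDocx(String xml) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Naive approach: treat the file bytes as text and search them directly.
    static boolean rawBytesContain(byte[] archive, String needle) {
        return new String(archive, StandardCharsets.ISO_8859_1).contains(needle);
    }

    // Structured approach: open the archive and decompress the named entry.
    // Returns null if the entry is not present.
    static String readEntry(byte[] archive, String name) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    return new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Repeated text so the deflate step clearly compresses the entry.
        String paragraph = "<w:p><w:r><w:t>The quick brown fox jumps over the lazy dog.</w:t></w:r></w:p>";
        String xml = "<w:document>" + paragraph.repeat(20) + "</w:document>";
        byte[] archive = buildFakeDocx(xml);

        System.out.println("Found in raw bytes:     " + rawBytesContain(archive, "quick brown fox"));
        String content = readEntry(archive, "word/document.xml");
        System.out.println("Found after unzipping:  " + content.contains("quick brown fox"));
    }
}
```

Even after unzipping you are left with XML markup rather than readable text, which is the reason to let a library do the work. With Apache POI the usual route is `XWPFDocument` together with `XWPFWordExtractor` for DOCX, and `WordExtractor` from the HWPF module for DOC.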