lucene and office 2007

Seamus Minogue
Ranch Hand

Joined: Jun 24, 2008
Posts: 41
Has anyone indexed Office 2007 documents into Lucene?

I'm trying to index a docx document and allow full-text searching of it. The indexing is working for *.doc documents, but full-text search doesn't seem to work for *.docx documents.

If someone could point me in the right direction I would appreciate it.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42608
Which library are you using for reading doc files - Apache POI? If so, that doesn't support the XML-based Office files yet. You could try the beta version of POI, which does support DOCX to a certain degree.
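
For background on why a *.doc reader gets nothing useful out of a *.docx file: a .docx is a ZIP archive whose main body text lives in the entry word/document.xml, inside w:t elements. The following is only a rough sketch that uses nothing but the JDK, to show where the text sits; it is not POI's API and not a substitute for it. The class name DocxTextSketch and the method extractText are made up for illustration, and the sketch ignores headers, footers, footnotes and anything else outside the main document part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI or Lucene.
public class DocxTextSketch {

    // Namespace of the WordprocessingML elements (w:p, w:t, ...).
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the main body text, one line per paragraph.
    public static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(readAll(zip));
                }
            }
        }
        throw new IOException("word/document.xml not found; not a .docx file?");
    }

    // Reads the current ZIP entry fully into memory.
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    // Concatenates the w:t runs of each w:p paragraph.
    private static String textOf(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations, since the file may be untrusted.
        factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));

        StringBuilder text = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            NodeList runs = paragraph.getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
            text.append('\n');
        }
        return text.toString();
    }
}
```

The returned string could then be added to the Lucene document as the full-text field, in the same place where the text extracted from *.doc files goes now. For real documents a proper library is likely the better choice, since it would also cover the parts this sketch skips.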


Seamus Minogue
Ranch Hand

Joined: Jun 24, 2008
Posts: 41
I'll take a look at that :-) Thanks