Has anyone indexed Office 2007 documents into Lucene?
I'm trying to index a .docx document and allow full-text searching of it. Indexing works for *.doc documents, but full-text search doesn't seem to work for *.docx documents.
If someone could point me in the right direction, I would appreciate it.
Ulf Dittmer
Marshal
Joined: Mar 22, 2005
Posts: 35241
Which library are you using for reading DOC files - Apache POI? If so, that doesn't support the XML-based Office files yet. You could try the beta version of POI, which does support DOCX to a certain degree.
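If the POI beta is not an option, one fallback is to read the text yourself: a DOCX file is a ZIP archive whose main body is stored as XML in word/document.xml. The following is only a rough sketch using the JDK's own ZIP and XML classes, not POI. The class name DocxTextExtractor and its methods are made up for illustration, and it ignores headers, footers, footnotes and text boxes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of POI or Lucene): pulls plain text out of a DOCX body.
final class DocxTextExtractor {
    // Namespace of WordprocessingML elements such as w:p (paragraph) and w:t (text run).
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return parseBody(readCurrentEntry(zip));
                }
            }
        }
        throw new IOException("word/document.xml not found; is this a DOCX file?");
    }

    // Reads the bytes of the ZIP entry the stream is currently positioned on.
    private static byte[] readCurrentEntry(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Concatenates the w:t text of each w:p paragraph, one paragraph per line.
    private static String parseBody(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations, since the file may come from an untrusted source.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        StringBuilder text = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            NodeList runs = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
            text.append('\n');
        }
        return text.toString();
    }
}
```

The returned string could then go into the same Lucene field you already fill for *.doc files, so the existing full-text queries would cover both formats.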