lucene and office 2007

Seamus Minogue
Ranch Hand

Joined: Jun 24, 2008
Posts: 41
Has anyone indexed Office 2007 documents into Lucene?

I'm trying to index a docx document and allow full-text searching of it. The indexing is working for *.doc documents, but full-text search doesn't seem to work for *.docx documents.

If someone could point me in the right direction I would appreciate it.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 35241
    
Which library are you using for reading DOC files - Apache POI? If so, that doesn't support the XML-based Office files yet. You could try the beta version of POI, which does support DOCX to a certain degree.
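To illustrate what "XML-based" means here: a DOCX file is a ZIP archive whose main text lives in the entry word/document.xml, inside w:t elements. The sketch below pulls that text out using only the JDK, so the resulting string could then be handed to whatever indexing code already works for DOC files. It is not POI's API and not a substitute for it; the class name DocxTextSketch and the method extractText are made up for this example, and it ignores headers, footers, footnotes and other parts of the document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of POI or Lucene.
class DocxTextSketch {
    // Namespace of the WordprocessingML elements (w:p, w:t, ...).
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the body text of a DOCX stream, one line per paragraph.
    static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(zip);
                }
            }
        }
        throw new IOException("no word/document.xml entry; not a DOCX file?");
    }

    private static String textOf(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations, since the file may come from anywhere.
        factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder().parse(xml);

        StringBuilder out = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            // A paragraph's text is split across several w:t runs.
            NodeList runs = ((Element) paragraphs.item(i))
                    .getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                out.append(runs.item(j).getTextContent());
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

Calling DocxTextSketch.extractText(new FileInputStream("report.docx")) (the file name is just a placeholder) would give a plain string to put into the same Lucene field used for DOC content. For anything beyond a quick experiment, a library that understands the full format is likely the better choice.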


Seamus Minogue
Ranch Hand

Joined: Jun 24, 2008
Posts: 41
I'll take a look at that :-) Thanks
 