My requirement is just to check whether an MS Office 2007 document (docx, xlsx or pptx) is password protected. I plan to use Apache POI for that.
For doc, xls and ppt documents, I simply try to create an instance of HWPFDocument, HSSFWorkbook or HSLFSlideShow and check whether an EncryptedDocumentException is thrown.
When I try a similar thing for a docx, xlsx or pptx document, an org.apache.poi.POIXMLException (org.apache.poi.openxml4j.exceptions.InvalidFormatException: "Package should contain a content type part [M1.13]") is thrown. I believe this exception is thrown not only when the file is password protected but also when the file is invalid or corrupt. My guess is that it is thrown when POI tries to access [Content_Types].xml after unzipping the file (package).
Now, my question again: is there a more elegant way (other than checking for the org.apache.poi.POIXMLException) to determine whether a docx, xlsx or pptx document is password protected?
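One possible way to separate "password protected" from "corrupt" without relying on the exception is to look at the first few bytes of the file. A plain docx, xlsx or pptx file is a ZIP package (it starts with "PK"), whereas a password-protected one is wrapped in an OLE2 compound file, the same container the old doc/xls/ppt formats use. That would also explain why no [Content_Types].xml is found. The sketch below uses only the JDK; the class and method names (OfficeContainerSniffer, sniff) are made up for illustration and are not part of POI.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not a POI class: classifies a file by its leading bytes.
final class OfficeContainerSniffer {

    enum Container { ZIP, OLE2, UNKNOWN }

    // Signature of an OLE2 compound file (legacy Office files and encrypted OOXML).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Local file header signature of a ZIP archive (unencrypted OOXML).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Classifies an already-read header; fewer than 8 bytes is allowed.
    static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    // Reads at most 8 bytes, so the cost does not depend on the file size.
    static Container sniff(Path file) throws IOException {
        byte[] header = new byte[8];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < header.length) {
                int read = in.read(header, filled, header.length - filled);
                if (read < 0) {
                    break; // file is shorter than 8 bytes
                }
                filled += read;
            }
        }
        return sniff(Arrays.copyOf(header, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this, a file named *.docx, *.xlsx or *.pptx that comes back as OLE2 is most likely encrypted (or a misnamed legacy file), ZIP means a normal package, and UNKNOWN points to a corrupt or unrelated file. Note that this is only a heuristic: to be sure, one would still open the OLE2 container and check that it really holds the encryption streams.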
I don't know if you have been keeping an eye on the above thread, but I have now successfully been able to check XLS and XLSX files for passwords, including handling large files within 1-2 seconds with a tiny memory footprint.