Apache Tika

 
Bartender
I have a running Spring REST service that uses the Tika libraries.

Currently, these Tika libraries are 2.9.1.

At some point while adding other features (perhaps this is the issue), the DOCX portion of the Tika extract stopped working. The same basic code in a standalone (not REST) Maven project works OK, so I'm a bit baffled.
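When the same code works in one project and fails in another, one cheap diagnostic is to compare which jar each relevant class is loaded from in both projects. The sketch below is only that, a diagnostic under assumptions: `ClasspathProbe` is a hypothetical helper, and apart from `OOXMLParser` (taken from the error message) the class names in `main` are guesses at libraries Tika relies on for DOCX.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports which jar (or directory) a class is loaded from. */
final class ClasspathProbe {

    /** Returns "className -> location", or a note when the class cannot be loaded. */
    static String locationOf(String className) {
        try {
            // initialize=false: locate the class without running its static initializers
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return className + " -> (no code source, e.g. a JDK class)";
            }
            return className + " -> " + source.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.tika.parser.microsoft.ooxml.OOXMLParser", // from the error message
            "org.apache.poi.xwpf.usermodel.XWPFDocument",         // assumption: POI class behind DOCX parsing
            "org.apache.commons.compress.archivers.zip.ZipFile"   // assumption: ZIP handling dependency
        };
        for (String name : names) {
            System.out.println(locationOf(name));
        }
    }
}
```

Running this in both the standalone project and the REST service and comparing the jar versions in the output may show whether the two classpaths actually differ.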

Having spent about 8 hours on this, I thought I'd ask the community if anyone had run across this issue with DOCX files in Tika.

The error generated in the Spring Boot REST service is: "TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser"

Below is the test code that works in the standalone Spring project. The REST service uses the same code, spread across Spring's Controller and Service methods, but there it fails.



In the REST project where the error occurs, the data are POSTed rather than read from a disk path. XLSX, PDF, and TXT all work fine; only DOCX fails.
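Since the failing input arrives as POSTed bytes, it can help to rule out a damaged upload before looking at the parser. A DOCX file is a ZIP container, so the bytes should start with the ZIP signature and contain `word/document.xml`. The following is a minimal sketch using only the JDK; `DocxUploadCheck` and its method names are hypothetical and not part of Tika or Spring.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: sanity-checks uploaded bytes before they reach the parser. */
final class DocxUploadCheck {

    /** True if the payload starts with the ZIP local-file-header signature "PK\3\4". */
    static boolean looksLikeZip(byte[] body) {
        return body != null && body.length >= 4
                && body[0] == 'P' && body[1] == 'K' && body[2] == 3 && body[3] == 4;
    }

    /** Lists the entry names in the archive; throws IOException if the ZIP is damaged. */
    static List<String> entryNames(byte[] body) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(body))) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** True if the bytes are a ZIP holding the two parts every DOCX is expected to have. */
    static boolean looksLikeDocx(byte[] body) throws IOException {
        if (!looksLikeZip(body)) {
            return false;
        }
        List<String> names = entryNames(body);
        return names.contains("[Content_Types].xml") && names.contains("word/document.xml");
    }
}
```

If this check passes inside the service for the same file that fails to parse, the upload itself is probably intact and the problem lies elsewhere. XLSX is also a ZIP container and already works here, so a damaged upload is not the most likely cause, but it is quick to exclude.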

Thanks in advance for any suggestions.

- mike
 
Saloon Keeper

You really shouldn't write straight to System.out in a webapp. Even if it's in Spring Boot. Use a real logger. Like Spring itself does.

You also don't need to trap all those Exceptions individually, since they're all Exceptions anyway.

The log example I gave should (depending on the logger you use) print out the error message and the stack trace. The stack trace may have a useful "Caused By" in it.
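As an illustration of the points above (a single catch block, plus a log call that receives the exception so the trace and its "Caused by" chain are printed), here is a minimal sketch. It uses `java.util.logging` only to stay self-contained; the service in question uses Log4j, where the equivalent call would differ. `ExtractLogging`, `extract`, and `rootCause` are hypothetical names, and `extract` is a stand-in that always fails.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical wrapper showing one catch block and a log call that keeps the stack trace. */
final class ExtractLogging {
    private static final Logger LOG = Logger.getLogger(ExtractLogging.class.getName());

    /** Follows getCause() down to the innermost exception. */
    static Throwable rootCause(Throwable error) {
        Throwable root = error;
        while (root.getCause() != null) {
            root = root.getCause();
        }
        return root;
    }

    /** Stand-in for the real extraction call; it always fails with a wrapped cause. */
    static String extract(byte[] body) throws Exception {
        throw new IOException("outer parser failure", new IllegalStateException("underlying cause"));
    }

    /** Returns the extracted text, or null after logging the failure. */
    static String extractOrNull(byte[] body) {
        try {
            return extract(body);
        } catch (Exception e) { // one catch instead of one block per exception type
            // Passing the exception as the last argument hands the full trace to the log handler.
            LOG.log(Level.SEVERE, "Text extraction failed, root cause: " + rootCause(e), e);
            return null;
        }
    }
}
```

The root cause is often more informative than the outermost message, which is why the sketch puts it in the log line as well.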
 
Mike London
Bartender
That was just an example program to see if the libraries would work.

I wasn't worried about the System.out.println calls. They work fine there, since it's basically a console Spring (TEST) application.

The real Spring Boot service uses Log4j, of course.

I don't need help with logging, and that was not my question.

I also posted the actual error received by the more detailed logging.

Thanks Tim.
 
Tim Holloway
Saloon Keeper

Mike London wrote:
I also posted the actual error received by the more detailed logging.

Stack Trace???

Just for info, mixing raw stdout writes with logger output can get confusing, since not everything may print in order.
 