
Android: looking for app that can be called from another app to extract text from common file types

Ilya Zee
Greenhorn

Joined: Nov 22, 2011
Posts: 2

We're writing an app that performs some text processing. It can only work on raw UTF-8 text, so we need some means of taking input files in common formats such as RTF, DOC, and PDF (with text) and extracting the text from them. The extracted text would then be passed to our app for processing.

I have been trying to port Tika and its parsers to Android, with a lot of pain and little luck: the various third-party parser components have a lot of incompatibilities. Then I remembered that when we wrote a similar app for the desktop a couple of years ago, we used OpenOffice (via a macro) to extract text by invoking it as an external process.
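
For reference, the desktop approach of running an office suite as an external process could be sketched roughly as below. This is not the macro-based setup described above. It swaps in the headless command-line converter instead, and it assumes a LibreOffice-style `soffice` binary on the PATH that supports `--convert-to` and the `txt:Text (encoded):UTF8` filter. The class and method names are made up for illustration, and this is desktop/server Java, not something that runs on Android.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs an office suite as an external process to get plain text.
final class OfficeTextExtractor {

    // Builds the command line. Assumption: "soffice" is on the PATH and
    // understands --headless, --convert-to and --outdir.
    public static List<String> buildCommand(Path input, Path outDir) {
        return Arrays.asList(
                "soffice", "--headless",
                "--convert-to", "txt:Text (encoded):UTF8",
                "--outdir", outDir.toString(),
                input.toString());
    }

    // The converter is expected to write <base name>.txt into the output directory.
    public static String outputName(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return (dot > 0 ? name.substring(0, dot) : name) + ".txt";
    }

    // Runs the converter, waits for it to finish, and reads the result as UTF-8.
    public static String extract(Path input, Path outDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outDir))
                .inheritIO() // let the child's output go to our console
                .start();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("soffice exited with code " + exit);
        }
        byte[] bytes = Files.readAllBytes(outDir.resolve(outputName(input)));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java OfficeTextExtractor <input file> <output directory>
        System.out.println(extract(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

Keeping the command construction separate from the process launch makes it easy to swap in a different converter binary or filter name if your installation differs.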

So my question is this: do you know of a tool that (i) is not Tika; (ii) can extract Unicode text from common file formats and (iii) can somehow be forked off by our app so that we can pass the input files to it and then collect the extracted text?

By the way, although this might be a topic for a separate question: if you have successfully ported Tika to Android, please let me know. I still have a glimmer of hope that it can be done.

Thanks much!
Nick Johnson
Greenhorn

Joined: Jun 21, 2012
Posts: 16
The link below may be useful for you, though I'm not sure. You could give it a try:
araxis.com/merge_mac/topic_comparing_text_files.html
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42264
    
Trying to get Tika to work on Android is a hopeless endeavour, I think.

Have you considered performing the text extraction on the server, and putting a REST WS on top of it for the mobile app to access?
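
A minimal sketch of what the mobile side of that could look like, assuming a hypothetical service that accepts the raw file bytes in a POST body and answers with the extracted text as UTF-8 plain text. The endpoint, the class name and the method names are all invented for illustration; the server side (where Tika or an office suite would run) is not shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client for a server-side text extraction service.
final class TextExtractionClient {

    // Reads the whole stream and decodes it as UTF-8.
    public static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // POSTs the file bytes to the (assumed) endpoint and returns the response body.
    public static String extractText(URL endpoint, byte[] fileBytes, String mimeType)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", mimeType);
            conn.setRequestProperty("Accept", "text/plain; charset=utf-8");
            conn.setFixedLengthStreamingMode(fileBytes.length);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(fileBytes);
            }
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Extraction service returned HTTP " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return readUtf8(in);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

On Android the call to `extractText` has to happen off the main thread, since network access on the UI thread is not allowed.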


Ping & DNS - my free Android networking tools app
 