Reading tiff file content

Dinesh Pise
Greenhorn

Joined: Jul 18, 2012
Posts: 27
Hi Friends,

I have TIFF image files which are scanned documents. I have to read the text content, e.g. barcode, name of applicant, date of birth, etc. Can someone please help with this? Once again, my TIFF files are scanned images.

Regards,
Dinesh Pise
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 3968

Hi Dinesh, what you need is to translate images into text; this is done via 'OCR' or 'Optical Character Recognition'. There is no built-in library to do that, so you will need to find an OCR library to help you do the work.


Steve
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39576
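The best-known open-source OCR engine is called Tesseract; it is a native engine rather than a Java library, but it can be used from Java through a wrapper or its command-line tool. You'll find it easily.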

There are separate libraries for detecting barcodes; searching for "java barcode detection" or some such phrase will find them.


Dinesh Pise
Greenhorn

Joined: Jul 18, 2012
Posts: 27
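Ulf Dittmer wrote:The best-known open-source OCR engine is called Tesseract; it is a native engine rather than a Java library, but it can be used from Java through a wrapper or its command-line tool. You'll find it easily.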

There are separate libraries for detecting barcodes; searching for "java barcode detection" or some such phrase will find them.


Hi Ulf Dittmer,

Can you please provide a link where I can get more details on how to use Tesseract from Java? I have googled for Tesseract but didn't succeed in understanding it.

Thank you.

Regards,
Dinesh Pise
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39576
I've never used Tesseract, so I can't help. But I notice that there's an extensive FAQ on the site, and it also has forums. Those should get you going.
Dinesh Pise
Greenhorn

Joined: Jul 18, 2012
Posts: 27
Hi,

I have done image reading through aspriseOCR.jar and aspriseTIFF.jar, but this is a paid version. Is Tesseract free for commercial use?
I am posting the code below which I used to read the TIFF content; we also have to put DevIL.dll, ILU.dll and AspriseOCR.dll in windows/system32.




Dinesh Pise
Greenhorn

Joined: Jul 18, 2012
Posts: 27
Hi Friends,

This time I am trying to read the TIFF image content through Tesseract. I have downloaded tesseract.exe and installed it. I am running Tesseract from the command prompt and got the following error:

Tesseract Open Source OCR Engine v3.02 with Leptonica
Cannot open input file: 700466293_00000001.tif

Now my query is how to install Leptonica. I have downloaded leptonica-1.68-win32-lib-include-dirs.zip; can someone please tell me how to use this with Tesseract?


I have referred to the Tesseract site.

Thanks & regards,
Dinesh Pise
Dinesh Pise
Greenhorn

Joined: Jul 18, 2012
Posts: 27
Hi Friends,
The command below works fine for reading the TIF file and generates out.txt containing the TIFF text content.
Actually, I was giving the wrong file name, and that is why I got the error after the "Tesseract Open Source OCR Engine v3.02 with Leptonica" banner.


C:\images\tesseract 700466296_00000002.TIF out

Thanks & regards,
Dinesh Pise
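
For anyone who wants to drive the same command from Java, here is a minimal sketch rather than a definitive implementation. It assumes the tesseract executable is on the PATH and that it writes its result to the output base name plus ".txt", as it did above with out.txt. The class and method names (TesseractCli, buildCommand, outputTextFile, ocr) are made up for illustration and are not part of Tesseract. The file check up front is there because a wrong file name is exactly what caused the "Cannot open input file" error earlier in this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class TesseractCli {

    // Builds the command line used in the thread: tesseract <input> <outputBase>
    static List<String> buildCommand(String tesseractExe, Path input, Path outputBase) {
        return Arrays.asList(tesseractExe, input.toString(), outputBase.toString());
    }

    // Assumption: Tesseract appends ".txt" to the output base name (out -> out.txt).
    static Path outputTextFile(Path outputBase) {
        return Paths.get(outputBase.toString() + ".txt");
    }

    // Runs Tesseract on one image and returns the recognized text.
    static String ocr(String tesseractExe, Path input, Path outputBase)
            throws IOException, InterruptedException {
        // Fail early with a clear message if the file name is wrong.
        if (!Files.isRegularFile(input)) {
            throw new IOException("Input file not found: " + input.toAbsolutePath());
        }
        Process process = new ProcessBuilder(buildCommand(tesseractExe, input, outputBase))
                .inheritIO() // show Tesseract's own messages on our console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        // Assumption: the text output is UTF-8 encoded.
        byte[] bytes = Files.readAllBytes(outputTextFile(outputBase));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java TesseractCli <image.tif> <outputBase>");
            return;
        }
        System.out.println(ocr("tesseract", Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

It can be run from the folder containing the image, for example: java TesseractCli 700466296_00000002.TIF out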
 