I have TIFF image files that are scanned documents. I need to read their text content, e.g. the barcode, the applicant's name, date of birth, etc. Can someone please help with this? Once again, my TIFF files are scanned images.
Hi Dinesh, what you need is to translate images into text; this is done via 'OCR', or 'Optical Character Recognition'. Java has no built-in library for that, so you will need to find an OCR library to do the work.
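Ulf Dittmer wrote: The best-known open-source OCR engine is called Tesseract (it is a native program rather than a pure Java library, but it can be used from Java through a wrapper or its command-line tool); you'll find it easily.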
There are separate libraries for detecting barcodes; searching for "java barcode detection" or some such phrase will find them.
Hi Ulf Dittmer,
Can you please provide a link where I can find more details about Tesseract, in particular how to use it from Java? I have googled for Tesseract but didn't succeed in understanding it.
I've never used Tesseract, so I can't help. But I notice that there's an extensive FAQ on the site, and it also has forums. Those should get you going.
I have done image reading with aspriseOCR.jar and aspriseTIFF.jar, but that is a paid product. Is Tesseract free for commercial use?
I am posting the code below which I used to read the TIFF content. We also have to put DevIL.dll, ILU.dll and AspriseOCR.dll in windows/system32.
This time I am trying to read the TIFF image content with Tesseract. I have downloaded tesseract.exe and installed it. I ran Tesseract from the command prompt and got the following error.
Tesseract Open Source OCR Engine v3.02 with Leptonica
Cannot open input file: 700466293_00000001.tif
Now my question is how to install Leptonica. I have downloaded leptonica-1.68-win32-lib-include-dirs.zip; can someone please tell me how to use it with Tesseract?
The command below works fine for reading the TIF file and generates out.txt containing the text of the TIFF.
Actually, I was giving the wrong file name, and that is why I got the "Cannot open input file" error. The "Tesseract Open Source OCR Engine v3.02 with Leptonica" line is just the startup banner, not an error.
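For anyone who wants to drive this from Java: below is a minimal sketch that calls the Tesseract command-line tool with ProcessBuilder. It assumes the `tesseract` executable is on the PATH and relies on the usual `tesseract <image> <outputbase>` form, where Tesseract itself appends ".txt" to the output base. The class name `TesseractCli` and its methods are made up for illustration; they are not part of Tesseract. It checks that the input file exists first, since a wrong file name is exactly what caused the "Cannot open input file" error above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of Tesseract.
class TesseractCli {

    // Builds the argument list for: tesseract <image> <outputBase>
    // (assumes "tesseract" is on the PATH; Tesseract adds ".txt" itself).
    static List<String> buildCommand(Path image, String outputBase) {
        return Arrays.asList("tesseract", image.toString(), outputBase);
    }

    // Runs OCR on the image and returns the recognized text.
    static String ocr(Path image, String outputBase)
            throws IOException, InterruptedException {
        // Fail early with a clear message instead of Tesseract's
        // "Cannot open input file" when the file name is wrong.
        if (!Files.isRegularFile(image)) {
            throw new IOException("Input image not found: " + image.toAbsolutePath());
        }
        Process process = new ProcessBuilder(buildCommand(image, outputBase))
                .inheritIO() // show Tesseract's banner and messages on the console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        byte[] bytes = Files.readAllBytes(Paths.get(outputBase + ".txt"));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java TesseractCli <image.tif>");
            return;
        }
        System.out.println(ocr(Paths.get(args[0]), "out"));
    }
}
```

Run it as `java TesseractCli 700466293_00000001.tif` from the directory that contains the image, or pass a full path. Barcodes are not handled here; as mentioned earlier in the thread, they need a separate barcode library.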