OCR Methodology

 
Greenhorn
Posts: 9
I am trying to figure out how to use OCR from the ground up, with Java somewhere in the back end. Our current system works as follows:
1) Scan multiple documents (using a Canon 5020) and save them as PDF (usually several hundred documents of the same type, one per person).
2) Using a Java Swing GUI, the user opens each PDF document and assigns it a type and a few other parameters. The system then stores the PDF in the appropriate location and enters the corresponding database information (the document is always stored as a PDF).

I want to skip the user step and automate it with OCR, but I am not even sure where to start. Should I scan the documents, save them as PDF, and then use an OCR program to read through the PDF, or should they be saved in some other format and converted later? What are some good tools for this?

Thanks for any help!
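
For what it's worth, here is a rough sketch of what the automated replacement for step 2 might look like once some OCR engine (Tesseract is one commonly used open-source option) has already turned a scanned page into plain text. Everything in it is an assumption made for illustration: the class name DocumentRouter, the keyword table, the type names, the "UNKNOWN" fallback, and the archive folder layout are all hypothetical and not part of the system described above. The OCR call itself is deliberately left out.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: picks a document type from OCR text and builds a storage path.
class DocumentRouter {

    // Assumed keyword -> document type table; real documents would need their own phrases.
    private static final Map<String, String> KEYWORD_TO_TYPE = new LinkedHashMap<>();
    static {
        KEYWORD_TO_TYPE.put("application for employment", "EMPLOYMENT_APPLICATION");
        KEYWORD_TO_TYPE.put("timesheet", "TIMESHEET");
        KEYWORD_TO_TYPE.put("invoice", "INVOICE");
    }

    // Returns the type of the first keyword found in the OCR text, or "UNKNOWN".
    static String classify(String ocrText) {
        // Lower-case and collapse whitespace, since OCR output often has odd spacing.
        String normalized = ocrText.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
        for (Map.Entry<String, String> entry : KEYWORD_TO_TYPE.entrySet()) {
            if (normalized.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "UNKNOWN";
    }

    // Builds <archiveRoot>/<type>/<personId>.pdf (an assumed layout, not the real one).
    static Path destination(Path archiveRoot, String type, String personId) {
        return archiveRoot.resolve(type).resolve(personId + ".pdf");
    }

    public static void main(String[] args) {
        // Stand-in for text that an OCR engine would have extracted from one page.
        String ocrText = "ACME Corp\nWeekly   TIMESHEET\nEmployee: 1042";
        String type = classify(ocrText);
        System.out.println(type + " -> " + destination(Paths.get("archive"), type, "1042"));
    }
}
```

Anything that comes back as "UNKNOWN" could still be sent to the existing Swing GUI for manual review, so the OCR path would only take over the documents it recognizes.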
 