Systems Engineering and RDBMS

A new step in OCR: Google’s Answer

Posted by decipherinfosys on November 1, 2008

We read the Google post today about their OCR (Optical Character Recognition) technology, with which their search engine can now read scanned documents saved in Adobe’s PDF format. So, the scanned images of words and pictures can now be indexed and made searchable. The linked post explains how searching within these indexed documents is different, so we won’t go into that here. It does require a lot of processing power, since scanned documents do not contain any text data that spiders can index; the text has to be recognized from the page images first.

We use our own OCR parser for our DVMS (Decipher Vaccine Management System) and Business Intelligence Suite for the healthcare product, and it is only 75% accurate, since text handwritten by doctors and nurses is hardly recognizable even to the human eye 🙂, let alone to a machine. So we eagerly look forward to this approach; if it can be used within our product to improve the accuracy of the data, that will be very good.
