
Transform An Image Of Handwritten Notes To Text

I have hundreds of images of handwritten notes. They were written by different people, but they are in sequence, so you know that, for example, person1 wrote img1.jpg -> img100.jp

Solution 1:

This is called OCR (optical character recognition), and there has been a lot of progress in it. Here is an example of how simple it is to extract the text from an image file using Tesseract:

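try:
    from PIL import Image
except ImportError:
    # Fallback for very old standalone PIL installs
    import Image
import pytesseract


def ocr_core(file):
    """Run Tesseract OCR on an image file and return the recognized text."""
    text = pytesseract.image_to_string(Image.open(file))
    return text


print(ocr_core('sample.png'))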

BUT

I am not sure that it can recognize different styles of handwriting. You can give it a try yourself to find out. If you want to run the Python example you need the pytesseract package, but first install Tesseract itself on your OS and add it to your PATH.
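Since the question is about hundreds of images in sequence, here is a rough sketch of how a batch run could look. It first checks whether the tesseract executable is on your PATH and then processes the files in natural order, so that img2.jpg comes before img10.jpg. The helper names (tesseract_on_path, natural_key, ocr_folder), the "notes" folder and the "*.jpg" pattern are my own assumptions for illustration, not something from the question.

```python
import re
import shutil
from pathlib import Path


def tesseract_on_path():
    """Return the path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def natural_key(path):
    """Sort key that compares the digit runs in a filename as numbers."""
    parts = re.split(r"(\d+)", Path(path).name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def ocr_folder(folder, ocr=None, pattern="*.jpg"):
    """OCR every matching image in `folder`, in natural filename order.

    `ocr` is any callable that takes a file path and returns text.
    It defaults to pytesseract.image_to_string, which accepts a path.
    """
    if ocr is None:
        # Imported lazily so the helpers above also work without pytesseract.
        import pytesseract
        ocr = pytesseract.image_to_string
    files = sorted(Path(folder).glob(pattern), key=natural_key)
    return {f.name: ocr(str(f)) for f in files}
```

You would call it as ocr_folder("notes") once tesseract_on_path() returns something other than None. Passing your own ocr callable makes it easy to swap in a different engine later if Tesseract does not cope with the handwriting.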

Solution 2:

There are many OCR engines out there, and some perform better than others. However, this is a field that has improved a lot recently thanks to deep neural networks. I would consider using a cloud provider such as Azure, Google Cloud or Amazon. You upload the image and they return the recognized text and its metadata.

For instance: https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/
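To give an idea of the upload-and-poll flow, here is a minimal sketch against the Azure Computer Vision Read REST API as I understand version 3.2 of it. The URL path, header names and response fields are written from memory and may have changed, so check them against the current Azure documentation. The function names, endpoint and key are placeholders of mine.

```python
import json
import time
import urllib.request


def extract_lines(result):
    """Collect the recognized text lines from a finished Read result."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


def read_image(path, endpoint, key, poll_seconds=1.0, max_polls=30):
    """Upload one image, then poll until the service reports a final status."""
    with open(path, "rb") as f:
        body = f.read()
    submit = urllib.request.Request(
        endpoint.rstrip("/") + "/vision/v3.2/read/analyze",  # assumed API version
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(submit) as response:
        # The service accepts the job and returns the result URL in this header.
        result_url = response.headers["Operation-Location"]

    poll = urllib.request.Request(
        result_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as response:
            result = json.load(response)
        if result.get("status") in ("succeeded", "failed"):
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("the Read operation did not finish in time")

    if result.get("status") != "succeeded":
        raise RuntimeError("the Read operation failed")
    return extract_lines(result)
```

A call would look like read_image("img1.jpg", "https://<your-resource>.cognitiveservices.azure.com", "<your-key>"), where both values come from your own Azure resource. Google Cloud and Amazon offer similar services with their own request formats.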

If you don't want to use cloud services for any reason, I would consider using TensorFlow, but some machine learning knowledge is required:

Tensorflow model for OCR
