This repo shows how to extract information from forms saved as images. For example, you can take a scanned form
and get structured data from it:
# Data Output
{'cargo': 'AGENTE FISCAL 2014-08-01',
'ciudad': '‘Quito',
'civil': '‘SOLTERO',
'desde': '2014-08-01',
'gestion': 'INICIO DE GESTION x PERIODICA FIN DE GESTION ',
'hasta': '',
'institucion': 'FISCAIA GENERAL DEL ESTADO, AGENTE ‘4. INFORMACION PATRIMONIAL',
'prov': 'PICHINCHA,'}
I show examples using two OCR engines: Tesseract and Google Cloud Vision.
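To make the idea concrete, here is a minimal sketch of one way to turn OCR word boxes into a field dictionary like the one above. It is not necessarily how this repo does it: the `FIELD_REGIONS` coordinates and the helper names `assign_words_to_fields` and `ocr_form` are hypothetical, and the sketch assumes the `pytesseract` and `Pillow` packages for the Tesseract path. Each recognized word goes to the form field whose rectangle contains the center of the word.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# Hypothetical field regions; real coordinates depend on the form layout.
FIELD_REGIONS: Dict[str, Box] = {
    "ciudad": (0, 0, 200, 50),
    "prov": (200, 0, 400, 50),
}


def assign_words_to_fields(ocr: Dict[str, list], regions: Dict[str, Box]) -> Dict[str, str]:
    """Group OCR words by the region that contains each word's center.

    `ocr` uses the column layout of pytesseract.image_to_data(...,
    output_type=Output.DICT): parallel lists under the keys
    'text', 'left', 'top', 'width', and 'height'.
    """
    fields: Dict[str, List[str]] = {name: [] for name in regions}
    rows = zip(ocr["text"], ocr["left"], ocr["top"], ocr["width"], ocr["height"])
    for text, left, top, width, height in rows:
        word = text.strip()
        if not word:
            continue  # skip empty entries (page, block, and line rows)
        cx = left + width / 2
        cy = top + height / 2
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= cx < x1 and y0 <= cy < y1:
                fields[name].append(word)
                break
    return {name: " ".join(words) for name, words in fields.items()}


def ocr_form(image_path: str, regions: Dict[str, Box] = FIELD_REGIONS) -> Dict[str, str]:
    """Run Tesseract on an image file and map the words to form fields."""
    # Imported here so assign_words_to_fields works without Tesseract installed.
    import pytesseract
    from PIL import Image

    ocr = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    return assign_words_to_fields(ocr, regions)
```

Keeping the grouping step separate from the OCR call means the same function could be reused with word boxes from another engine, such as Google Cloud Vision, once they are converted to the same column layout.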