OCR for Digitizing Health Records

Challenge Description

Charitable Analytics International (CAI) is a Montreal-based NGO working to improve the digitization of health records. It partners with organizations operating in over 80 countries, delivering much-needed supplies to health clinics, schools, refugee camps, and more. The areas where they operate often lack electricity, network connectivity, and staff with computer skills, so paper is the only feasible form of bookkeeping. Currently, the paper logbooks are transported back to the city, where staff manually tally up every page. This process is tedious, expensive, and error-prone.

Challenge Approach

Meza, the web platform created by Charitable Analytics, allows images of logbooks to be sent from the field, where they are currently hand-labelled. You can use the training dataset to build your model, test your approach on the validation set, and send your results to the Meza web server to see how well you have done!

Dataset

Training/Test set: ~8,000 labeled cells
Validation set: ~1,000 labeled cells
Format: MNIST (can be loaded using PyTorch, Keras, etc.)
The images contain values expressed by this regex: ^[><]?[+-]?[0-9][.\,]?[0-9]$
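As a sanity check, predicted labels can be validated against the regex above. A minimal sketch in Python; the `is_valid_label` helper is our own illustration, not part of the dataset tooling:

```python
import re

# Regex from the dataset description: optional inequality sign, optional sign,
# a digit, an optional decimal separator ('.' or ','), and a second digit.
CELL_RE = re.compile(r"^[><]?[+-]?[0-9][.\,]?[0-9]$")

def is_valid_label(label: str) -> bool:
    """Return True if a cell label matches the dataset's value format."""
    return CELL_RE.fullmatch(label) is not None

print(is_valid_label("12"))    # True: two plain digits
print(is_valid_label(">1,5"))  # True: inequality sign plus comma separator
print(is_valid_label("-3.4"))  # True: negative decimal
print(is_valid_label("5"))     # False: the regex requires exactly two digits
```

Note that the pattern only accepts two-digit values, so single digits and three-digit numbers fall outside it.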

How we built it

We started from the PyTorch tutorial on image captioning and adapted it to the OCR problem. That model is composed of an encoder (a ResNet-152 with its last layer removed) and an LSTM decoder. We replaced the encoder with a LeNet-5 whose dense layers we modified. We played with the learning rate and the batch size, but the model could not exceed 17% accuracy on our early-stopping validation set.
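The encoder-decoder shape described above can be sketched as follows. This is a minimal illustration, not our exact configuration: the embedding and hidden sizes, the 28x28 single-channel input, and the vocabulary size are assumptions.

```python
import torch
import torch.nn as nn

class LeNetEncoder(nn.Module):
    """LeNet-5-style CNN whose dense layer is resized to emit an embedding."""
    def __init__(self, embed_size=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
        )
        # For a 28x28 input, the feature map is 16 channels of 4x4.
        self.fc = nn.Linear(16 * 4 * 4, embed_size)

    def forward(self, images):
        return self.fc(self.features(images).flatten(1))

class LSTMDecoder(nn.Module):
    """Caption-style decoder: image embedding first, then teacher-forced tokens."""
    def __init__(self, vocab_size, embed_size=256, hidden_size=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, img_embedding, captions):
        tokens = self.embed(captions)
        seq = torch.cat([img_embedding.unsqueeze(1), tokens], dim=1)
        hidden, _ = self.lstm(seq)
        return self.out(hidden)  # (batch, 1 + caption_len, vocab_size)

encoder = LeNetEncoder()
decoder = LSTMDecoder(vocab_size=15)
images = torch.zeros(2, 1, 28, 28)
captions = torch.zeros(2, 5, dtype=torch.long)
logits = decoder(encoder(images), captions)
```

At inference time the decoder would instead be unrolled token by token, feeding each prediction back in until an end-of-sequence symbol.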

We used vast.ai to train the models.

Challenges we ran into

We didn't have time to modify the base model as much as we would have liked. Image captioning is quite different from OCR, and we think further changes to the architecture would have been necessary.

Accomplishments that we're proud of

We are not proud of our results. However, we are proud of the progress we made in understanding deep neural networks.

What we learned

We learned a lot during this hackathon: it was our first time working on image-to-sequence problems, and for some of us the first time using PyTorch. We gained a lot of efficiency during the hackathon by getting more familiar with the PyTorch API and vast.ai.

What's next for jhan

Increase the score and try other architectures. We would like to try CTCLoss, which we later discovered works well for OCR problems.
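For reference, PyTorch ships this loss as `nn.CTCLoss`. A minimal usage sketch with illustrative shapes (the 15-class vocabulary, meaning a blank plus digits and symbols, and the sequence lengths are assumptions, not values from our project):

```python
import torch
import torch.nn as nn

ctc = nn.CTCLoss(blank=0)

# Model output: (T, N, C) = (time steps, batch, classes), log-probabilities.
log_probs = torch.randn(20, 2, 15).log_softmax(dim=2)

# Targets: (N, S) class indices, never containing the blank index 0.
targets = torch.tensor([[3, 1, 4], [2, 7, 2]])
input_lengths = torch.full((2,), 20, dtype=torch.long)
target_lengths = torch.full((2,), 3, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

CTC sums over all alignments of the target string to the output frames, so no per-character segmentation of the cell image is needed, which is what makes it attractive for OCR.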

Built With

PyTorch, vast.ai