Hajek, Bretislav (School: Gymnazium Cesky Brod)
This project develops OCR software that recognizes cursive handwriting in photographs of pages. The initial goal was to convert handwritten notes into digital form; such software could be used in schools, in archives, or for analyzing forms. The program works in four main steps: background removal, word detection, word normalization, and word recognition. The first three steps rely on computer vision algorithms, and the last step performs word recognition using machine learning. Word recognition was the main focus. Several machine learning models were trained on a dataset of 5,000 words. A method that first separates the letters and then recognizes them, combined with a spelling corrector, reached the highest accuracy, 80.0%. Based on this best-performing method and a sequence-to-sequence model, a new machine learning model was created. This model can be trained on images of whole words and performs comparably to the connectionist temporal classification (CTC) model, reaching up to 76.2% accuracy during testing. The new model performs best when the input image is split into a sequence of two-pixel-wide segments and, unlike the CTC model, it did not need further tuning of the input segment size. The first three steps work well as long as the input meets the assumed conditions. The method that divides word recognition into two steps reached the highest accuracy, but it requires a more complex dataset. The new model and the CTC model have the major advantage that they can be trained on images of whole words.
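To make the idea of feeding a word image to a sequence model as two-pixel-wide segments more concrete, the following is a minimal sketch and not the project's actual code. It assumes a grayscale word image stored as a list of pixel rows; the function name `split_into_segments`, the `pad_value` parameter, and the choice to pad the last slice are all hypothetical choices made for illustration.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values (assumed representation)


def split_into_segments(image: Image, width: int = 2, pad_value: int = 0) -> List[Image]:
    """Cut a word image into left-to-right vertical slices of `width` columns.

    Hypothetical helper: the last slice is padded with `pad_value` so that
    every segment has the same shape.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not image or not image[0]:
        return []
    n_cols = len(image[0])
    if any(len(row) != n_cols for row in image):
        raise ValueError("all rows must have the same length")

    segments: List[Image] = []
    for start in range(0, n_cols, width):
        segment = []
        for row in image:
            piece = row[start:start + width]
            # Pad the final slice when the image width is not a multiple of `width`.
            piece = piece + [pad_value] * (width - len(piece))
            segment.append(piece)
        segments.append(segment)
    return segments


# A tiny 2x5 "image": five columns yield three two-pixel-wide segments.
word = [
    [0, 1, 2, 3, 4],
    [5, 6, 7, 8, 9],
]
segments = split_into_segments(word, width=2)
```

Each element of `segments` would then act as one time step of the input sequence. Changing `width` is the segment-size setting that, according to the abstract, needed tuning for the CTC model but not for the new model.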