Next Generation OCR using Machine Learning

alpha2phi
3 min readSep 19, 2021

Next-generation character recognition using machine learning.

Photo by Romain Vignes on Unsplash

Overview

In this article, I am going to develop a React + FastAPI application that captures images from the webcam and uses an OCR-trained model to detect the text.

PaddleOCR is the OCR toolkits that I use. From my testing, it is a lightweight and high-performing ML OCR library. It supports over 80 languages and can be deployed in server, mobile, embedded, and IoT devices using languages like Python, C++, etc.

The Application

The application captures the image using the webcam and invokes the backend API to recognize the text.

Regions, where the characters are detected, are boxed, and the results are displayed with the probabilities.

Character Recognition Application

More examples are shown below.

News and Document

News and Document

Invoice

Invoice

Number Plate

Number Plate Recognition

CAPTCHA

CAPTCHA

Custom Font

Custom Font

The Code

Backend

The backend APIs are developed using FastAPI. WebSocket is used to send and receive data.

alpha2phi

Software engineer, Data Science and ML practitioner.