Build a Web-based Text Recognition System

alpha2phi
3 min read · Feb 21, 2021
Photo by Markus Spiske on Unsplash

Overview

In a previous article, I walked through how to build a real-time object detection system using YOLOv5. In this article, let’s build a web-based text recognition system using Tesseract OCR, an open-source optical character recognition engine that runs on various operating systems.

Tesseract

I am going to use Tesseract version 4.1. English is the default installed language, but you can install data for other supported languages as well. Installation differs by operating system, so refer to the Tesseract documentation for the steps on your platform.

Tesseract Command Line
Tesseract Supported Languages
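To make the command-line usage concrete, here is a minimal sketch that wraps the `tesseract` CLI from Python. It assumes the `tesseract` executable is installed and on your `PATH`. The helper names `build_tesseract_command` and `run_ocr` are my own for illustration and are not part of Tesseract.

```python
import shutil
import subprocess
from typing import List


def build_tesseract_command(image_path: str, lang: str = "eng") -> List[str]:
    """Build the argument list for a one-shot OCR run (hypothetical helper)."""
    # Passing "stdout" as the output base makes Tesseract print the
    # recognized text instead of writing it to a .txt file.
    # "-l" selects the language data to use, e.g. "eng".
    return ["tesseract", image_path, "stdout", "-l", lang]


def run_ocr(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on an image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract exits with an error
    )
    return result.stdout.strip()
```

For example, `run_ocr("receipt.png")` would return whatever text Tesseract finds in a file named `receipt.png` (a placeholder name). To see which languages are installed, you can run `tesseract --list-langs` in a terminal.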

The Application

  • The web application uses a camera to capture the photo and sends the image to the backend API using WebSocket.
  • The API uses Tesseract to extract text from the image and sends the output back to the web application.
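The two steps above can be sketched on the backend side roughly as follows. This is not the code from the application itself. It assumes the browser sends each frame as a base64 data URL in a WebSocket text message and that the reply is a small JSON object. Both are assumptions on my part, as are the function names. The sketch deliberately leaves out the WebSocket server, so `handle_message` can be called from whichever framework you use.

```python
import base64
import json
import os
import subprocess
import tempfile
from typing import Callable


def decode_image_message(message: str) -> bytes:
    """Turn a data URL (or a bare base64 string) into raw image bytes."""
    # Assumption: the payload looks like "data:image/png;base64,<data>".
    # Everything after the last comma is the base64 body; if there is no
    # comma, the whole message is treated as base64.
    _, _, payload = message.rpartition(",")
    return base64.b64decode(payload, validate=True)


def ocr_image_bytes(image_bytes: bytes, lang: str = "eng") -> str:
    """Write the bytes to a temporary file and run the tesseract CLI on it."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(image_bytes)
        path = tmp.name
    try:
        result = subprocess.run(
            ["tesseract", path, "stdout", "-l", lang],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    finally:
        os.remove(path)  # always clean up the temporary image


def handle_message(
    message: str, ocr: Callable[[bytes], str] = ocr_image_bytes
) -> str:
    """Process one incoming message and build the reply to send back."""
    text = ocr(decode_image_message(message))
    # Hypothetical reply format: a JSON object with a single "text" field.
    return json.dumps({"text": text})
```

Passing the OCR function in as a parameter keeps the message handling easy to test without Tesseract installed, and makes it simple to swap in a different OCR wrapper later.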

As you can see, the quality of the image affects the results. Even so, Tesseract does a pretty decent job of extracting the information.

Real-time Text Recognition

The Code

Front-End

The front-end is a simple React application that uses the device camera to capture a photo. When the application starts, it connects to the backend over a WebSocket.

The Backend

