Installing Tesseract, PyTesseract, and Python OCR packages on your system

In this tutorial, we will configure our development environment for OCR. Once your machine is configured, we’ll start writing Python code to perform OCR, paving the way for you to develop your own OCR applications.

To learn how to configure your development environment, just keep reading.

Learning Objectives

In this tutorial, you will:

Learn how to install the Tesseract OCR engine on your machine
Learn how to create a Python virtual environment (a best practice in Python development)
Install the necessary Python packages you need to run the examples in this tutorial (and develop OCR projects of your own)

OCR Development Environment Configuration

In the first part of this tutorial, you will learn how to install the Tesseract OCR engine on your system. From there, you’ll learn how to create a Python virtual environment and then install OpenCV, PyTesseract, and all the other necessary Python libraries you’ll need for OCR, computer vision, and deep learning.

A Note on Install Instructions

The Tesseract OCR engine has existed for over 30 years, and its install instructions are fairly stable, so I have included the steps here.

With that said, let’s install the Tesseract OCR engine on your system!

Installing Tesseract

In this section, you will learn how to install Tesseract on macOS, Ubuntu, and Windows.

Installing Tesseract on macOS

Installing the Tesseract OCR engine on macOS is quite simple if you use the Homebrew package manager.

If Homebrew is not already installed on your system, follow the install instructions on the Homebrew website first.

From there, all you need to do is use the brew command to install Tesseract:

 $ brew install tesseract

Provided that the above command does not exit with an error, you should now have Tesseract installed on your macOS machine.

Installing Tesseract on Ubuntu

Installing Tesseract on Ubuntu 18.04 is easy. All we need to do is use apt:

 $ sudo apt install tesseract-ocr

The apt package manager will automatically install any prerequisite libraries or packages required for Tesseract.

Installing Tesseract on Windows

Please note that the PyImageSearch team and I do not officially support Windows, except for customers who use our pre-configured Jupyter/Colab Notebooks, which you can find at PyImageSearch University. These notebooks run on all environments, including macOS, Linux, and Windows.

We instead recommend using a Unix-based machine such as Linux/Ubuntu or macOS, both of which are better suited for developing computer vision, deep learning, and OCR projects.

That said, if you wish to install Tesseract on Windows, we recommend that you follow the official Windows install instructions put together by the Tesseract team.

Verifying Your Tesseract Install

Provided that you were able to install Tesseract on your operating system, you can verify that Tesseract is installed by using the tesseract command:

 $ tesseract -v
 tesseract 4.1.1
  leptonica-1.79.0
   libgif 5.2.1 : libjpeg 9d : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.1.0 : libopenjp2 2.3.1
  Found AVX2
  Found AVX
  Found FMA
  Found SSE

Your output should look similar to mine.
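If you would rather run this check from Python, the snippet below is a minimal sketch of one way to do it. It assumes only the Python standard library and the tesseract -v command shown above; the function names parse_tesseract_version and installed_tesseract_version are hypothetical helpers made up for this example and are not part of Tesseract or PyTesseract.

```python
import shutil
import subprocess


def parse_tesseract_version(output):
    """Return the version from the first 'tesseract X.Y.Z' line, or None."""
    for line in output.splitlines():
        parts = line.strip().split()
        if len(parts) >= 2 and parts[0].lower() == "tesseract":
            return parts[1]
    return None


def installed_tesseract_version():
    """Return the installed Tesseract version, or None if it is not on PATH."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(["tesseract", "-v"], capture_output=True, text=True)
    # Some Tesseract releases print the version banner to stderr,
    # so we read both streams (an assumption made to stay portable).
    return parse_tesseract_version(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    version = installed_tesseract_version()
    if version is None:
        print("Tesseract not found on PATH")
    else:
        print(f"Found Tesseract {version}")
```

A result of None usually means the tesseract binary is not on your PATH, in which case PyTesseract will not be able to find it either.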

Creating a Python Virtual Environment for OCR

Python virtual environments are a best practice for Python development, and we recommend using them to have more reliable development environments.

Our pip Install OpenCV tutorial covers how to install the packages needed for Python virtual environments and how to create your first one. We recommend following that tutorial before continuing.
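As an alternative for readers who only want a quick environment, here is a minimal sketch that uses Python's built-in venv module. This is not the method from the pip Install OpenCV tutorial, and the workon command used later in this tutorial comes from virtualenvwrapper, which does not manage environments created this way. The function name create_ocr_env and the directory name ocr-env are hypothetical and chosen only for illustration.

```python
import venv
from pathlib import Path


def create_ocr_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path."""
    env_path = Path(env_dir).expanduser()
    # with_pip=True bootstraps pip into the new environment.
    venv.create(env_path, with_pip=with_pip)
    return env_path


if __name__ == "__main__":
    # "ocr-env" is a hypothetical directory name; pick any location you like.
    path = create_ocr_env("ocr-env")
    print(f"Created virtual environment at {path.resolve()}")
```

On macOS and Linux, an environment created this way is activated with source ocr-env/bin/activate rather than workon.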

Installing OpenCV and PyTesseract

Now that you have your Python virtual environment created and ready, we can install both OpenCV and PyTesseract, the Python package that interfaces with the Tesseract OCR engine.

Both of these can be installed using the following commands:

 $ workon <name_of_your_env> # required if using virtual envs
 $ pip install numpy opencv-contrib-python
 $ pip install pytesseract
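One detail that often trips people up is that the pip package name and the import name can differ: opencv-contrib-python is imported as cv2. The sketch below checks that the three packages can be located from the active environment, assuming only the standard library. The IMPORT_NAMES mapping and the find_missing helper are hypothetical names introduced for this example.

```python
import importlib.util

# Maps pip package names from the commands above to the module names
# used in import statements.
IMPORT_NAMES = {
    "numpy": "numpy",
    "opencv-contrib-python": "cv2",
    "pytesseract": "pytesseract",
}


def find_missing(import_names):
    """Return the pip names whose top-level module cannot be located."""
    return [
        pip_name
        for pip_name, module_name in import_names.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = find_missing(IMPORT_NAMES)
    if missing:
        print(f"Missing: {', '.join(missing)}")
    else:
        print("All packages importable")
```

If anything is reported as missing, double-check that your virtual environment is active before re-running the pip commands.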

Next, we’ll install other Python packages we’ll need for OCR, computer vision, deep learning, and machine learning.

Installing Other Computer Vision, Deep Learning, and Machine Learning Libraries

Let’s now install some other supporting computer vision and machine learning/deep learning packages that we’ll need throughout the rest of this tutorial:

 $ pip install pillow scipy
 $ pip install scikit-learn scikit-image
 $ pip install imutils matplotlib
 $ pip install requests beautifulsoup4
 $ pip install h5py tensorflow textblob
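To confirm what actually landed in your environment, the following sketch lists the installed version of each package from the commands above. It assumes Python 3.8 or newer, where importlib.metadata is part of the standard library. The PACKAGES list and the installed_versions helper are hypothetical names introduced for this example.

```python
from importlib import metadata

# Distribution names taken from the pip commands above.
PACKAGES = [
    "pillow", "scipy", "scikit-learn", "scikit-image", "imutils",
    "matplotlib", "requests", "beautifulsoup4", "h5py", "tensorflow",
    "textblob",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:16} {version or 'NOT INSTALLED'}")
```

Any package printed as NOT INSTALLED can be installed again with the matching pip command.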

Summary

In this tutorial, you learned how to install the Tesseract OCR engine on your machine. You also learned how to install the required Python packages you will need to perform OCR, computer vision, and image processing.

Now that your development environment is configured, we will start writing OCR code in our next tutorial!

The post Installing Tesseract, PyTesseract, and Python OCR packages on your system appeared first on PyImageSearch.
