Last Update: Jun 7, 2024

If you want to read text from an image with a simple Python script, this tutorial is for you. Thanks to the work of many great people over the last few decades, you can read the text from an image with a few lines of code. Really! Let’s jump in.

How to read text from an image with Python

What is OCR? Tesseract?

Optical Character Recognition, or OCR, has been around for a long time. It’s a technique that “reads” different types of documents and converts them into editable and searchable text. It works by recognizing characters in the image and converting them into machine-readable text. It’s a lot of magic, but it works well.

Tesseract is an open-source OCR engine originally developed at Hewlett-Packard and later sponsored by Google. It is accurate on clean, printed text and supports more than 100 languages. This library will do all the heavy lifting for us. We’ll use it in this tutorial to quickly read the text in some images.

Step 1: Set up your Python Environment

First, you’ll need to make sure Python is installed. We’re going to create a virtual environment.

I’m using Linux, so I’ll create a virtual environment in a directory named textreader by typing:

python -m venv textreader

Then activate it:

source textreader/bin/activate
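
If you’re not sure the environment is active, you can ask Python itself. This is a minimal sketch: the helper name in_virtualenv is my own, and it relies on sys.prefix differing from sys.base_prefix inside a venv.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the system-wide Python install.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter prefix:", sys.prefix)
```

When the environment is active, the prefix should end in textreader.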

Step 2: Install the Required Libraries

First, we’ll need to install Tesseract on your system. Follow the Tesseract installation instructions for your operating system.

Make sure Tesseract is installed by typing:

tesseract -v

and you should see output that looks like this:

[Screenshot: terminal output of tesseract -v]
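
The same check can be done from Python, which is handy if a script needs to fail early with a clear message. This is a sketch using only the standard library. The function name tesseract_version is hypothetical, and it assumes the executable is called tesseract and is on your PATH.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of the Tesseract version banner, or None if unavailable."""
    # shutil.which searches PATH much like the shell does.
    executable = shutil.which("tesseract")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr in case the banner is printed there
        text=True,
        check=False,
    )
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


version = tesseract_version()
print(version or "Tesseract was not found on PATH")
```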

Then, we’ll install a couple of Python libraries.

Pytesseract is a Python wrapper for the Tesseract OCR engine, which makes Tesseract easy to call from Python applications.

Pillow is the actively maintained fork of the Python Imaging Library (PIL). It handles image loading, processing, and manipulation, and it can pre-process images before OCR, for example by thresholding them, to improve the accuracy of the reading.

Install both for our first application:

pip install pytesseract
pip install pillow

Your output should look something like this:

[Screenshot: terminal output of the pip install commands]

In some cases, like above, pip may report “Requirement already satisfied” for Pillow.

And we’re ready to go.
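
If you want to double-check that both libraries landed in the active environment, something like the following should work on Python 3.8 or newer. It’s a sketch; installed_versions is a name I made up for illustration, and it only reads package metadata without importing the libraries.

```python
from importlib import metadata


def installed_versions(packages=("pytesseract", "pillow")):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```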

Step 3: Select your Image

To start out, I’m going to choose something easy. I’ll use a screenshot from my website. This will be clear, easy-to-read text that should work great.

[Image: screenshot of text from my website]

I’ll save that as image-1.jpg in my folder.

Step 4: Write the Script

Now, we’re ready to build our Python script to read the text from that image and output it to the screen.

First, we’ll import the libraries:

import pytesseract
from PIL import Image

Then open the image:

image = Image.open('image-1.jpg')

And then, we’ll use Tesseract to convert the text in the image to a string. Didn’t I say this library does all the heavy lifting for us?

text = pytesseract.image_to_string(image)

Finally, we’ll print it out:

print(text)
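
If you plan to reuse these steps, one option is to wrap them in a small function. This is only a sketch of how I might do it. The name read_text and the path check are my additions, and lang="eng" assumes the English language data is installed alongside Tesseract.

```python
from pathlib import Path


def read_text(image_path, lang="eng"):
    """Return the text Tesseract finds in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No image found at {path}")

    # Imported here rather than at module top so that importing this helper
    # does not require the OCR libraries until it is actually called.
    import pytesseract
    from PIL import Image

    # The context manager closes the file once Tesseract has read it.
    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)


# Example call, using the file name from this tutorial:
#   print(read_text("image-1.jpg"))
```

Checking the path first means a typo in the file name produces a clear error before any OCR work starts.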

Let’s run it and see what it looks like.

Step 5: Watch the Magic Happen

We run our script and get this:

[Screenshot: text printed by the script for image-1.jpg]

Awesome! It’s not perfect, but it’s pretty darn good. You can read the text from the image we sent, and the layout roughly matches the image.
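
If you’re curious which words Tesseract was unsure about, pytesseract can also return per-word confidence scores through image_to_data. Here’s a rough sketch of filtering on them. The helper confident_words and the cutoff of 60 are my own choices, not recommendations from the library, and the sample dictionary is hand-made rather than real OCR output.

```python
def confident_words(data, min_conf=60.0):
    """Keep words whose confidence is at least min_conf."""
    kept = []
    for word, conf in zip(data["text"], data["conf"]):
        # Rows without a recognized word have empty text and a confidence of -1.
        if word.strip() and float(conf) >= min_conf:
            kept.append(word)
    return kept


# Hypothetical usage with the image from this tutorial:
#   import pytesseract
#   from PIL import Image
#   data = pytesseract.image_to_data(
#       Image.open("image-1.jpg"), output_type=pytesseract.Output.DICT
#   )
#   print(confident_words(data))

# A hand-made sample shaped like that dictionary (not real OCR output):
sample = {"text": ["", "Hello", "w0rld"], "conf": ["-1", "96", "41"]}
print(confident_words(sample))  # → ['Hello']
```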

Congrats! You can now read the text from images in Python. Next, we’ll look at some more advanced stuff.

Learning the Limitations

In our first example, we had a very clear image. The text is formatted and crisp in that image, so it’s easy to read. Let’s step it up a bit.

I picked a more challenging image, one from Pexels, that isn’t quite so easy.

[Image: the more challenging photo from Pexels]

Let’s see what the output is when reading this image:

[Screenshot: the script’s empty output]

Oof. Nothing. I included this because it’s important to know the limitations of this process. Unusual fonts and different angles will affect how well this works. There isn’t much we can do to read this image without some extensive work.
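
For images that are merely low-contrast or noisy, a little pre-processing with Pillow sometimes helps. The sketch below converts to grayscale, stretches the contrast, and applies a simple threshold. The function name preprocess_for_ocr and the threshold of 128 are assumptions of mine, and I wouldn’t expect this alone to rescue a photo with unusual fonts or steep angles like the one above.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(image, threshold=128):
    """Return a black-and-white copy of image that may be easier to OCR."""
    gray = ImageOps.grayscale(image)         # drop color information
    stretched = ImageOps.autocontrast(gray)  # spread pixel values over 0..255
    # Map every pixel to pure black or pure white around the threshold.
    return stretched.point(lambda value: 255 if value >= threshold else 0)


# Hypothetical usage with a file named image-2.jpg:
#   import pytesseract
#   with Image.open("image-2.jpg") as original:
#       print(pytesseract.image_to_string(preprocess_for_ocr(original)))
```

The threshold is worth experimenting with, since the best value depends on the lighting in each image.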

Conclusion

In this tutorial, we learned how to use Tesseract to read text from an image and put it into a machine-readable form. We can read many other things with OCR, and we’ll take a deeper look at some of them in future articles.

Feel free to play around with this and see what you can come up with! In a future tutorial, we’ll use OpenCV to refine things and do more pre-processing of the images we’ll read from. It will be fun.

Bookmark this blog and come back for more cool Python tutorials.

Questions? Comments? Yell at me!



Published: Oct 11, 2023 by Jeremy Morgan. Contact me before republishing this content.