How to Read Text From an Image with Python
Last Update: Jun 7, 2024
If you want to read text from an image with a simple Python script, this tutorial is for you. Thanks to the work of many great people over the last few decades, you can read the text from an image with a few lines of code. Really! Let’s jump in.
What is OCR? Tesseract?
Optical Character Recognition, or OCR, has been around for a long time. It’s a technique that “reads” different types of documents and converts them into editable and searchable text. It works by recognizing characters in the image and converting them into machine-readable text. It feels like magic, but it works well.
Tesseract is an open-source OCR engine, originally developed at Hewlett-Packard and later sponsored by Google. It is accurate on clean, printed text and supports many languages. This library will do all the heavy lifting for us. We’ll use it in this tutorial to quickly read the text in some images.
Step 1: Set up your Python Environment
First, you’ll need to make sure Python is installed. We’re going to create a virtual environment.
- How to install Python and set up a virtual environment in Windows
- How to set up your Python environment on a Mac
- How to set up a Python environment in Linux
I’m using Linux, so I’ll create a virtual environment in a directory named textreader by typing:
python -m venv textreader
Then activate it:
source textreader/bin/activate
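If you want to confirm from inside Python that the virtual environment is active, here’s a quick check. It’s a small sketch, not a required step: inside an environment created with python -m venv, sys.prefix and sys.base_prefix point to different places. The function name in_virtualenv is just made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python install.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
```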
Step 2: Install the Required Libraries
First, we’ll need to install Tesseract on your system. Here are the instructions to install Tesseract on your chosen operating system.
Make sure Tesseract is installed by typing:
tesseract -v
You should see the Tesseract version number, followed by the versions of the libraries it was built with.
Then, we’ll install a couple of Python libraries.
Pytesseract is a Python wrapper for the Tesseract OCR engine, which makes Tesseract easy to use in Python applications. We’ll install that and Pillow.
Pillow is a fork of the Python Imaging Library (PIL). It’s used for image processing and manipulation. Here it opens the image for us, and it can also pre-process images before OCR, with steps like thresholding, to improve the accuracy of the reading.
Next, we’ll install Pytesseract and Pillow together for our first application:
pip install pytesseract
pip install pillow
pip will print its progress as it downloads and installs each package. In some cases, it may say “Requirement already satisfied” for Pillow, which just means it’s already installed.
And we’re ready to go.
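If you’d like to double-check the setup before moving on, here’s a small sketch that uses only the standard library. It looks for the tesseract executable on your PATH and checks that the two Python packages can be imported. The function name check_ocr_setup is made up for this example.

```python
import importlib.util
import shutil


def check_ocr_setup() -> dict:
    """Report whether the pieces installed in this step can be found."""
    return {
        # pytesseract calls the tesseract executable, so it needs to be on PATH.
        "tesseract": shutil.which("tesseract") is not None,
        # find_spec returns None when a package cannot be imported.
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # Pillow is imported under the name PIL.
        "pillow": importlib.util.find_spec("PIL") is not None,
    }


if __name__ == "__main__":
    for name, found in check_ocr_setup().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If tesseract shows up as missing even though it’s installed, it’s probably not on your PATH. In that case you can point pytesseract at it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.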
Step 3: Select your Image
To start out, I’m going to choose something easy. I’ll use a screenshot from my website. This will be clear, easy-to-read text that should work great.
I’ll save that as image-1.jpg in my folder.
Step 4: Write the Script
Now, we’re ready to build our Python script to read the text from that image and output it to the screen.
First, we’ll import the libraries:
import pytesseract
from PIL import Image
Then open the image:
image = Image.open('image-1.jpg')
And then, we’ll use Tesseract to convert the text in the image to a string. Didn’t I say this library does all the heavy lifting for us?
text = pytesseract.image_to_string(image)
Finally, we’ll print it out:
print(text)
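Put together, the steps above can be wrapped into one small script. This is only a sketch with a bit of error handling added, not the one true way to do it. The function name read_text and the error messages are made up for this example.

```python
from pathlib import Path

try:
    import pytesseract
    from PIL import Image
except ImportError:  # the libraries from Step 2 are not installed
    pytesseract = None
    Image = None


def read_text(image_path: str) -> str:
    """Return the text Tesseract finds in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No image found at {path}")
    if pytesseract is None or Image is None:
        raise RuntimeError("Install pytesseract and pillow first (see Step 2)")
    # The with-block closes the image file once OCR is done.
    with Image.open(path) as image:
        return pytesseract.image_to_string(image)


if __name__ == "__main__":
    try:
        print(read_text("image-1.jpg"))
    except (OSError, RuntimeError) as err:
        # Covers a missing file, an unreadable image, or a missing Tesseract install.
        print(f"Could not read text: {err}")
```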
Let’s run it and see what it looks like.
Step 5: Watch the Magic Happen
We run our script and get this:
Awesome! It’s not perfect, but it’s pretty darn good. You can read the text from the image we sent, and it mostly keeps the formatting of the original image.
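OCR output often comes back with stray trailing spaces and extra blank lines. If that bothers you, a little plain-Python cleanup helps. This is a sketch, and the function name tidy_ocr_text is made up for this example. It only touches whitespace, so it won’t fix misread characters.

```python
def tidy_ocr_text(text: str) -> str:
    """Strip trailing spaces and collapse runs of blank lines into one."""
    cleaned = []
    previous_blank = True  # treat the start as blank so leading blank lines are dropped
    for line in text.splitlines():
        line = line.rstrip()
        is_blank = not line
        if is_blank and previous_blank:
            continue  # skip repeated blank lines
        cleaned.append(line)
        previous_blank = is_blank
    # Remove a blank line left over at the very end.
    while cleaned and not cleaned[-1]:
        cleaned.pop()
    return "\n".join(cleaned)
```

You’d use it on the string from Step 4, for example print(tidy_ocr_text(text)).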
Congrats! You can now read the text from images in Python. Next, we’ll look at some more advanced stuff.
Learning the Limitations
In our first example, we had a very clear image. The text is formatted and crisp in that image, so it’s easy to read. Let’s step it up a bit.
I picked a more challenging image, one from Pexels, that isn’t quite so easy.
Let’s see what the output is when reading this image:
Oof. Nothing. I included this because it’s important to know the limitations of this process. Unusual fonts and different angles will affect how well this works. There isn’t much we can do to read this image without some extensive work.
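For milder cases, like small or low-contrast text, some simple pre-processing with Pillow is worth a try before giving up. Here’s a sketch that converts the image to grayscale, stretches the contrast, enlarges it, and applies a threshold. The function name preprocess_for_ocr and the default values for threshold and scale are assumptions picked for illustration, and I’m not promising it rescues the image above, since unusual fonts and angles need more than this.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(image: Image.Image, threshold: int = 128, scale: int = 2) -> Image.Image:
    """Return a grayscale, enlarged, black-and-white copy of the image."""
    # Drop color information; OCR mostly cares about light versus dark.
    gray = ImageOps.grayscale(image)
    # Stretch the brightness range so faint text stands out more.
    gray = ImageOps.autocontrast(gray)
    # Enlarge the image, which can help when the text is small.
    bigger = gray.resize((gray.width * scale, gray.height * scale))
    # Thresholding: pixels brighter than the cutoff become white, the rest black.
    return bigger.point(lambda value: 255 if value > threshold else 0)
```

To use it, open the image with Image.open as in Step 4, pass it through preprocess_for_ocr, and hand the result to pytesseract.image_to_string. Expect to experiment with the threshold value for each image.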
Conclusion
In this tutorial, we learned how to use Tesseract to read text from an image and put it into a machine-readable form. We can read many other things with OCR, and we’ll deep dive into some of this stuff in future articles.
Feel free to play around with this and see what you can come up with! In a future tutorial, we’ll use OpenCV to refine things and do more pre-processing of the images we’ll read from. It will be fun.
Bookmark this blog and come back for more cool Python tutorials.