Pytesseract OCR multiple config options
tesseract-4.0.0a supports the page segmentation modes (psm) listed below. For single-character recognition, set psm to 10. If your text consists of digits only, you can also set tessedit_char_whitelist=0123456789.
Page segmentation modes:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
Here is a sample usage of image_to_string with multiple parameters. The boxes argument accepted by older pytesseract releases is left out, because current releases no longer take it.
target = pytesseract.image_to_string(image, lang='eng',
                                     config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
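The config value is just a string of space-separated flags, so it can be assembled from individual settings. Here is a minimal sketch, assuming a hypothetical helper named build_config (it is not part of pytesseract) and the psm range 0 to 13 from the list above:

```python
def build_config(psm=None, oem=None, whitelist=None):
    """Assemble a Tesseract config string from optional settings.

    build_config is a hypothetical helper for illustration only.
    """
    parts = []
    if psm is not None:
        # The page segmentation modes listed above run from 0 to 13.
        if not 0 <= psm <= 13:
            raise ValueError("psm must be between 0 and 13")
        parts.append(f"--psm {psm}")
    if oem is not None:
        parts.append(f"--oem {oem}")
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


config = build_config(psm=10, oem=3, whitelist="0123456789")
print(config)  # --psm 10 --oem 3 -c tessedit_char_whitelist=0123456789
```

The resulting string can then be passed as config=config to image_to_string.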
Hope this helps.
Pytesseract OCR doesn't recognize the digits
One way to solve this is to threshold the image with inRange so that only the digit pixels remain before running OCR.
pytesseract improving OCR accuracy for blurred numbers on an image
Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image first. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, then apply a sharpening kernel using cv2.filter2D() to enhance the blurred sections. A general sharpening kernel looks like this:
[[-1,-1,-1], [-1,9,-1], [-1,-1,-1]]
Other kernel variations exist, and depending on the image you can adjust the strength of the filter. From here we apply Otsu's threshold to obtain a binary image, then perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. Tesseract offers further OCR configuration options beyond this one.
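To see why this kernel sharpens, it may help to work through the arithmetic by hand. The weights sum to 1 (9 minus 8), so flat regions keep their value while differences between a pixel and its neighbours are amplified. The sketch below is a dependency-free illustration of that arithmetic, not a replacement for cv2.filter2D(); the helper name apply_kernel and the sample pixel values are made up for the example.

```python
KERNEL = [[-1, -1, -1],
          [-1,  9, -1],
          [-1, -1, -1]]


def apply_kernel(img, r, c):
    """Weighted sum of the 3x3 neighbourhood centred on (r, c), clipped to 0..255."""
    total = sum(KERNEL[i][j] * img[r - 1 + i][c - 1 + j]
                for i in range(3) for j in range(3))
    return max(0, min(255, total))


# A flat patch: every pixel is 100.
flat = [[100, 100, 100] for _ in range(3)]

# Two patches straddling a soft vertical edge between 100 and 110.
dark_side = [[100, 100, 110] for _ in range(3)]    # centre pixel is 100
bright_side = [[100, 110, 110] for _ in range(3)]  # centre pixel is 110

print(apply_kernel(flat, 1, 1))         # 100: flat areas are unchanged
print(apply_kernel(dark_side, 1, 1))    # 70: the dark side gets darker
print(apply_kernel(bright_side, 1, 1))  # 140: the bright side gets brighter
```

A step of 10 gray levels becomes a step of 70, which is the contrast boost that makes blurred strokes easier to threshold.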
Here's a visualization of the image processing pipeline:
Input image -> convert to grayscale -> apply sharpening filter -> Otsu's threshold
Result from Pytesseract OCR
124,685
Code
import cv2
import numpy as np
import pytesseract
# Windows only: point pytesseract at the Tesseract executable if it is not on PATH
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, apply sharpening filter, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
# THRESH_BINARY_INV inverts the binary result; use THRESH_BINARY instead if
# that is what leaves your text black on a white background
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# OCR
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)
# Show the intermediate images; press any key to close the windows
cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.waitKey()
Pytesseract-OCR wrong Japanese detection
By default, tesseract uses only English, so you may have to pass other language(s) as a parameter.
At the console you can test it with
tesseract.exe image.png output -l jpn
or with several languages at once
tesseract.exe image.png output -l jpn+eng
(Tesseract appends .txt to the output name, so these commands write output.txt. Use - instead of output to display the text directly in the console.)
In code, the language can be passed through config
pytesseract.image_to_string('image.png', config='-l jpn')
pytesseract.image_to_string('image.png', config='-l jpn+eng')
or through the lang argument
pytesseract.image_to_string('image.png', lang='jpn')
pytesseract.image_to_string('image.png', lang='jpn+eng')
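Since several languages are combined with +, the lang value can be built from a list and checked against what is installed before OCR runs. This is only a sketch: lang_argument is a hypothetical helper, and the installed list here is hard-coded example data (recent pytesseract releases can report it through pytesseract.get_languages()).

```python
def lang_argument(wanted, installed):
    """Join language codes with '+', failing early if one is not installed.

    lang_argument is a hypothetical helper for illustration only.
    """
    missing = [code for code in wanted if code not in installed]
    if missing:
        raise ValueError(f"language data not installed: {', '.join(missing)}")
    return "+".join(wanted)


# Example data; on a real system this list depends on the installed traineddata files.
installed = ["eng", "jpn", "osd"]
print(lang_argument(["jpn", "eng"], installed))  # jpn+eng
```

The returned string can be used as the lang argument of image_to_string.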