Pytesseract Ocr Multiple Config Options

Pytesseract OCR multiple config options

Tesseract 4.0.0a supports the page segmentation modes (psm) listed below. For single-character recognition, set --psm 10. If your text consists of digits only, you can also set tessedit_char_whitelist=0123456789.

Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
                        bypassing hacks that are Tesseract-specific.

Here is a sample usage of image_to_string with multiple parameters.

target = pytesseract.image_to_string(image, lang='eng',
        config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
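The config string is just the Tesseract command-line options joined by spaces, so it can be assembled from separate values. Here is a minimal sketch of that idea. build_config and ocr_digits are hypothetical helper names, not part of pytesseract, and the call assumes a recent pytesseract release where image_to_string accepts lang and config.

```python
def build_config(psm=3, oem=3, whitelist=None):
    """Join Tesseract options into one string for pytesseract's config argument.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    parts = [f"--psm {psm}", f"--oem {oem}"]
    if whitelist:
        # -c sets a Tesseract config variable; here it restricts the output characters.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_digits(image):
    """Recognize a single digit in an image (assumes Tesseract is installed)."""
    import pytesseract  # imported here so build_config works without Tesseract

    config = build_config(psm=10, oem=3, whitelist="0123456789")
    return pytesseract.image_to_string(image, lang="eng", config=config)
```

Keeping the options separate until the last moment makes it easier to try a different psm without retyping the whole string.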

Hope this helps.

Pytesseract OCR doesn't recognize the digits

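One way to solve this is inRange thresholding: keep only the pixels whose values fall inside a chosen range, so that the digits are separated from the background before OCR.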

pytesseract improving OCR accuracy for blurred numbers on an image

Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, then apply a sharpening kernel using cv2.filter2D() to enhance the blurred sections. A general sharpening kernel looks like this:

[[-1,-1,-1], [-1,9,-1], [-1,-1,-1]]

Other kernel variations exist, and depending on the image you can adjust the strength of the filter. From here we apply Otsu's threshold to obtain a binary image, then perform text extraction using the --psm 6 configuration option, which assumes a single uniform block of text.
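One possible way to adjust the strength of the filter is to scale the off-center weights and pick the center so that all weights still sum to 1, which keeps the overall brightness unchanged. This parameterization and the name sharpen_kernel are assumptions for illustration, not something taken from the answer itself.

```python
import numpy as np


def sharpen_kernel(strength=1.0):
    """Build a 3x3 sharpening kernel whose weights sum to 1.

    strength=1.0 gives [[-1,-1,-1], [-1,9,-1], [-1,-1,-1]];
    smaller values sharpen less.
    """
    kernel = np.full((3, 3), -strength, dtype=np.float32)
    # Eight neighbours contribute -strength each, so the center offsets them.
    kernel[1, 1] = 1 + 8 * strength
    return kernel
```

The result can be passed to cv2.filter2D in place of a hard-coded kernel, for example cv2.filter2D(gray, -1, sharpen_kernel(0.5)) for a milder effect.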

The image processing pipeline has three stages:

Input image

Convert to grayscale -> apply sharpening filter

Otsu's threshold

Result from Pytesseract OCR

124,685

Code

import cv2
import numpy as np
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load image, grayscale, apply sharpening filter, Otsu's threshold 
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# OCR
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.waitKey()

Pytesseract-OCR wrong Japanese detection

By default, tesseract uses only English, so you have to pass any other language(s) as a parameter.

From the console you can test it with

tesseract.exe image.png output -l jpn

or with several languages at once

tesseract.exe image.png output -l jpn+eng

(tesseract appends .txt to the output name, so both commands write output.txt; use - instead of output to print the text directly to the console)

In code, you can pass the language either through config or through the lang parameter:

pytesseract.image_to_string('image.png', config='-l jpn')

pytesseract.image_to_string('image.png', config='-l jpn+eng')

pytesseract.image_to_string('image.png', lang='jpn')

pytesseract.image_to_string('image.png', lang='jpn+eng')
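If the result is still wrong, it may be worth checking that the language data is actually installed before running OCR. The sketch below assumes a recent pytesseract release that provides get_languages. lang_arg and ocr_with_languages are hypothetical helper names used only for illustration.

```python
def lang_arg(languages):
    """Join Tesseract language codes with '+', e.g. ['jpn', 'eng'] -> 'jpn+eng'."""
    return "+".join(languages)


def ocr_with_languages(image_path, languages):
    """Run OCR with the given languages, failing early if any is missing."""
    import pytesseract  # requires Tesseract with the matching traineddata files

    installed = pytesseract.get_languages(config="")
    missing = [code for code in languages if code not in installed]
    if missing:
        raise RuntimeError(f"Tesseract language data not installed: {missing}")
    return pytesseract.image_to_string(image_path, lang=lang_arg(languages))
```

For example, ocr_with_languages('image.png', ['jpn', 'eng']) would be equivalent to the lang='jpn+eng' call above once both language packs are present.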
