Detect Text Region in Image Using Opencv

How to detect text using OpenCV

Here is one way to do that in Python/OpenCV

  • Read the input
  • Convert to grayscale
  • Threshold
  • Use morphology to remove small white or black regions and to close over the text with white
  • Get the largest vertically oriented rectangle's contour
  • Extract the text from the bounding box of that contour
  • Save results


Input:

Sample Image

import cv2
import numpy as np

# load image
img = cv2.imread("rock.jpg")

# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# threshold image
thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)[1]

# apply morphology to clean up small white or black regions
kernel = np.ones((5,5), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
morph = cv2.morphologyEx(morph, cv2.MORPH_OPEN, kernel)

# thin region to remove excess black border
kernel = np.ones((3,3), np.uint8)
morph = cv2.morphologyEx(morph, cv2.MORPH_ERODE, kernel)

# find contours
cntrs = cv2.findContours(morph, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cntrs = cntrs[0] if len(cntrs) == 2 else cntrs[1]

# Contour filtering -- keep largest, vertically oriented object (h/w > 1)
area_thresh = 0
for c in cntrs:
area = cv2.contourArea(c)
x,y,w,h = cv2.boundingRect(c)
aspect = h / w
if area > area_thresh and aspect > 1:
big_contour = c
area_thresh = area

# extract region of text contour from image
x,y,w,h = cv2.boundingRect(big_contour)
text = img[y:y+h, x:x+w]

# extract region from thresholded image
binary_text = thresh[y:y+h, x:x+w]

# write result to disk
cv2.imwrite("rock_thresh.jpg", thresh)
cv2.imwrite("rock_morph.jpg", morph)
cv2.imwrite("rock_text.jpg", text)
cv2.imwrite("rock_binary_text.jpg", binary_text)

cv2.imshow("THRESH", thresh)
cv2.imshow("MORPH", morph)
cv2.imshow("TEXT", text)
cv2.imshow("BINARY TEXT", binary_text)
cv2.waitKey(0)
cv2.destroyAllWindows()


Thresholded image:

Sample Image

Morphology cleaned image:

Sample Image

Extracted text region image:

Sample Image

Extracted binary text region image:

Sample Image

How to get the location of all text present in an image using OpenCV?

Here's a potential approach using morphological operations to filter out non-text contours. The idea is:

  1. Obtain binary image. Load image, grayscale, then
    Otsu's threshold

  2. Remove horizontal and vertical lines. Create horizontal and vertical kernels using cv2.getStructuringElement() then remove lines with cv2.drawContours()

  3. Remove diagonal lines, circle objects, and curved contours. Filter using contour area cv2.contourArea()
    and contour approximation cv2.approxPolyDP()
    to isolate non-text contours

  4. Extract text ROIs and OCR. Find contours and filter for ROIs then OCR using
    Pytesseract.


Removed horizontal lines highlighted in green

Sample Image

Removed vertical lines

Sample Image

Removed assorted non-text contours (diagonal lines, circular objects, and curves)

Sample Image

Detected text regions

Sample Image

import cv2
import numpy as np
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
clean = thresh.copy()

# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15,1))
detect_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detect_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(clean, [c], -1, 0, 3)

# Remove vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,30))
detect_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(detect_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(clean, [c], -1, 0, 3)

cnts = cv2.findContours(clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
# Remove diagonal lines
area = cv2.contourArea(c)
if area < 100:
cv2.drawContours(clean, [c], -1, 0, 3)
# Remove circle objects
elif area > 1000:
cv2.drawContours(clean, [c], -1, 0, -1)
# Remove curve stuff
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.02 * peri, True)
x,y,w,h = cv2.boundingRect(c)
if len(approx) == 4:
cv2.rectangle(clean, (x, y), (x + w, y + h), 0, -1)

open_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
opening = cv2.morphologyEx(clean, cv2.MORPH_OPEN, open_kernel, iterations=2)
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,2))
close = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, close_kernel, iterations=4)
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
area = cv2.contourArea(c)
if area > 500:
ROI = image[y:y+h, x:x+w]
ROI = cv2.GaussianBlur(ROI, (3,3), 0)
data = pytesseract.image_to_string(ROI, lang='eng',config='--psm 6')
if data.isalnum():
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
print(data)

cv2.imwrite('image.png', image)
cv2.imwrite('clean.png', clean)
cv2.imwrite('close.png', close)
cv2.imwrite('opening.png', opening)
cv2.waitKey()

Detecting cardboard Box and Text on it using OpenCV

Your additional questions warranted a separate answer:

1. How can we be 100% sure if the rectangle drawn is actually on top of a box and not on belt or somewhere else?

  • PRO: For this very purpose I chose the Hue channel of HSV color space. Shades of grey, white and black (on the conveyor belt) are neutral in this channel. The brown color of the box is contrasting could be easily segmented using Otsu threshold. Otsu's algorithm finds the optimal threshold value without user input.
  • CON You might face problems when boxes are also of the same color as conveyor belt

2. Can you please tell me how can I use the function you have provided in original answer to use for other boxes in this new code for video.

  • PRO: In case you want to find boxes using edge detection and without using color information; there is a high chance of getting many unwanted edges. By using extract_rect() function, you can filter contours that:

    1. have approximately 4 sides (quadrilateral)
    2. are above certain area
  • CON If you have parcels/packages/bags that have more than 4 sides you might need to change this.

3. Is it correct way to again convert masked frame to grey, find contours again to draw a rectangle. Or is there a more efficient way to do it.

I felt this is the best way, because all that is remaining is the textual region enclosed in white. Applying threshold of high value was the simplest idea in my mind. There might be a better way :)

(I am not in the position to answer the 4th question :) )



Related Topics



Leave a reply



Submit