How to detect Text Area from image?
Take a look at this bounding box technique demonstrated with OpenCV code:
[Images: input, eroded, result]
Improve text area detection (OpenCV, Python)
Solved using the following code.
import cv2
# Load the image
img = cv2.imread('image.png')
# convert to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
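# smooth the image to reduce noise
gray = cv2.medianBlur(gray,5)
# apply an inverted adaptive threshold (Gaussian-weighted, block size 11, C = 2) so the text becomes white
thresh = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV,11,2)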
thresh_color = cv2.cvtColor(thresh,cv2.COLOR_GRAY2BGR)
# apply dilation followed by erosion to close the gaps between characters; change the iteration count to detect larger or smaller areas
thresh = cv2.dilate(thresh,None,iterations = 15)
thresh = cv2.erode(thresh,None,iterations = 15)
# Find the contours ([-2:] handles both the 2-value return of OpenCV 4.x and the 3-value return of 3.x)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
# For each contour, find the bounding rectangle and draw it
for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)
    cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
    cv2.rectangle(thresh_color,(x,y),(x+w,y+h),(0,255,0),2)
# Finally show the image
cv2.imshow('img',img)
cv2.imshow('res',thresh_color)
cv2.waitKey(0)
cv2.destroyAllWindows()
The parameter to adjust to obtain the result below is the number of iterations in the dilate and erode calls.
Lower values produce more bounding rectangles, around (nearly) every digit or character.
Result
How to check if an image has any text or not?
There is indeed a simple way with OpenCV and pytesseract. After installing both, you need only a few lines to get the text. Note that pytesseract is a wrapper, so the Tesseract OCR engine itself must also be installed.
pip install opencv-python
pip install pytesseract
import cv2
import pytesseract
img = cv2.imread('yourimage.jpeg')
text = pytesseract.image_to_string(img)
Read Text from Image with One Line of Python Code
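Since the question is whether the image contains any text at all, the string returned by `image_to_string` still has to be turned into a yes/no answer. A minimal sketch, assuming that "has text" means "contains at least a few letters or digits"; the name `has_text` and the `min_chars` parameter are hypothetical, not part of pytesseract:

```python
def has_text(ocr_output, min_chars=1):
    """Return True if the OCR output contains at least min_chars letters or digits."""
    found = sum(1 for ch in ocr_output if ch.isalnum())
    return found >= min_chars

# Stand-in strings; in practice pass pytesseract.image_to_string(img)
print(has_text("Total: 42"))
print(has_text(" \n\x0c"))
```

OCR on a text-free image can still return a stray character or two, so raising `min_chars` may give fewer false positives.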
Alternatively, if you don't like the first way, you can use Google Cloud Vision. Keep in mind that it returns JSON, from which you will have to extract what you need.
https://cloud.google.com/vision/docs/ocr
Python Client for Google Cloud Vision
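As a rough illustration of "extract what you need", the sketch below pulls the detected text out of a response shaped like the REST API's text detection output (`responses`, `textAnnotations`, `fullTextAnnotation`). The sample values are made up, the response is trimmed to the fields used, and `extract_text` is a hypothetical helper; check the linked documentation for the full schema.

```python
# Illustrative response, trimmed to the fields used below; the values are made up
sample_response = {
    "responses": [
        {
            "textAnnotations": [
                {"description": "STOP\nAHEAD"},  # first entry: the whole text
                {"description": "STOP"},
                {"description": "AHEAD"},
            ],
            "fullTextAnnotation": {"text": "STOP\nAHEAD"},
        }
    ]
}

def extract_text(response):
    """Return the detected text from a Vision-style response dict, or '' if none."""
    results = response.get("responses", [])
    if not results:
        return ""
    first = results[0]
    full = first.get("fullTextAnnotation", {}).get("text")
    if full:
        return full
    annotations = first.get("textAnnotations", [])
    return annotations[0].get("description", "") if annotations else ""

print(repr(extract_text(sample_response)))
```

An empty string from this helper would then mean that no text was detected in the image.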
How to detect text using OpenCV
Here is one way to do that in Python/OpenCV:
- Read the input
- Convert to grayscale
- Threshold
- Use morphology to remove small white or black regions and to close over the text with white
- Get the largest vertically oriented rectangle's contour
- Extract the text from the bounding box of that contour
- Save results
[Image: input]
import cv2
import numpy as np
# load image
img = cv2.imread("rock.jpg")
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold image
thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)[1]
# apply morphology to clean up small white or black regions
kernel = np.ones((5,5), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
morph = cv2.morphologyEx(morph, cv2.MORPH_OPEN, kernel)
# thin region to remove excess black border
kernel = np.ones((3,3), np.uint8)
morph = cv2.morphologyEx(morph, cv2.MORPH_ERODE, kernel)
# find contours
cntrs = cv2.findContours(morph, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cntrs = cntrs[0] if len(cntrs) == 2 else cntrs[1]
# Contour filtering -- keep largest, vertically oriented object (h/w > 1)
area_thresh = 0
big_contour = None
for c in cntrs:
    area = cv2.contourArea(c)
    x,y,w,h = cv2.boundingRect(c)
    aspect = h / w
    if area > area_thresh and aspect > 1:
        big_contour = c
        area_thresh = area
# stop early if no vertically oriented contour was found
if big_contour is None:
    raise SystemExit("no vertically oriented contour found")
# extract region of text contour from image
x,y,w,h = cv2.boundingRect(big_contour)
text = img[y:y+h, x:x+w]
# extract region from thresholded image
binary_text = thresh[y:y+h, x:x+w]
# write result to disk
cv2.imwrite("rock_thresh.jpg", thresh)
cv2.imwrite("rock_morph.jpg", morph)
cv2.imwrite("rock_text.jpg", text)
cv2.imwrite("rock_binary_text.jpg", binary_text)
cv2.imshow("THRESH", thresh)
cv2.imshow("MORPH", morph)
cv2.imshow("TEXT", text)
cv2.imshow("BINARY TEXT", binary_text)
cv2.waitKey(0)
cv2.destroyAllWindows()
[Images: thresholded image, morphology-cleaned image, extracted text region, extracted binary text region]