How to Isolate Everything Inside of a Contour, Scale It, and Test the Similarity to an Image


This situation is perfect for template matching. The idea is to search for and find the location of a template image within a larger input image. To perform this method, the template slides over the input image (similar to 2D convolution), and a comparison method is computed at each position to measure pixel similarity. This is the basic idea behind template matching. Unfortunately, this basic method has a flaw: it only works if the template image is the same size as the region to find in the input image. So if your template image was smaller than the desired region to find in the input image, this method would not work.

To get around this limitation, we can implement multi-scale template matching by dynamically rescaling the image using np.linspace(). With each iteration, we resize the input image and keep track of the resize ratio. We continue resizing until the resized image becomes smaller than the template image, all while keeping track of the highest correlation value; a higher correlation value means a better match. Once we have iterated through the various scales, we take the ratio with the largest match and compute the coordinates of the bounding box to determine the ROI.


Using your template image:

[template image]

Here's the detected card highlighted in green. To visualize the process of dynamic template matching, uncomment the section in the code.

[result image: detected card highlighted in green]

Code

import cv2
import numpy as np

# Resizes an image while maintaining its aspect ratio
def maintain_aspect_ratio_resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    # Grab the image size and initialize dimensions
    dim = None
    (h, w) = image.shape[:2]

    # Return original image if no need to resize
    if width is None and height is None:
        return image

    # We are resizing height if width is none
    if width is None:
        # Calculate the ratio of the height and construct the dimensions
        r = height / float(h)
        dim = (int(w * r), height)
    # We are resizing width if height is none
    else:
        # Calculate the ratio of the width and construct the dimensions
        r = width / float(w)
        dim = (width, int(h * r))

    # Return the resized image
    return cv2.resize(image, dim, interpolation=inter)

# Load template and convert to grayscale
template = cv2.imread('template.png')
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
(tH, tW) = template.shape[:2]
cv2.imshow("template", template)

# Load original image, convert to grayscale
original_image = cv2.imread('1.jpg')
gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
found = None

# Dynamically rescale image for better template matching
for scale in np.linspace(0.1, 3.0, 20)[::-1]:

    # Resize image to scale and keep track of ratio
    resized = maintain_aspect_ratio_resize(gray, width=int(gray.shape[1] * scale))
    r = gray.shape[1] / float(resized.shape[1])

    # Stop if template image size is larger than resized image
    if resized.shape[0] < tH or resized.shape[1] < tW:
        break

    # Threshold resized image and apply template matching
    thresh = cv2.threshold(resized, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    detected = cv2.matchTemplate(thresh, template, cv2.TM_CCOEFF)
    (_, max_val, _, max_loc) = cv2.minMaxLoc(detected)

    # Uncomment this section for visualization
    '''
    clone = np.dstack([thresh, thresh, thresh])
    cv2.rectangle(clone, (max_loc[0], max_loc[1]), (max_loc[0] + tW, max_loc[1] + tH), (0,255,0), 2)
    cv2.imshow('visualize', clone)
    cv2.waitKey(50)
    '''

    # Keep track of the correlation value
    # A higher correlation means a better match
    if found is None or max_val > found[0]:
        found = (max_val, max_loc, r)

# Compute coordinates of bounding box
(_, max_loc, r) = found
(start_x, start_y) = (int(max_loc[0] * r), int(max_loc[1] * r))
(end_x, end_y) = (int((max_loc[0] + tW) * r), int((max_loc[1] + tH) * r))

# Draw bounding box on ROI
cv2.rectangle(original_image, (start_x, start_y), (end_x, end_y), (0,255,0), 5)
cv2.imshow('detected', original_image)
cv2.imwrite('detected.png', original_image)
cv2.waitKey(0)

How to print all coordinates inside a contour with OpenCV

  1. Use findContours to extract the contours of your image (convert the image to grayscale and apply a binary threshold and Canny edge detection first, for better results).

  2. Select a contour (filter by area, shape, moments, etc.).

  3. Use pointPolygonTest on every point to check whether it lies inside the selected contour, and save the ones that do (see the sketch below).
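
A minimal sketch of those three steps, assuming OpenCV 4.x and a hypothetical input.png (the Canny thresholds are placeholder values you would tune):

import cv2

# Hypothetical input file; replace with your own image
image = cv2.imread('input.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Binary threshold, then Canny edge detection, for cleaner contours
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
canny = cv2.Canny(thresh, 50, 150)

# 1. Extract the contours (OpenCV 4.x return signature)
contours, _ = cv2.findContours(canny, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# 2. Select a contour, here simply the largest one by area
contour = max(contours, key=cv2.contourArea)

# 3. Test every pixel and save the ones inside the contour
# pointPolygonTest returns > 0 inside, 0 on the edge, < 0 outside
inside_points = []
height, width = gray.shape
for y in range(height):
    for x in range(width):
        if cv2.pointPolygonTest(contour, (float(x), float(y)), False) >= 0:
            inside_points.append((x, y))

print(str(len(inside_points)) + ' points inside the contour')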

How to select everything outside of a contour?

#include <opencv2/opencv.hpp>

using namespace cv;

int main(void)
{
    // 'contours' is the vector of contours returned from findContours
    // 'image' is the image you are masking

    // Create mask for the region within the contour
    Mat maskInsideContour = Mat::zeros(image.size(), CV_8UC1);
    int idxOfContour = 0; // Change to the index of the contour you wish to draw
    drawContours(maskInsideContour, contours, idxOfContour,
                 Scalar(255), FILLED); // Use CV_FILLED in OpenCV 2.x/3.x

    // At this point, maskInsideContour has a value of 255 for pixels
    // within the contour and a value of 0 for those outside it.

    Mat maskedImage = Mat(image.size(), CV_8UC3); // Assuming a 3-channel image

    // Apply the two following lines:
    maskedImage.setTo(Scalar(180, 180, 180));     // Set all pixels to (180, 180, 180)
    image.copyTo(maskedImage, maskInsideContour); // Copy pixels within the contour to maskedImage

    // Now the region outside the contour in maskedImage is set to (180, 180, 180) and the
    // region within it is set to the value of the corresponding pixels in the original image.

    return 0;
}
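
If you are working in Python instead, a rough equivalent of the same masking idea could look like this (a sketch assuming OpenCV 4.x, a hypothetical input.png, and the contour at index 0):

import cv2
import numpy as np

# Hypothetical input; 'contours' would normally come from cv2.findContours
image = cv2.imread('input.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Mask is 255 inside the chosen contour, 0 elsewhere
mask_inside_contour = np.zeros(image.shape[:2], dtype=np.uint8)
cv2.drawContours(mask_inside_contour, contours, 0, 255, cv2.FILLED)

# Set every pixel to (180, 180, 180), then copy back the pixels inside the contour
masked_image = np.full(image.shape, 180, dtype=np.uint8)
masked_image[mask_inside_contour == 255] = image[mask_inside_contour == 255]

cv2.imshow('masked', masked_image)
cv2.waitKey(0)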

How to automatically adjust the threshold for template matching with opencv?

Since there were lots of comments and hardly any responses, I will summarize the answers for future readers.

First off, your question is almost identical to How to detect paragraphs in a text document image for a non-consistent text structure in Python. Also this thread seems to address the problem you are tackling: Easy ways to detect and crop blocks (paragraphs) of text out of image?

Second, detecting paragraphs in a PDF should not be done with template matching but with one of the following approaches:

  1. Using the Canny edge detector in combination with dilation and F1-score optimization. This is often used for OCR, as suggested by fmw42.
  2. Alternatively, you could use the Stroke Width Transform (SWT) to identify text, which you then group into lines and finally into blocks, i.e. paragraphs. For OCR, these blocks can then be passed to Tesseract (as suggested by fmw42).

The key in any OCR task is to simplify the text detection problem as much as possible by removing disruptive features from the image: change colors, binarize, threshold, dilate, apply filters, etc. The more you know about the image you are processing beforehand, the better.
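
As a rough illustration of that kind of preprocessing for text-block detection (a sketch assuming a hypothetical page.png; the kernel size and iteration count are arbitrary tuning parameters):

import cv2

# Hypothetical input file
image = cv2.imread('page.png')

# Grayscale + Otsu binarization removes color and lighting variation
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Dilation merges nearby characters into word/line/paragraph blobs
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))
dilated = cv2.dilate(thresh, kernel, iterations=4)

# Contours of the dilated blobs approximate the text blocks
contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 2)

cv2.imshow('text blocks', image)
cv2.waitKey(0)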

To answer your question on finding the best match in template matching:
Check out nathancy's answer on template matching. In essence, it comes down to finding the maximum correlation value using minMaxLoc. See this excerpt from nathancy's answer:

# Threshold resized image and apply template matching
thresh = cv2.threshold(resized, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
detected = cv2.matchTemplate(thresh, template, cv2.TM_CCOEFF)
(_, max_val, _, max_loc) = cv2.minMaxLoc(detected)

Also, a comprehensive guide to extracting text blocks from an image (without using template matching) can be found in nathancy's answer in this thread.

Area of a closed contour on a plot using python openCV

The problem is your opening operation at the end. This morphological operation includes a dilation at the end that expands the white contour, increasing its area. Let’s try a different approach where no morphology is involved. These are the steps:

  1. Convert your image to grayscale.
  2. Apply Otsu’s thresholding to get a binary image; let’s work with black and white pixels only.
  3. Apply a first flood-fill operation at image location (0, 0) to get rid of the outer white space.
  4. Filter small blobs using an area filter.
  5. Find the “Curve Canvas” (the white space that encloses the curve) and store its starting point at (targetX, targetY).
  6. Apply a second flood-fill at location (targetX, targetY).
  7. Get the area of the isolated blob with cv2.countNonZero.

Let’s take a look at the code:

import cv2
import numpy as np

# Set image path
path = "C:/opencvImages/"
fileName = "cLIjM.jpg"

# Read Input image
inputImage = cv2.imread(path+fileName)
inputCopy = inputImage.copy()

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Threshold via Otsu + bias adjustment:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

This is the binary image you get:

Now, let’s flood-fill at the corner located at (0,0) with a black color to get rid of the first white space. This step is very straightforward:

# Flood-fill background, seed at (0,0) and use black color:
cv2.floodFill(binaryImage, None, (0, 0), 0)

This is the result, note how the first big white area is gone:

Let’s get rid of the small blobs by applying an area filter. Everything with an area below 100 pixels will be deleted:

# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
    cv2.connectedComponentsWithStats(binaryImage, connectivity=4)

# Set the minimum pixels for the area filter:
minArea = 100

# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]

# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels), 255, 0).astype('uint8')

This is the result of the filter:

Now, what remains is the second white area. I need to locate its starting point because I want to apply a second flood-fill operation at this location. I’ll traverse the image to find the first white pixel, like this:

# Get Image dimensions:
height, width = filteredImage.shape

# Store the flood-fill point here:
targetX = -1
targetY = -1

for i in range(0, width):
    for j in range(0, height):
        # Get current binary pixel:
        currentPixel = filteredImage[j, i]
        # Check if it is the first white pixel:
        if targetX == -1 and targetY == -1 and currentPixel == 255:
            targetX = i
            targetY = j

print("Flooding in X = "+str(targetX)+" Y: "+str(targetY))

There’s probably a more elegant, Python-oriented way of doing this, but I’m still learning the language. Feel free to improve the script (and share it here). For instance, a vectorized NumPy version of the same scan might look like this (a sketch; scanning the transposed image reproduces the loop’s column-major order):
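
# Vectorized alternative: transpose so columns are scanned first,
# matching the column-major order of the loop above
coords = np.argwhere(filteredImage.T == 255)
if coords.size > 0:
    targetX, targetY = int(coords[0][0]), int(coords[0][1])

Either way, the loop gets me the location of the first white pixel, so I can now apply a second flood-fill at this exact location: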

# Flood-fill background, seed at (targetX, targetY) and use black color:
cv2.floodFill(filteredImage, None, (targetX, targetY), 0)

You end up with this:

As you can see, only the target curve remains; to get its area, just count the number of non-zero pixels:

# Get the area of the target curve:
area = cv2.countNonZero(filteredImage)

print("Curve Area is: "+str(area))

The result is:

Curve Area is: 1510

