How to Use OpenCV's connectedComponentsWithStats in Python

How to extract the largest connected component using OpenCV and Python?

I would replace your code with something like this:

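import cv2
import numpy as np

def undesired_objects(image):
    image = image.astype('uint8')
    nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(image, connectivity=4)
    sizes = stats[:, -1]  # last column holds the area (cv2.CC_STAT_AREA)

    # Label 0 is the background, so the search starts at label 1.
    # This assumes the image contains at least one foreground component.
    max_label = 1
    max_size = sizes[1]
    for i in range(2, nb_components):
        if sizes[i] > max_size:
            max_label = i
            max_size = sizes[i]

    # Paint the largest component white on a black uint8 image
    img2 = np.zeros(output.shape, dtype=np.uint8)
    img2[output == max_label] = 255
    cv2.imshow("Biggest component", img2)
    cv2.waitKey()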

The loop over the components now finds the one with the biggest area, and that component is displayed once the loop has finished.

Tell me if this works for you as I haven't tested it myself.
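If you'd rather skip the explicit loop, the same search can probably be done with np.argmax. Here's a minimal sketch under a few assumptions: largest_component_mask is a name I made up, the labels and stats arrays are hand-written stand-ins for what connectedComponentsWithStats would return, and CC_STAT_AREA is hard-coded to 4 (the value of cv2.CC_STAT_AREA) so the snippet runs with NumPy alone.

```python
import numpy as np

CC_STAT_AREA = 4  # same column index as cv2.CC_STAT_AREA


def largest_component_mask(labels, stats):
    """Return a uint8 mask (0/255) of the largest non-background component."""
    if stats.shape[0] < 2:
        # Only the background row exists, so there is nothing to extract.
        return np.zeros(labels.shape, dtype=np.uint8)
    # Skip row 0 (background), then shift the index back to a label.
    max_label = 1 + int(np.argmax(stats[1:, CC_STAT_AREA]))
    return np.where(labels == max_label, 255, 0).astype(np.uint8)


# Hand-written stand-ins for the `output` and `stats` arrays (illustrative only).
labels = np.array([
    [0, 1, 0, 2, 2],
    [0, 1, 0, 2, 2],
    [0, 0, 0, 2, 2]], dtype=np.int32)
# Columns: left, top, width, height, area
stats = np.array([
    [0, 0, 3, 3, 7],   # label 0: background
    [1, 0, 1, 2, 2],   # label 1
    [3, 0, 2, 3, 6]],  # label 2
    dtype=np.int32)

mask = largest_component_mask(labels, stats)
print(mask)
```

The guard at the top covers an all-black image, where only the background row exists and there is no row 1 to read.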

cv2.connectedComponentsWithStats source code

The implementation is written in C++ and lives in OpenCV's imgproc module. The Python package calls the compiled C++ code.

How to order OpenCV connectedComponentsWithStats by area?

stats is a 2D array with one row per blob (row 0 is the background), and the last element of each row holds that blob's area. So, simply do the following to get the label indices ordered from the largest-area blob to the smallest:

np.argsort(-stats[:,-1]) # or np.argsort(stats[:,-1])[::-1]
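To make that concrete, here's a small sketch on a made-up stats array (the numbers are hypothetical, not output from a real image). It also shows one way to leave the background row out of the ordering, which is my own addition rather than part of the one-liner above.

```python
import numpy as np

# Hypothetical stats array; columns are left, top, width, height, area.
stats = np.array([
    [0, 0, 10, 10, 81],  # label 0: background
    [1, 1, 2, 2, 4],     # label 1
    [5, 1, 3, 4, 12],    # label 2
    [1, 6, 3, 1, 3]])    # label 3

# All labels, largest area first (the background is included).
order = np.argsort(-stats[:, -1])

# Foreground labels only: sort rows 1.., then shift indices back to labels.
fg_order = 1 + np.argsort(-stats[1:, -1])

print(order)
print(fg_order)
```

Indexing with stats[order] would then give the rows themselves in descending order of area.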

How to use Python OpenCV to find largest connected component in a single channel image that matches a specific value?

Here's the general approach:

1. Create a new blank image to add the components into
2. Loop through each distinct non-zero value in your image
3. Create a mask for each value (giving the multiple blobs per value)
4. Run connectedComponentsWithStats() on the mask
5. Find the non-zero label corresponding to the largest area
6. Create a mask with the largest label and insert the value into the new image at the masked positions

The annoying thing here is step 5, because label 0 (the background) will usually, but not always, be the largest component. So we need to get the largest non-zero component by area.

Here's some code which I think achieves everything (some sample images would be nice to be sure):

import cv2
import numpy as np

img = np.array([
    [1, 0, 1, 1, 2],
    [1, 0, 1, 1, 2],
    [1, 0, 1, 1, 2],
    [1, 0, 1, 1, 2],
    [1, 0, 1, 1, 2]], dtype=np.uint8)

new_img = np.zeros_like(img)                                        # step 1
for val in np.unique(img)[1:]:                                      # step 2
    mask = np.uint8(img == val)                                     # step 3
    labels, stats = cv2.connectedComponentsWithStats(mask, 4)[1:3]  # step 4
    largest_label = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])      # step 5
    new_img[labels == largest_label] = val                          # step 6

print(new_img)

Showing the desired output:

[[0 0 1 1 2]
 [0 0 1 1 2]
 [0 0 1 1 2]
 [0 0 1 1 2]
 [0 0 1 1 2]]

To go through the code, first we create the new labeled image, unimaginatively called new_img, filled with zeros to be populated later by the correct label. Then, np.unique() finds the unique values in the image, and I'm taking everything except the first value; note that np.unique() returns a sorted array, so 0 will be the first value and we don't need to find components of zero. For each unique val, create a mask populated with 0s and 1s, and run connected components on this mask. This will label each distinct region with a different label. Then we can grab the largest non-zero labeled component**, create a mask for it, and add that val into the new image at that place.

** This is the annoying bit that looks weird in the code.

largest_label = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])

First, you can check out the answer you linked for the shape of the stats array, but each row corresponds to a label (so the label 0 will correspond to the first row, etc), and the column is defined by the integer cv2.CC_STAT_AREA (which is just 4). We'll need to make sure we're looking at the largest non-zero label, so I'm only looking at rows past the first one. Then, grab the index corresponding to the largest area. Since we shaved the zero row off, the index now corresponds to label-1, so add 1 to get the correct label. Then we can mask as usual and insert the value.
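To see why the slice and the + 1 are both needed, here's a tiny sketch on a hypothetical stats array where the background happens to be the largest region. The values are made up for illustration, and CC_STAT_AREA is hard-coded to 4 so the snippet doesn't need cv2.

```python
import numpy as np

CC_STAT_AREA = 4  # same column index as cv2.CC_STAT_AREA

# Hypothetical stats; columns are left, top, width, height, area.
stats = np.array([
    [0, 0, 6, 5, 15],  # label 0: background, the largest region here
    [0, 0, 1, 5, 5],   # label 1
    [2, 0, 2, 5, 10]]) # label 2

# Searching every row picks the background.
naive = int(np.argmax(stats[:, CC_STAT_AREA]))

# Dropping row 0 gives an index into the sliced array, i.e. label - 1.
shifted = int(np.argmax(stats[1:, CC_STAT_AREA]))
largest_label = 1 + shifted

print(naive, shifted, largest_label)
```

Here the naive search returns 0 (the background), while the sliced search returns index 1, which maps back to label 2 after adding 1.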
