GrabCut Algorithm in Python OpenCV Image Processing

GrabCut algorithm

Often we need to separate the foreground of an image from its background, and sometimes we only want the foreground. This tutorial introduces the GrabCut algorithm for interactive foreground extraction.

GrabCut is an image segmentation method based on graph cuts. It is an improvement on the Graph Cut algorithm.

Starting from a user-specified bounding box around the object to be segmented, the algorithm uses Gaussian mixture models to estimate the color distributions of the object and of the background (the image is thus divided into the segmented object and the background). In short, once the user has indicated the foreground and the background, the algorithm computes an optimal segmentation of the two.

The algorithm uses both texture (color) information and boundary (contrast) information, so a small amount of user interaction is enough to get a good segmentation. It is similar in use to the watershed algorithm, but it is slower and its results are more accurate. If you need to extract a foreground object from a still image (e.g., to cut it out of one image and paste it into another), GrabCut is the best choice.

Principle

We work in RGB color space and model the target and the background each with a full-covariance Gaussian mixture model (GMM) of K Gaussian components (typically K = 5). This introduces an additional vector k = {k1, ..., kn, ..., kN}, where kn indicates which Gaussian component the nth pixel is assigned to, kn ∈ {1, ..., K}. Each pixel comes either from one Gaussian component of the target GMM or from one Gaussian component of the background GMM.
The Gibbs energy for the entire image is therefore given by equation (7):
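
For reference, the energy can be written as follows. This is a sketch in the notation of the GrabCut paper by Rother, Kolmogorov and Blake (equations (7) and (8) there), where α is the vector of per-pixel labels, k the vector of component assignments, θ the GMM parameters and z the pixel colors:

```latex
E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z) \tag{7}

U(\alpha, k, \theta, z) = \sum_{n} D(\alpha_n, k_n, \theta, z_n) \tag{8}
```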

Here U is the region term. As in the previous article, it represents the penalty for classifying a pixel as target or background, that is, the negative logarithm of the probability that the pixel belongs to the target or to the background. Recall that a Gaussian mixture density has the following form:
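
The mixture density, together with equations (9) and (10) discussed in the next paragraph, can be sketched as follows. The notation follows the GrabCut paper; d is the dimension of a sample (d = 3 for RGB), and note that the π inside the square root is the constant 3.14159..., while π_i and π(·) are component weights:

```latex
D(x) = \sum_{i=1}^{K} \pi_i \, g_i(x;\, \mu_i, \Sigma_i),
\qquad \sum_{i=1}^{K} \pi_i = 1, \quad 0 \le \pi_i \le 1

g(x;\, \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^{d} \, |\Sigma|}}
\exp\!\left[ -\tfrac{1}{2} (x - \mu)^{T} \Sigma^{-1} (x - \mu) \right]

D(\alpha_n, k_n, \theta, z_n) = -\log \pi(\alpha_n, k_n)
+ \tfrac{1}{2} \log \det \Sigma(\alpha_n, k_n)
+ \tfrac{1}{2} \left[ z_n - \mu(\alpha_n, k_n) \right]^{T}
  \Sigma(\alpha_n, k_n)^{-1} \left[ z_n - \mu(\alpha_n, k_n) \right] \tag{9}

\theta = \{ \pi(\alpha, k),\ \mu(\alpha, k),\ \Sigma(\alpha, k),\ \alpha = 0, 1,\ k = 1 \dots K \} \tag{10}
```
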
Taking the negative logarithm therefore gives the form of equation (9). The GMM parameters θ are of three kinds: the weight π of each Gaussian component, the mean vector μ of each component (a three-element vector, because there are three RGB channels), and the covariance matrix Σ of each component (a 3x3 matrix, for the same reason), as in equation (10). In other words, these three sets of parameters must be learned both for the GMM describing the target and for the GMM describing the background. Once they are determined, plugging the RGB value of a pixel into the target GMM and the background GMM gives the probability that the pixel belongs to the target and to the background respectively. That determines the region term of the Gibbs energy, and with it the weights of the t-links of the graph. What about the weights of the n-links, that is, how is the boundary term V computed?
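
As an illustration of the region term, the following is a minimal NumPy sketch that evaluates the per-component penalty of equation (9) for one pixel and picks the component with the smallest penalty as kn. The function name `data_term` and the toy GMM values are assumptions made for this example; they are not part of OpenCV.

```python
import numpy as np

def data_term(z, weights, means, covs):
    """Penalty D of one RGB pixel z (3,) for each of the K components of one GMM.

    weights: (K,), means: (K, 3), covs: (K, 3, 3). Returns an array of shape (K,).
    """
    diffs = z - means                                     # (K, 3)
    inv_covs = np.linalg.inv(covs)                        # (K, 3, 3)
    # Mahalanobis term (z - mu)^T Sigma^-1 (z - mu) for every component
    maha = np.einsum('ki,kij,kj->k', diffs, inv_covs, diffs)
    _, logdet = np.linalg.slogdet(covs)                   # log det Sigma per component
    return -np.log(weights) + 0.5 * logdet + 0.5 * maha

# Toy GMM with two components (illustrative values only)
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0, 0.0], [200.0, 200.0, 200.0]])
covs = np.stack([np.eye(3) * 100.0, np.eye(3) * 100.0])

z = np.array([190.0, 205.0, 198.0])
D = data_term(z, weights, means, covs)
k_n = int(np.argmin(D))   # component assigned to this pixel
print(k_n)                # 1: the pixel is closest to the bright component
```

Running the same function against the target GMM and the background GMM gives the two penalties from which the t-link weights of a pixel are derived.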
The boundary term is similar to the one in Graph Cut. It expresses the penalty for a discontinuity between neighboring pixels m and n. If two neighboring pixels differ very little, they very likely belong to the same target or the same background. If they differ a lot, they are likely to lie on the edge between target and background and are more likely to be separated, so the larger the difference between two neighboring pixels, the smaller the energy. In RGB space, the similarity of two pixels is measured with the Euclidean distance (the 2-norm). The parameter β is determined by the contrast of the image. If the image contrast is low, the difference ||zm - zn|| between two genuinely different pixels m and n is still small, so we multiply by a relatively large β to amplify it. If the contrast is high, the difference ||zm - zn|| between two pixels of the same target may still be large, so we multiply by a relatively small β to shrink it. This lets the V term work properly whether the contrast is high or low. At this point the graph is fully built and can be cut to segment the image.
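
In the GrabCut paper the boundary term has the form V = γ Σ [αn ≠ αm] exp(-β ||zm - zn||²), summed over pairs of neighboring pixels, with β = 1 / (2 ⟨||zm - zn||²⟩), where ⟨·⟩ is the average over the image, and γ = 50. The sketch below computes β and the n-link weights between horizontal neighbors with NumPy. It assumes a 4-neighborhood for simplicity (the paper uses 8 neighbors), and the function names are hypothetical.

```python
import numpy as np

def compute_beta(img):
    """beta = 1 / (2 * mean squared color difference between neighboring pixels)."""
    z = img.astype(np.float64)
    dh = ((z[:, 1:] - z[:, :-1]) ** 2).sum(axis=2)   # horizontal neighbor pairs
    dv = ((z[1:, :] - z[:-1, :]) ** 2).sum(axis=2)   # vertical neighbor pairs
    mean_sq = (dh.sum() + dv.sum()) / (dh.size + dv.size)
    return 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0

def horizontal_nlinks(img, beta, gamma=50.0):
    """n-link weight gamma * exp(-beta * ||zm - zn||^2) for each horizontal pair."""
    z = img.astype(np.float64)
    dh = ((z[:, 1:] - z[:, :-1]) ** 2).sum(axis=2)
    return gamma * np.exp(-beta * dh)

# Toy 4x4 image: two dark columns on the left, two bright columns on the right
img = np.zeros((4, 4, 3), np.uint8)
img[:, :2] = 10
img[:, 2:] = 200

beta = compute_beta(img)
w = horizontal_nlinks(img, beta)   # shape (4, 3)
print(w[0])                        # the middle weight (across the edge) is the smallest
```

Links between similar pixels keep the full weight γ and are expensive to cut, while the link that crosses the dark/bright edge is cheap, which is what steers the cut toward object boundaries.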

Let's take a look at the concrete initialization steps:

(1) The user obtains the initial trimap T by drawing a box around the target: the pixels outside the box form the background set TB, and the pixels inside the box form TU, the set of pixels that may be target.

(2) Each pixel n in TB is given the initial label αn = 0, meaning it is a background pixel; each pixel n in TU is given the initial label αn = 1, meaning it is a "possibly target" pixel.

(3) After these two steps we have a set of pixels labeled as target (αn = 1) and the remaining pixels labeled as background (αn = 0), and we can use them to estimate the target and background GMMs. We cluster the target pixels and the background pixels into K classes each with the k-means algorithm, one class per Gaussian component of the GMM. Each Gaussian component then has its own set of pixel samples. Its mean and covariance are estimated from their RGB values, and its weight is the ratio of the number of pixels assigned to that component to the total number of pixels.
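
Steps (1) to (3) can be sketched with NumPy as follows. This is an illustration under stated assumptions, not OpenCV's implementation: the helper names are hypothetical, the k-means is a deliberately tiny version, and the small value added to each covariance diagonal is an assumption made here to keep the matrices invertible.

```python
import numpy as np

# Label values chosen to match OpenCV's cv2.GC_BGD and cv2.GC_PR_FGD
GC_BGD, GC_PR_FGD = 0, 3

def init_trimap(shape, rect):
    """Steps (1)-(2): outside rect (x, y, w, h) is background, inside is 'possibly target'."""
    x, y, w, h = rect
    mask = np.full(shape, GC_BGD, dtype=np.uint8)
    mask[y:y + h, x:x + w] = GC_PR_FGD
    alpha = (mask == GC_PR_FGD).astype(np.uint8)   # alpha_n = 1 inside the box, 0 outside
    return mask, alpha

def kmeans_labels(pixels, k, iters=10, seed=0):
    """Tiny k-means that only returns a cluster label for each pixel."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels

def estimate_gmm(pixels, labels, k):
    """Step (3): weight, mean and covariance of each component from its pixel samples."""
    weights = np.zeros(k)
    means = np.zeros((k, 3))
    covs = np.tile(np.eye(3) * 0.01, (k, 1, 1))    # default for empty or single-pixel clusters
    for j in range(k):
        members = pixels[labels == j]
        weights[j] = len(members) / len(pixels)    # share of pixels in this component
        if len(members) > 0:
            means[j] = members.mean(axis=0)
        if len(members) > 1:
            covs[j] = np.cov(members, rowvar=False) + 0.01 * np.eye(3)
    return weights, means, covs

# Toy usage: a random 20x20 "image" with a brighter block inside the box
rng = np.random.default_rng(1)
img = rng.normal(50.0, 5.0, size=(20, 20, 3))
img[5:15, 5:15] += 120.0

mask, alpha = init_trimap(img.shape[:2], (4, 4, 12, 12))
fg_pixels = img[alpha == 1]    # pixels labeled "possibly target"
bg_pixels = img[alpha == 0]    # pixels labeled background
K = 5
fg_gmm = estimate_gmm(fg_pixels, kmeans_labels(fg_pixels, K), K)
bg_gmm = estimate_gmm(bg_pixels, kmeans_labels(bg_pixels, K), K)
print(fg_gmm[0], bg_gmm[0])    # component weights of the two GMMs
```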

Use in OpenCV

Implementation steps:

1. Define a rectangle that contains the object (or objects) in the picture.

2. The area outside the rectangle is automatically treated as background.

3. For the user-defined rectangular area, the data in the background is used to distinguish the foreground and background areas inside it.

4. Model the background and the foreground with Gaussian mixture models, and label the undefined pixels as probable foreground or probable background.

5. Every pixel in the image is considered connected to its surrounding pixels by virtual edges, and each edge carries a probability of belonging to the foreground or the background, based on the color similarity between the pixel and its neighbors.

6. Each pixel (a node in the algorithm) is linked to a foreground or a background node.

7. Once the nodes are linked, if an edge connects nodes that belong to different terminals (one to the foreground, the other to the background), the edge is cut, which separates the image into its parts.

Let's first understand the relevant function API:

mask, bgdModel, fgdModel = cv2.grabCut(img, mask, rect, bgdModel, fgdModel, iterCount[, mode])

img: The input image.
mask: The mask image, which specifies which areas are background, foreground, or probable background/foreground. It is built from the flags cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD and cv2.GC_PR_FGD, or simply by writing the values 0, 1, 2, 3 into the image.
rect: The coordinates of the rectangle that contains the foreground object, in the format (x, y, w, h).
bgdModel, fgdModel: Arrays used internally by the algorithm. You only need to create two zero arrays of type np.float64 and size (1, 65).
iterCount: The number of iterations the algorithm runs.
mode: cv2.GC_INIT_WITH_RECT or cv2.GC_INIT_WITH_MASK, or a combination, which determines whether we are drawing the rectangle or the final touch-up strokes.

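import os

import cv2
import numpy as np

img = cv2.imread('data.jpg')
if img is None:
    raise FileNotFoundError('data.jpg')
img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC)
print(img.shape)

# Mask that grabCut fills with 0 (BGD), 1 (FGD), 2 (PR_BGD) or 3 (PR_FGD)
mask = np.zeros(img.shape[:2], np.uint8)

# Working arrays used internally by the algorithm
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)

# (x, y, w, h), chosen to leave a 10-pixel background border inside the 224x224 image
rect = (10, 10, 204, 204)
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)

# Keep sure and probable foreground (1, 3); zero out sure and probable background (0, 2)
mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
img = img * mask2[:, :, np.newaxis]

os.makedirs('cut', exist_ok=True)
cv2.imwrite('cut/1.jpg', img)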
