Rectangle detection / tracking using OpenCV

What I need

I'm currently working on an augmented reality kinda game. The controller that the game uses (I'm talking about the physical input device here) is a mono colored, rectangluar pice of paper. I have to detect the position, rotation and size of that rectangle in the capture stream of the camera. The detection should be invariant on scale and invariant on rotation along the X and Y axes.

The scale invariance is needed in case that the user moves the paper away or towards the camera. I don't need to know the distance of the rectangle so scale invariance translates to size invariance.

The rotation invariance is needed in case the user tilts the rectangle along its local X and / or Y axis. Such a rotation changes the shape of the paper from rectangle to trapezoid. In this case, the object oriented bounding box can be used to measure the size of the paper.

What I've done

At the beginning there is a calibration step. A window shows the camera feed and the user has to click on the rectangle. On click, the color of the pixel the mouse is pointing at is taken as reference color. The frames are converted into HSV color space to improve color distinguishing. I have 6 sliders that adjust the upper and lower thresholds for each channel. These thresholds are used to binarize the image (using opencv's inRange function).
After that I'm eroding and dilating the binary image to remove noise and unite nerby chunks (using opencv's erode and dilate functions).
The next step is finding contours (using opencv's findContours function) in the binary image. These contours are used to detect the smallest oriented rectangles (using opencv's minAreaRect function). As final result I'm using the rectangle with the largest area.

A short conclusion of the procedure:

  • Grab a frame
  • Convert that frame to HSV
  • Binarize it (using the color that the user selected and the thresholds from the sliders)
  • Apply morph ops (erode and dilate)
  • Find contours
  • Get the smallest oriented bouding box of each contour
  • Take the largest of those bounding boxes as result
  • As you may noticed, I don't make an advantage of the knowledge about the actual shape of the paper, simply because I don't know how to use this information properly.

    I've also thought about using the tracking algorithms of opencv. But there were three reasons that prevented me from using them:

  • Scale invariance: as far as I read about some of the algorithms, some don't support different scales of the object.
  • Movement prediction: some algorithms use movement prediction for better performance, but the object I'm tracking moves completely random and therefore unpredictable.
  • Simplicity: I'm just looking for a mono colored rectangle in an image, nothing fancy like car or person tracking.
  • Here is a - relatively - good catch (binary image after erode and dilate) 好

    and here is a bad one 坏

    The Question

    How can I improve the detection in general and especially to be more resistant against lighting changes?

    Update

    Here are some raw images for testing.

    Can't you just use thicker material?
    Yes I can and I already do (unfortunately I can't access these pieces at the moment). However, the problem still remains. Even if I use material like cartboard. It isn't bent as easy as paper, but one can still bend it.

    How do you get the size, rotation and position of the rectangle?
    The minAreaRect function of opencv returns a RotatedRect object. This object contains all the data I need.

    Note
    Because the rectangle is mono colored, there is no possibility to distinguish between top and bottom or left and right. This means that the rotation is always in range [0, 180] which is perfectly fine for my purposes. The ratio of the two sides of the rect is always w:h > 2:1 . If the rectangle would be a square, the range of roation would change to [0, 90] , but this can be considered irrelevant here.

    As suggested in the comments I will try histogram equalization to reduce brightness issues and take a look at ORB, SURF and SIFT.

    I will update on progress.


    The H channel in the HSV space is the Hue, and it is not sensitive to the light changing. Red range in about [150,180].

    Based on the mentioned information, I do the following works.

  • Change into the HSV space, split the H channel, threshold and normalize it.
  • Apply morph ops (open)
  • Find contours, filter by some properties( width, height, area, ratio and so on).
  • PS. I cannot fetch the image you upload on the dropbox because of the NETWORK. So, I just use crop the right side of your second image as the input.

    在这里输入图像描述

    imgname = "src.png"
    img = cv2.imread(imgname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    ## Split the H channel in HSV, and get the red range
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    h,s,v = cv2.split(hsv)
    h[h<150]=0
    h[h>180]=0
    
    ## normalize, do the open-morp-op
    normed = cv2.normalize(h, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8UC1)
    kernel = cv2.getStructuringElement(shape=cv2.MORPH_ELLIPSE, ksize=(3,3))
    opened = cv2.morphologyEx(normed, cv2.MORPH_OPEN, kernel)
    res = np.hstack((h, normed, opened))
    cv2.imwrite("tmp1.png", res)
    

    Now, we get the result as this (h, normed, opened):

    在这里输入图像描述

    Then find contours and filter them.

    _, contours, _ = cv2.findContours(opened, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    print(len(contours))
    
    bboxes = []
    rboxes = []
    cnts = []
    dst = img.copy()
    for cnt in contours:
        ## Get the stright bounding rect
        bbox = cv2.boundingRect(cnt)
        x,y,w,h = bbox
        if w<30 or h < 30 or w*h < 2000 or w > 500:
            continue
    
        ## Draw rect
        cv2.rectangle(dst, (x,y), (x+w,y+h), (255,0,0), 1, 16)
    
        ## Get the rotated rect
        rbox = cv2.minAreaRect(cnt)
        (cx,cy), (w,h), rot_angle = rbox
        print("rot_angle:", rot_angle)  
    
        ## backup 
        bboxes.append(bbox)
        rboxes.append(rbox)
        cnts.append(cnt)
    

    The result is like this:

    rot_angle: -2.4540319442749023
    rot_angle: -1.8476102352142334
    

    在这里输入图像描述

    Because the blue rectangle tag in the source image, the card is splited into two sides. But a clean image will have no problem.


    I know it's been a while since I asked the question. I recently continued on the topic and solved my problem (although not through rectangle detection).

    Changes

  • Using wood to strengthen my controllers (the "rectangles") like below.
  • Placed 2 ArUco markers on each controller.
  • 调节器

    How it works

  • Convert the frame to grayscale,
  • downsample it (to increase performance during detection),
  • equalize the histogram using cv::equalizeHist ,
  • find markers using cv::aruco::detectMarkers ,
  • correlate markers (if multiple controllers),
  • analyze markers (position and rotation),
  • compute result and apply some error correction.
  • It turned out that the marker detection is very robust to lighting changes and different viewing angles which allows me to skip any calibration steps.

    I placed 2 markers on each controller to increase the detection robustness even more. Both markers has to be detected only one time (to measure how they correlate). After that, it's sufficient to find only one marker per controller as the other can be extrapolated from the previously computed correlation.

    Here is a detection result in a bright environment:

    在明亮的环境中进行检测

    in a darker environment:

    在黑暗环境中检测

    and when hiding one of the markers (the blue point indicates the extrapolated marker postition):

    缺失标记的检测

    Failures

    The initial shape detection that I implemented didn't perform well. It was very fragile to lighting changes. Furthermore, it required an initial calibration step.

    After the shape detection approach I tried SIFT and ORB in combination with brute force and knn matcher to extract and locate features in the frames. It turned out that mono colored objects don't provide much keypoints (what a surprise). The performance of SIFT was terrible anyway (ca. 10 fps @ 540p). I drew some lines and other shapes on the controller which resulted in more keypoints beeing available. However, this didn't yield in huge improvements.

    链接地址: http://www.djcxy.com/p/39512.html

    上一篇: python opencv中不规则的形状检测和测量

    下一篇: 使用OpenCV进行矩形检测/跟踪