Image recognition of a well-defined pattern from a changing viewing angle
PROBLEM
I have a picture taken from a swinging vehicle. For simplicity, I have converted it into a black-and-white image. An example is shown below:
The image shows the high-intensity returns, and the pattern found in all of the valid images is circled in red. The picture can be taken from multiple angles depending on the rotation of the vehicle. Another example is here:
The intention is to identify the image cells in which this pattern exists.
CURRENT APPROACHES
I have tried a couple of methods so far. I am using MATLAB for testing but will eventually implement the algorithm in C++. It should ideally be time efficient; however, I am interested in any suggestions.
SURF (Speeded Up Robust Features) Feature Recognition
I tried the default MATLAB implementation of SURF to find features. MATLAB SURF is able to identify features in two examples (not the same as above); however, it is not able to identify common ones:
I know that the points are different, but the pattern is still somewhat identifiable. I have tried multiple sets of pictures, and there are almost never common points. From reading about SURF, it seems it is not robust to skewed images anyway. Perhaps some recommendations on pre-processing here?
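For reference, the toolbox pipeline I am testing looks roughly like this (a minimal sketch with hypothetical variable names; bw1 and bw2 would hold two of the binary example images):

g1 = single(bw1);                      % detector expects a grayscale image
g2 = single(bw2);
pts1 = detectSURFFeatures(g1);
pts2 = detectSURFFeatures(g2);
[f1, vpts1] = extractFeatures(g1, pts1);
[f2, vpts2] = extractFeatures(g2, pts2);
pairs = matchFeatures(f1, f2);         % almost never returns common points
figure, showMatchedFeatures(g1, g2, vpts1(pairs(:, 1)), vpts2(pairs(:, 2)))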
Template Matching
Template matching was tried, but it is definitely not ideal for this application because it is not robust to scale or skew changes. I am open to pre-processing ideas to fix the skew; this could be quite easy, and some discussion of extra information available in the picture is provided further down.
For now, let's investigate template matching. Say we have the following two images as the template and the current image:
The template is chosen from one of the most forward-facing images, and using it on a very similar image we can match the position:
But then (somewhat obviously), if we change the picture to a different angle, it won't work. Of course we expect this, because the template no longer looks like the pattern in the image:
So we obviously need some pre-processing work here as well.
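For reference, the plain matching step looks roughly like this (a minimal sketch assuming normalized cross-correlation via normxcorr2, with hypothetical names scene and tmpl for the current image and the template):

c = normxcorr2(double(tmpl), double(scene));
[~, idx] = max(c(:));
[ypk, xpk] = ind2sub(size(c), idx);
% normxcorr2 returns the 'full' correlation, so offset by the template size
yTop = ypk - size(tmpl, 1) + 1;        % top-left row of the best match
xTop = xpk - size(tmpl, 2) + 1;        % top-left column of the best match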
Hough Lines and RANSAC
Hough lines and RANSAC might be able to identify the lines for us, but then how do we get the pattern position?
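To make the idea concrete, a minimal sketch of the standard Hough pipeline (assuming the binary image is in a hypothetical variable bw) would be:

[H, T, R] = hough(bw);
P = houghpeaks(H, 10);                 % up to 10 strongest lines
lines = houghlines(bw, T, R, P, 'FillGap', 5, 'MinLength', 20);
figure, imshow(bw), hold on
for k = 1:numel(lines)
    % the detected segments (and their intersections) could then be
    % used to localize the pattern corners
    xy = [lines(k).point1; lines(k).point2];
    plot(xy(:, 1), xy(:, 2), 'LineWidth', 2)
end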
Others that I don't know about yet
I am pretty new to the image processing scene, so I would love to hear about any other techniques that would suit this simple yet difficult image recognition problem.
The sensor and how it will help pre-processing
The sensor is a 3D laser; its output has been turned into an image for this experiment but still retains the distance information. If we plot with distance scaled from 0 to 255, we get the following image:
Lighter is further away. This could definitely help us align the image; any thoughts on the best way? So far I have thought of things like calculating the normals of the cells that are not 0. We could also do some sort of gradient descent or least-squares fitting such that the differences in distance go to 0, which would align the image so that it is always straight. The problem with that is that the solid white stripe is further away; maybe we could segment that out? We are building algorithms on top of our algorithms here, so we need to be careful that this doesn't become a monster.
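To make the least-squares idea concrete, a hypothetical sketch could fit a plane z = a*x + b*y + c to the nonzero depth pixels and read the surface normal off the coefficients (depth is a hypothetical name for the 0-255 range image):

[r, c] = find(depth > 0);              % pixel coordinates with valid depth
z = double(depth(depth > 0));
A = [c r ones(numel(z), 1)];
coef = A \ z;                          % least-squares fit, coef = [a; b; c]
n = [-coef(1); -coef(2); 1];
n = n / norm(n);                       % unit normal of the fitted plane
% the rotation mapping n onto the viewing axis would then de-skew the image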
Any help or ideas would be great, I am happy to look into any serious answer!
I came up with the following program to segment the regions and hopefully locate the pattern of interest using template matching. I've added some comments and figure titles to explain the flow and some resulting images. Hope it helps.
im = imread('sample.png');
gr = rgb2gray(im);
bw = im2bw(gr, graythresh(gr));        % Otsu threshold
bwsm = imresize(bw, .5);               % downsample to reduce computation
dism = bwdist(bwsm);                   % distance transform
dismnorm = dism/max(dism(:));          % normalize to [0, 1]
figure, imshow(dismnorm, []), title('distance transformed')
eq = histeq(dismnorm);                 % equalize to spread the distance values
eqcl = imclose(eq, ones(5));           % close small gaps
figure, imshow(eqcl, []), title('histogram equalized and closed')
eqclbw = eqcl < .2; % .2 worked for samples given
eqclbwcl = imclose(eqclbw, ones(5));
figure, imshow(eqclbwcl, []), title('binarized and closed')
filled = imfill(eqclbwcl, 'holes');
figure, imshow(filled, []), title('holes filled')
% -------------------------------------------------
% template
tmpl = zeros(16);
tmpl(3:4, 2:6) = 1;
tmpl(11:15, 13:14) = 1;
tmpl(3:10, 7:14) = 1;
st = regionprops(tmpl, 'orientation');
tmplAngle = st.Orientation;
% -------------------------------------------------
lbl = bwlabel(filled);
stats = regionprops(lbl, 'BoundingBox', 'Area', 'Orientation');
figure, imshow(label2rgb(lbl), []), title('labeled')
% here I just take the largest contour for convenience. should consider aspect ratio and any
% other features that can be used to uniquely identify the shape
[mx, id] = max([stats.Area]);
mxbb = stats(id).BoundingBox;
% resize and rotate the template
tmplre = imresize(tmpl, [mxbb(4) mxbb(3)]);
tmplrerot = imrotate(tmplre, stats(id).Orientation-tmplAngle);
xcr = xcorr2(double(filled), double(tmplrerot));
figure, imshow(xcr, []), title('template matching')
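To read off the best match position, one could take the peak of the correlation surface (a sketch; xcorr2 returns the 'full' correlation, so the peak has to be offset by the template size):

[~, idx] = max(xcr(:));
[ypk, xpk] = ind2sub(size(xcr), idx);
yTop = ypk - size(tmplrerot, 1) + 1;   % top-left row of the match in 'filled'
xTop = xpk - size(tmplrerot, 2) + 1;   % top-left column of the match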
Resized image:
Segmented:
Template matching:
Given the poor image quality (low resolution + binarization), I would prefer template matching because it is based on a simple global measure of similarity and does not attempt to do any feature extraction (there are no reliable features in your samples).
But you will need to apply template matching with rotation. One way is to precompute rotated instances of the template, perform matching for every angle, and keep the best.
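A minimal sketch of that sweep, assuming normxcorr2 for the per-angle comparison and hypothetical scene and tmpl binary images:

best = struct('score', -Inf, 'angle', 0, 'row', 1, 'col', 1);
for ang = 0:5:355                      % 5-degree steps; tune as needed
    rt = imrotate(tmpl, ang, 'nearest', 'loose');
    c = normxcorr2(double(rt), double(scene));
    [score, idx] = max(c(:));
    if score > best.score              % keep the best angle and position
        [ypk, xpk] = ind2sub(size(c), idx);
        best = struct('score', score, 'angle', ang, ...
                      'row', ypk - size(rt, 1) + 1, ...
                      'col', xpk - size(rt, 2) + 1);
    end
end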
It is possible to integrate depth information in the comparison (if that helps).
This is quite similar to the problem of recognising hand-sketched characters that we tackle in our lab, in the sense that the target pattern is binary, low resolution, and liable to moderate deformation.
Based on our experience, I don't think SURF is the right way to go: as pointed out elsewhere, it assumes a continuous 2D image, not a binary one, and will break in your case. Template matching is not good for this kind of binary image either: your pixels need to be only slightly misaligned to return a low match score, as there is no local spatial coherence in the pixel values to mitigate minor misalignments of the window.
Our approach in this scenario is to "convert" the binary image into a continuous or "greyscale" image. For example, see below:
These conversions are made by running a 1st-derivative edge detector, e.g. convolving the 3x3 template [0 0 0; 1 0 -1; 0 0 0] and its transpose over image I to get dI/dx and dI/dy. At any pixel we can get the edge orientation atan2(dI/dy, dI/dx) from these two fields. We treat this information as known at the sketched pixels (the white pixels in your problem) and unknown at the black pixels. We then use a Laplacian smoothness assumption to extrapolate values for the black pixels from the white ones. Details are in this paper:
http://personal.ee.surrey.ac.uk/Personal/J.Collomosse/pubs/Hu-CVIU-2013.pdf
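A minimal sketch of the orientation-field step described above (the Laplacian extrapolation from the paper is omitted here; I is the binary input image):

kx = [0 0 0; 1 0 -1; 0 0 0];           % 1st-derivative kernel, x direction
dIdx = conv2(double(I), kx, 'same');
dIdy = conv2(double(I), kx', 'same');
theta = atan2(dIdy, dIdx);             % edge orientation at each pixel
theta(~I) = NaN;                       % unknown at the black pixels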
If this is a major hassle, you could try using a distance transform instead (convenient in MATLAB via bwdist), but it won't give results that are as accurate.
Now we have the "continuous" image (as per the right-hand column of images above). The greyscale patterns encode the local structure in the image and are much more amenable to gradient-based descriptors like SURF and to template matching.
My hunch would be to try template matching first, but since this is affine-sensitive I would go the whole way and use a HOG/Bag of Visual Words approach, again just as in our paper above, to match those patterns.
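As a rough illustration only: HOG descriptors could be pulled from the continuous image with the toolbox routine extractHOGFeatures (cont is a hypothetical name for the continuous image from the previous step; descriptors of the target pattern and of candidate windows could then be compared directly):

[hogFeat, visu] = extractHOGFeatures(cont, 'CellSize', [8 8]);
figure, imshow(cont), hold on
plot(visu)                             % overlay the HOG visualization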
We have found this pipeline to give state-of-the-art results in sketch-based shape recognition, and my PhD student has successfully used it in subsequent work for matching hieroglyphs, so I think it could have a good shot at working on the kind of pattern you pose in your example images.