How to understand the average number of pixels described in the frequency image?

I'm trying to implement the widely used fingerprint image enhancement algorithm proposed by Anil Jain et al. While implementing the steps for the ridge frequency image calculation in Section 2.5, I have difficulty understanding part of the description. The steps are as follows:

  • Obtain the normalized image G.
  • Divide G into blocks of size w × w (16 × 16).
  • For each block centered at pixel (i, j), compute an oriented window of size l × w (32 × 16) defined in the ridge coordinate system.
  • For each block centered at pixel (i, j), compute the x-signature X[0], X[1], ..., X[l−1] of the ridges and valleys within the oriented window, where
    X[k] = (1/w) · Σ_{d=0}^{w−1} G(u, v),  k = 0, 1, ..., l−1,
    u = i + (d − w/2) · cos O(i, j) + (k − l/2) · sin O(i, j),
    v = j + (d − w/2) · sin O(i, j) + (l/2 − k) · cos O(i, j)

    If no minutiae or singular points appear in the oriented window, the x-signature forms a discrete sinusoidal-shaped wave, which has the same frequency as the ridges and valleys in the oriented window. Therefore, the frequency of ridges and valleys can be estimated from the x-signature. Let T(i, j) be the average number of pixels between two consecutive peaks in the x-signature; the frequency is then computed as its reciprocal, 1/T(i, j).

    My question is: I don't understand how to get the average number of pixels between two consecutive peaks, since the paper doesn't say how to identify the peaks in the first place. So how do I decide which pixels are peaks so that I can count them? Could someone explain what I am missing here?

    Besides, I implemented the steps up to here using OpenCV as shown below. I would really appreciate it if someone could go through my steps and help me double-check that I am implementing them correctly:

    #include <opencv2/opencv.hpp>
    #include <cmath>
    #include <vector>

    void Enhancement::frequency(cv::Mat inputImage, cv::Mat orientationMat)
    {
        int blockSize = 16;   // w: block width
        int windowSize = 32;  // l: length of the oriented window

        // Compute the x-signature for each block centered at (i, j)
        for (int i = blockSize / 2; i < inputImage.rows - blockSize / 2; i += blockSize)
        {
            for (int j = blockSize / 2; j < inputImage.cols - blockSize / 2; j += blockSize)
            {
                float theta = orientationMat.at<float>(i, j);  // ridge orientation of this block
                std::vector<float> xSignature;

                for (int k = 0; k < windowSize; k++)
                {
                    float sum = 0.0f;

                    for (int d = 0; d < blockSize; d++)
                    {
                        // Sampling coordinates in the ridge coordinate system
                        int u = cvRound(i + (d - 0.5 * blockSize) * std::cos(theta)
                                          + (k - 0.5 * windowSize) * std::sin(theta));
                        int v = cvRound(j + (d - 0.5 * blockSize) * std::sin(theta)
                                          + (0.5 * windowSize - k) * std::cos(theta));

                        // Skip samples that fall outside the image
                        if (u < 0 || u >= inputImage.rows || v < 0 || v >= inputImage.cols)
                            continue;

                        sum += static_cast<float>(inputImage.at<uchar>(u, v));
                    }

                    xSignature.push_back(sum);
                }
            } // end of j-loop
        } // end of i-loop
    }
    

    Update

    After searching some articles, I found someone mention how to determine whether a pixel is a peak, like this:

  • Perform grayscale dilation on each block
  • Find where the dilation equals the original values
  • But still, I don't understand this clearly. Does that mean I can apply a block-wise morphological dilation to my grayscale image (I've already converted the image from RGB to grayscale in OpenCV before further processing)? And does "the dilation equals the original values" mean that a pixel's intensity after morphological dilation equals its original value? I'm lost here.
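
    To check my understanding, here is how I would translate those two steps into OpenCV (just a sketch of my reading; the 3 × 3 kernel size is my own guess, not something the articles specified):

    #include <opencv2/opencv.hpp>

    // Grayscale dilation replaces each pixel with the maximum of its
    // neighborhood, so a pixel whose value is unchanged by dilation is a
    // local maximum within the kernel footprint.
    cv::Mat localMaximaMask(const cv::Mat& gray)
    {
        cv::Mat dilated;
        cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
        cv::dilate(gray, dilated, kernel);

        cv::Mat mask;
        cv::compare(gray, dilated, mask, cv::CMP_GE);  // dilated >= gray everywhere, so GE means "equal"
        return mask;                                   // nonzero pixels are local maxima (plateaus included)
    }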


    I do not know the specific algorithm you are talking about, but maybe I can offer some general advice.

    I guess the core of the problem is the distinction between "what is a peak" and "what is just noise" in a noisy signal (real-world input images are always noisy in some sense; I think the relevant input vector for peak detection in your code is xSignature). Once you have determined the peaks, calculating an average peak distance should be fairly straightforward.

    As for peak detection, there are tons of papers describing quite sophisticated algorithms, but I'll outline some tried and true methods I'm using in my image processing job.

    Smoothing

    If you know the expected peak width w, you can, as a first step, apply some smoothing that gets rid of noise on a smaller scale by simply summing over a sliding window of about the expected peak width (from x − w/2 to x + w/2). You don't actually need to calculate the average value of the sliding window (i.e. divide by w), since for peak detection the absolute scale is irrelevant and the sum is proportional to the average value.
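
    A minimal sketch of that running-sum smoothing in plain C++ (the boundary handling, truncating the window at the signal edges, is my own choice):

    #include <vector>

    // Box smoothing: sum over a window of the expected peak width w.
    // No division by w -- only relative heights matter for peak detection.
    std::vector<float> smoothProfile(const std::vector<float>& signal, int w)
    {
        const int n = static_cast<int>(signal.size());
        std::vector<float> out(n, 0.0f);
        for (int x = 0; x < n; ++x)
        {
            float sum = 0.0f;
            for (int t = x - w / 2; t <= x + w / 2; ++t)
                if (t >= 0 && t < n)          // truncate the window at the boundaries
                    sum += signal[t];
            out[x] = sum;
        }
        return out;
    }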

    Min-Max-Identification

    You can run over your (potentially smoothed) profile vector and identify minimum and maximum indices (e.g. by a simple slope sign change). Store these positions in a map<int (coordinate), bool (isMax)> or a map<int (coordinate), double (value at coordinate)>. Or use a struct as the value type that holds all the relevant info (bool isMax, double value, bool isAtBoundary, ...).
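
    A bare-bones version of the slope-sign-change idea with the map<coordinate, isMax> layout could look like this (plateaus of equal neighboring values are ignored for brevity):

    #include <map>
    #include <vector>

    // Extremum detection by slope sign change: an index is a maximum if it
    // is strictly above both neighbors, a minimum if strictly below both.
    std::map<int, bool> findExtrema(const std::vector<float>& profile)
    {
        std::map<int, bool> extrema;  // coordinate -> isMax
        for (int x = 1; x + 1 < static_cast<int>(profile.size()); ++x)
        {
            if (profile[x] > profile[x - 1] && profile[x] > profile[x + 1])
                extrema[x] = true;    // local maximum
            else if (profile[x] < profile[x - 1] && profile[x] < profile[x + 1])
                extrema[x] = false;   // local minimum
        }
        return extrema;
    }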

    Evaluate quality of detected peaks

    For each maximum found in the previous step, determine the height difference (and maybe the slope) to both the previous and the following minimum, resulting in a quality measure. This step depends on your problem domain. Maybe "peaks" need not be framed by a minimum on both sides (in that case, your minimum detection above would have to be more sophisticated than a slope sign change). Maybe there are minimum or maximum width restrictions on peaks. And so on.

    Calculate a quality value based on the above questions for each maximum position. I often use something like Q_max = (average height difference from the maximum to its neighboring minima) / (max − min of the whole profile). A peak candidate's quality is then at most 1 and at least 0.

    Iterate over all your maxima, calculate their qualities, and put them into a multimap or some other container that can be sorted, so that you can later iterate over your peaks in descending quality.
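
    Sketched in code, the quality formula and the quality-ordered container might look like this (the helper names are mine; prevMinVal and nextMinVal are the values at the minima framing the maximum):

    #include <cstddef>
    #include <functional>
    #include <map>
    #include <vector>

    // Q_max = (average height difference from the max to its neighboring
    // minima) / (max - min of the whole profile), so 0 <= Q_max <= 1.
    float peakQuality(float maxVal, float prevMinVal, float nextMinVal,
                      float profileMin, float profileMax)
    {
        float range = profileMax - profileMin;
        if (range <= 0.0f)
            return 0.0f;              // flat profile: nothing qualifies as a peak
        float avgDrop = 0.5f * ((maxVal - prevMinVal) + (maxVal - nextMinVal));
        return avgDrop / range;
    }

    // Collect the maxima ordered by descending quality; a multimap keeps
    // candidates that happen to share the same quality value.
    std::multimap<float, int, std::greater<float>> rankPeaks(
        const std::vector<int>& maxCoords, const std::vector<float>& qualities)
    {
        std::multimap<float, int, std::greater<float>> ranked;  // quality -> coordinate
        for (std::size_t i = 0; i < maxCoords.size(); ++i)
            ranked.insert({qualities[i], maxCoords[i]});
        return ranked;
    }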

    Distinguish peaks from non-peaks

    Iterate over your peaks in descending quality. Possibly discard all that do not fulfill your problem domain's requirements for a peak: minimum or maximum width, height, quality, distance to the nearest higher-quality peak, and so on. Keep the rest. Done.

    In your case, you would then reorder the peaks by coordinate and calculate the average distance between them.
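
    Since the sum of the gaps between consecutive sorted coordinates telescopes to (last − first), the average distance, i.e. the T(i, j) from the paper, reduces to a one-liner; the block's frequency is then its reciprocal:

    #include <algorithm>
    #include <vector>

    // T(i, j): mean number of pixels between two consecutive peaks.
    // The sum of consecutive gaps telescopes to (last - first).
    float averagePeakDistance(std::vector<int> peakCoords)
    {
        if (peakCoords.size() < 2)
            return 0.0f;              // frequency is undefined with fewer than 2 peaks
        std::sort(peakCoords.begin(), peakCoords.end());
        return static_cast<float>(peakCoords.back() - peakCoords.front())
               / static_cast<float>(peakCoords.size() - 1);
    }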

    I know this is vague, but there are no universally true answers for peak detection. Maybe in the paper you are working with there is a specific prescription hidden somewhere, but most authors omit such "mere technicalities" (and typically, if you contact them via email they can't remember or otherwise reproduce how they did it, which renders their results basically irreproducible).
