Mandelbrot optimization in openmp

Well i have to paralellisize the mandelbrot program in C. I think i have done it well and i cant get better times. My question if someone has an idea to improve the code, ive been thinking perhaps in nested parallel regions between the outer and insider for...

Also i have doubts if its more elegant or recommended to put all the pragmas in a single line or to write separate pragmas ( one for omp parallel and shared and private variables and a conditional, and another pragma with omp for and schedule dynamic).

Ive the doubt if constants can be used as private variables because i think is cleaner to have constants instead of defined variables.

Also i have written a conditional ( if numcpu >1) it has no sense to use parallel region and make a normal sequential execution.

Finally as i have read the dynamic chunk it depends on hardware and your system configuration... so i have left it as a constant, so it can be easily changed.

Also i adapt the number of threads to the number of processors available..

 int main(int argc, char *argv[])
{
    omp_set_dynamic(1);

    int xactual, yactual;

    //each iteration, it calculates: newz = oldz*oldz + p, where p is the current pixel, and oldz stars at the origin
    double pr, pi;                   //real and imaginary part of the pixel p
    double newRe, newIm, oldRe, oldIm;   //real and imaginary parts of new and old z
    double zoom = 1, moveX = -0.5, moveY = 0; //you can change these to zoom and change position

    pixel_t *pixels = malloc(sizeof(pixel_t)*IMAGEHEIGHT*IMAGEWIDTH);
    clock_t begin, end;
    double time_spent;    

    begin=clock();

    int numcpu;
    numcpu = omp_get_num_procs();

    //FILE * fp;
    printf("El número de procesadores que utilizaremos es: %d", numcpu);

    omp_set_num_threads(numcpu);

    #pragma omp parallel shared(pixels, moveX, moveY, zoom) private(xactual, yactual, pr, pi, newRe, newIm) (if numcpu>1)
    {
        //int xactual=0;
    //  int yactual=0;
        #pragma omp for  schedule(dynamic, CHUNK)       

    //loop through every pixel
        for(yactual = 0; yactual < IMAGEHEIGHT; yactual++)
            for(xactual = 0; xactual < IMAGEWIDTH; xactual++)
            {
                //calculate the initial real and imaginary part of z, based on the pixel location and zoom and position values
            pr = 1.5 * (xactual - IMAGEWIDTH / 2) / (0.5 * zoom * IMAGEWIDTH) + moveX;
            pi = (yactual - IMAGEHEIGHT / 2) / (0.5 * zoom * IMAGEHEIGHT) + moveY;
            newRe = newIm = oldRe = oldIm = 0; //these should start at 0,0
            //"i" will represent the number of iterations
            int i;
            //start the iteration process
            for(i = 0; i < ITERATIONS; i++)
            {
                //remember value of previous iteration
                oldRe = newRe;
                oldIm = newIm;
                //the actual iteration, the real and imaginary part are calculated
                newRe = oldRe * oldRe - oldIm * oldIm + pr;
                newIm = 2 * oldRe * oldIm + pi;
                //if the point is outside the circle with radius 2: stop
                if((newRe * newRe + newIm * newIm) > 4) break;
            }

            //            color(i % 256, 255, 255 * (i < maxIterations));
            if(i == ITERATIONS)
            {
                //color(0, 0, 0); // black
                pixels[yactual*IMAGEWIDTH+xactual][0] = 0;
                pixels[yactual*IMAGEWIDTH+xactual][1] = 0;
                pixels[yactual*IMAGEWIDTH+xactual][2] = 0;
            }
            else
            {
                double z = sqrt(newRe * newRe + newIm * newIm);
                int brightness = 256 * log2(1.75 + i - log2(log2(z))) / log2((double)ITERATIONS);

                //color(brightness, brightness, 255)
                pixels[yactual*IMAGEWIDTH+xactual][0] = brightness;
                pixels[yactual*IMAGEWIDTH+xactual][1] = brightness;
                pixels[yactual*IMAGEWIDTH+xactual][2] = 255;
            }      

      }

    }   //end of parallel region

    end= clock();

    time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    fprintf(stderr, "Elapsed time: %.2lf seconds.n", time_spent);

You could extend the implementation to leverage SIMD extensions. As far as I know the latest OpenMP standard includes vector constructs. Check out this article that describes the new capabilities.

This whitepaper explains how SSE3 can be used when calculating the Mandelbrot set.

链接地址: http://www.djcxy.com/p/79224.html

上一篇: c#如何将float转换为int

下一篇: 在openmp中进行Mandelbrot优化