Why is a nested OpenMP program taking more time to execute?

My OpenMP matrix multiplication program, which uses nested parallel for loops, takes longer to execute than the non-nested version of the parallel program. This is the block where I have used nested parallelization:

#pragma omp parallel

omp_set_nested(1);

#pragma omp parallel for
for(i=0;i<N;i++) {
    #pragma omp parallel for
    for(j=0;j<N;j++) {
        C[i][j]=0.; // set initial value of resulting matrix C = 0
        #pragma omp parallel for
        for(m=0;m<N;m++) {
            C[i][j]=A[i][m]*B[m][j]+C[i][j];
        }
        printf("C:i=%d j=%d %f \n",i,j,C[i][j]);
    }
}
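
For comparison, the non-nested version is not shown here. A minimal sketch of what such a version might look like follows, with only the outer loop parallelized. The function name `matmul_outer_parallel`, the size `N = 4`, and the local accumulator `sum` are assumptions made for illustration, not the actual code being timed.

```c
#define N 4  /* hypothetical size, chosen only for illustration */

/* Hypothetical non-nested variant: one parallel region over the rows of C.
   Each thread handles whole rows, so no two threads write the same C[i][j]. */
void matmul_outer_parallel(double A[N][N], double B[N][N], double C[N][N])
{
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double sum = 0.0;  /* declared inside the loop, so private to each thread */
            for (int m = 0; m < N; m++) {
                sum += A[i][m] * B[m][j];
            }
            C[i][j] = sum;
        }
    }
}
```

The loop indices are declared inside the loops so that they are private to each thread. Compiled without OpenMP support, the pragma is ignored and the function runs serially with the same result.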