Use stack or heap for MPI program variables

2018-06-28 08:53:38

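I recently started working with Intel MPI to parallelize a very simple flow solver. I am wondering whether I should store my variables on the stack (i.e. declare them with datatype name[size];) or on the heap (i.e. use datatype *name = (datatype *)malloc(size*sizeof(datatype));).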
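
For concreteness, here is a minimal sketch of the two declaration styles. The names sum_field, sum_on_stack and sum_on_heap are made up for illustration and are not part of the solver; the stack variant assumes a compiler that supports C99 variable-length arrays.

```c
#include <stdlib.h>

/* Hypothetical kernel: it receives a plain pointer, so it does not
   depend on where the array was allocated. */
double sum_field(const double *x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Stack variant: C99 variable-length array, released automatically
   on return. n must be positive and small enough to fit on the stack. */
double sum_on_stack(int n)
{
    double x[n];
    for (int i = 0; i < n; i++)
        x[i] = 1.0;
    return sum_field(x, n);
}

/* Heap variant: malloc/free, so the size is limited only by available
   memory. Returns -1.0 if the allocation fails. */
double sum_on_heap(int n)
{
    double *x = malloc((size_t)n * sizeof *x);
    if (x == NULL)
        return -1.0;
    for (int i = 0; i < n; i++)
        x[i] = 1.0;
    double s = sum_field(x, n);
    free(x);
    return s;
}
```

In both variants the compute loop sees the same contiguous block of doubles; only the allocation and release differ.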

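At first I used malloc, because I split the flow field into n parts, where n is the number of processes created, and I thought it would be nice to use the same code for all values of n. This means that the sizes of my arrays are only known at run time, so I obviously need dynamic memory allocation. So far so good.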
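
The run-time sizing described above can be sketched as follows. This is only an illustration under assumptions: block_range and alloc_field are hypothetical helpers, each rank is assumed to compute its own range locally (the actual program has rank 0 split the domain and send the pieces), and leftover points are assumed to go to the lowest ranks.

```c
#include <stdlib.h>

/* Hypothetical helper: splits `total` grid points over `numProcs` ranks.
   The first (total % numProcs) ranks receive one extra point. */
void block_range(int total, int numProcs, int rank, int *start, int *length)
{
    int base = total / numProcs;
    int rest = total % numProcs;
    *length = base + (rank < rest ? 1 : 0);
    *start  = rank * base + (rank < rest ? rank : rest);
}

/* Allocates one zero-initialized local field of `length` doubles.
   Returns NULL on failure; the caller must free() the result. */
double *alloc_field(int length)
{
    return calloc((size_t)length, sizeof(double));
}
```

With 10 points on 4 ranks, for example, ranks 0 and 1 get 3 points each and ranks 2 and 3 get 2 each, so the same code works for any process count.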

But the dynamic allocation made the whole program very slow, even slower than solving the problem serially.

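I changed my program to use array declarations and got the expected speedup. But now my program cannot adapt to different starting conditions (e.g. number of processes, size of the flow field, number of grid points, ...).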

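Can anybody suggest what the common practice is to resolve this dilemma? Obviously there are many flow solvers out there that perform very well and can still adapt to their starting conditions.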

Thanks a lot!

Edit: I tried to simplify my code (it is not an MWE, though):

int main(int argc, char** argv)
{
    int rank, numProcs, start, length, left, right;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);

    if (rank==0)
    {
        split_and_send_domain(&start, &length, &left, &right, numProcs);

        // allocate memory for arrays like velocity, temperature,...
        double x1[length]; // double *x1=(double *)malloc(length*sizeof(double));
        double x2[length]; // double *x2=(double *)malloc(length*sizeof(double));
        ...
        double xn[length]; // double *xn=(double *)malloc(length*sizeof(double));

        // initialize variables like local residual, global residual, iteration step,...
        int res = 1, resGlob = 1, iter = 0, ...;
        int keepOn = 1;

        setupCalculation(start, length, left, right, x1, x2, ...); // initializes the arrays

        MPI_Barrier(MPI_COMM_WORLD);

        while (keepOn){
            iter++;
            pass_boundaries(left, right, length, utilde, rank);
            do_calculation(length, x1, x2, ...);
            calc_own_residual(length, x1, x2, ...);
            calc_glob_residual(&resGlob, res);

            if (iter>=maxiter || resGlob<1e-8)  keepOn = 0;

            MPI_Bcast(&keepOn, 1, MPI_INT, 0, MPI_COMM_WORLD);
            MPI_Barrier(MPI_COMM_WORLD);
        }

        /* gather results & do some final calculations & output*/
    }
    else
    {
        receive_domain(&start, &length, &left, &right);

        // allocate memory for arrays like velocity, temperature,...
        double x1[length]; // double *x1=(double *)malloc(length*sizeof(double));
        double x2[length]; // double *x2=(double *)malloc(length*sizeof(double));
        ...
        double xn[length]; // double *xn=(double *)malloc(length*sizeof(double));

        // initialize variables like residual, iteration step,...
        int res = 1;
        int keepOn = 1;

        setupCalculation(start, length, left, right, x1, x2, ...); // initializes the arrays

        MPI_Barrier(MPI_COMM_WORLD);

        while (keepOn){
            pass_boundaries(left, right, length, utilde, rank);
            do_calculation(length, x1, x2, ...);
            calc_own_residual(length, x1, x2, ...);
            calc_glob_residual(&resGlob, res);

            MPI_Bcast(&keepOn, 1, MPI_INT, 0, MPI_COMM_WORLD);
            MPI_Barrier(MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
}

When using dynamic allocation, int length is additionally calculated at the start; otherwise it is set as a global constant variable.
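
To illustrate the dynamic variant, where length is only known at run time, the arrays could be bundled and allocated once before the iteration loop, roughly as sketched below. Fields, fields_alloc and fields_free are hypothetical names that do not appear in the program above, and only two of the n arrays are shown.

```c
#include <stdlib.h>

/* Hypothetical container for the local arrays of one process. */
typedef struct {
    int length;
    double *x1;
    double *x2;
} Fields;

/* Allocates all arrays once for a run-time `length`.
   Returns 0 on success, -1 on failure (nothing stays allocated). */
int fields_alloc(Fields *f, int length)
{
    f->length = length;
    f->x1 = malloc((size_t)length * sizeof *f->x1);
    f->x2 = malloc((size_t)length * sizeof *f->x2);
    if (f->x1 == NULL || f->x2 == NULL) {
        free(f->x1);   /* free(NULL) is a no-op */
        free(f->x2);
        f->x1 = f->x2 = NULL;
        f->length = 0;
        return -1;
    }
    return 0;
}

/* Releases the arrays after the loop and resets the struct. */
void fields_free(Fields *f)
{
    free(f->x1);
    free(f->x2);
    f->x1 = f->x2 = NULL;
    f->length = 0;
}
```

Allocating before the while loop and freeing after it keeps malloc and free out of the iteration itself.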
