OpenMP with nested loops

2018-06-28 08:45:22

I have a few functions that must be applied in sequence to a matrix of structures. For a single thread I use the following code:

for(int t = 0; t < maxT; ++t)
{
    for(int i = 0; i < maxI; ++i)
        for(int j = 0; j < maxJ; ++j)
            function1(i, j);

    for(int i = 0; i < maxI; ++i)
        for(int j = 0; j < maxJ; ++j)
            function2(i, j);
}

Now I'm trying to parallelize that code:

#pragma omp parallel
{
    for(int t = 0; t < maxT; ++t)
    {
        #pragma omp single
        function3(); // call this function once (once for each iteration of t)
        #pragma omp for
        for(int i = 0; i < maxI; ++i)
            for(int j = 0; j < maxJ; ++j)
                function1(i, j);

        #pragma omp for
        for(int i = 0; i < maxI; ++i)
            for(int j = 0; j < maxJ; ++j)
                function2(i, j);
    }
}

Is it correct? Does it reuse the threads, i.e. avoid creating a new thread team on every iteration of the main loop?

Update: The explicit barrier really is unnecessary.

Actually, it seems I was confused when I asked this question: the code example works properly. Now the question is: is it possible to call a function (the commented line in the code) only once inside the #pragma omp parallel region, rather than having every thread call function3 on every iteration? There is #pragma omp atomic for increments and a few other simple updates, but what if I want a single call of an arbitrary function (or, more generally, a single execution of a block of code)?

Regarding Mark's comment: I assume that I will handle data races in my parallelized functions myself. The only question here is: are STL containers simply not thread-safe when used with OpenMP? I.e., if I want to push_back() into a std::list from several threads, do I still need to lock that list manually?
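To illustrate the locking question: OpenMP does not make standard containers safe for concurrent modification, so a shared std::list still needs external synchronization. Below is a minimal sketch, assuming a hypothetical helper collect_even() (not part of the original code) that guards push_back() with #pragma omp critical.

```cpp
#include <list>

// Hypothetical helper for illustration: collects the even numbers in [0, n)
// into one shared std::list from several threads.
std::list<int> collect_even(int n)
{
    std::list<int> result; // shared by all threads of the team

    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
    {
        if (i % 2 == 0)
        {
            // Only one thread at a time may execute the next statement,
            // so the list is never modified concurrently.
            #pragma omp critical
            result.push_back(i);
        }
    }
    return result; // element order depends on thread scheduling
}
```

The critical section serializes the insertions, so this only pays off when the rest of the loop body does enough work. Collecting into one list per thread and merging afterwards is a common alternative.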

Update 2: I've found that to run a single action inside a parallel region, you need #pragma omp single. So this question is closed.
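A small sketch of that pattern, with a counter standing in for function3 so the effect can be checked. The function name run_steps and the cells vector are assumptions made up for this example.

```cpp
#include <vector>

// Hypothetical helper for illustration: runs maxT time steps over `cells`
// and returns how many times the single block was executed.
int run_steps(int maxT, std::vector<int>& cells)
{
    const int n = static_cast<int>(cells.size());
    int singleCalls = 0; // shared; written only inside the single block

    #pragma omp parallel
    {
        // t is declared inside the region, so each thread has its own copy.
        for (int t = 0; t < maxT; ++t)
        {
            #pragma omp single
            {
                ++singleCalls; // executed by exactly one thread per value of t
            } // implicit barrier: the other threads wait here

            #pragma omp for
            for (int i = 0; i < n; ++i)
                cells[i] += 1; // iterations are divided among the threads
        } // the implicit barrier after the loop separates the time steps
    }
    return singleCalls;
}
```

With maxT = 5, the counter ends at 5 regardless of the number of threads, and every element of cells has been incremented 5 times.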

Yes, this will create one parallel region in which every thread iterates t over the outer loop, and the iterations of the i loops are split up among the threads.

Note that a #pragma omp for has an implicit barrier at the end of it, so there is no need to also write an explicit barrier. This implicit barrier can be removed using the nowait clause (i.e. #pragma omp for nowait), but only do that when the code that follows does not depend on results of the loop.
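As a rough sketch of where nowait is safe, assume two loops that write to unrelated arrays. The function fill_independent and both vectors are hypothetical and not taken from the question. Because the second loop never reads what the first one writes, threads may move on without waiting.

```cpp
#include <vector>

// Hypothetical helper for illustration: fills two unrelated arrays
// inside one parallel region.
void fill_independent(std::vector<int>& a, std::vector<int>& b)
{
    const int na = static_cast<int>(a.size());
    const int nb = static_cast<int>(b.size());

    #pragma omp parallel
    {
        // nowait drops the implicit barrier: a thread that finishes its
        // share of this loop proceeds directly to the next loop.
        #pragma omp for nowait
        for (int i = 0; i < na; ++i)
            a[i] = i * 2;

        // This loop keeps its implicit barrier, so both arrays are
        // complete once the parallel region has ended.
        #pragma omp for
        for (int i = 0; i < nb; ++i)
            b[i] = i + 1;
    }
}
```

In the code from the question, function2 presumably relies on the results of function1, so the barrier between those two loops should stay.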
