Parallelize function using OpenMP
I'm trying to run code in parallel, but I'm confused by the private/shared clauses and related OpenMP concepts. I'm using C++ (MSVC 12 or GCC) with OpenMP.
The code runs a loop whose body consists of a block that should run in parallel, followed by a block that should run once all the parallel work is done. The order in which the parallel work is processed doesn't matter. The code looks like this:
    // X, M, N, Y, Z are constant values
    const int processes = 4;
    std::vector<double> vct(X);
    std::vector<std::vector<double> > stackVct(processes, std::vector<double>(Y));
    std::vector<std::vector<std::string> > files(processes, std::vector<std::string>(M));
    for(int i=0; i < N; ++i)
    {
        // parallel stuff
        for(int process = 0; process < processes; ++process)
        {
            std::vector<double> &otherVct = stackVct[process];
            const std::vector<std::string> &my_files = files[process];
            for(int file = 0; file < my_files.size(); ++file)
            {
                // vct is read-only here, the value is not modified
                doSomeOtherStuff(otherVct, vct);
                // my_files[file] is read-only
                std::vector<double> thirdVct(Y);
                doSomeOtherStuff(my_files[file], thirdVct);
                // thirdVct and vct are read-only
                doSomeOtherStuff2(thirdVct, otherVct, vct);
            }
        }
        // when all the parallel stuff is done, do this job
        // single thread stuff
        // stackVct is read-only, vct is modified
        doSingleTheadStuff(vct, stackVct);
    }
If it is better for performance, doSingleTheadStuff(...) can be moved into the parallel loop, but it must be executed by a single thread. The order of the function calls in the innermost loop cannot be changed.
How should I write the #pragma omp directives to make this work? Thanks!
To run a for loop in parallel, just put #pragma omp parallel for above the for loop statement. By default, variables declared outside the for loop are shared by all the threads, and variables declared inside the for loop are private to each thread.
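As a minimal sketch of those defaults (addVectors is a hypothetical helper, not something from the question), the vectors declared outside the loop are shared, while the temporary declared inside the loop body is private to each thread:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for illustration: element-wise sum of two vectors.
std::vector<double> addVectors(const std::vector<double> &a,
                               const std::vector<double> &b)
{
    assert(a.size() == b.size());

    // Declared outside the loop: shared by all threads.
    std::vector<double> result(a.size());
    const int n = static_cast<int>(a.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
    {
        // Declared inside the loop: every thread works on its own copy.
        const double sum = a[i] + b[i];
        // Each iteration writes a different element, so no locking is needed.
        result[i] = sum;
    }
    return result;
}
```

If the compiler is not run with OpenMP enabled (-fopenmp for GCC, /openmp for MSVC), the pragma is ignored and the loop simply runs serially with the same result.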
Note that if you are doing file IO in parallel you may not see much speedup (next to none if all you are doing is file IO) unless at least some of the files reside on different physical hard drives.
Maybe something like this (mind you this is just a sketch, I did not verify it but you can get the idea):
    // X, M, N, Y, Z are constant values
    const int processes = 4;
    std::vector<double> vct(X);
    std::vector<std::vector<double> > stackVct(processes, std::vector<double>(Y));
    std::vector<std::vector<std::string> > files(processes, std::vector<std::string>(M));
    for(int i=0; i < N; ++i)
    {
        // parallel stuff
        #pragma omp parallel firstprivate(vct, files) shared(stackVct)
        {
            #pragma omp for
            for(int process = 0; process < processes; ++process)
            {
                std::vector<double> &otherVct = stackVct[process];
                const std::vector<std::string> &my_files = files[process];
                for(int file = 0; file < my_files.size(); ++file)
                {
                    // vct is read-only here, the value is not modified
                    doSomeOtherStuff(otherVct, vct);
                    // my_files[file] is read-only
                    std::vector<double> thirdVct(Y);
                    doSomeOtherStuff(my_files[file], thirdVct);
                    // thirdVct and vct are read-only
                    doSomeOtherStuff2(thirdVct, otherVct, vct);
                }
            }
            // the omp for above ends with an implicit barrier,
            // so all the parallel stuff is done at this point
            // single thread stuff
            // stackVct is read-only, vct is modified
            // note: vct is firstprivate here, so this call updates
            // the executing thread's private copy
            #pragma omp single nowait
            doSingleTheadStuff(vct, stackVct);
        }
    }
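I marked vct and files as firstprivate because they are read-only and I assumed they should not be modified, so each thread gets its own copy of these variables. stackVct is marked as shared among all threads because they modify it. Finally, only one thread executes the doSingleTheadStuff function, without forcing the other threads to wait.
One caveat: because vct is firstprivate, the changes doSingleTheadStuff makes inside the parallel region land in that thread's private copy and are discarded when the region ends. If the next iteration of the outer loop needs the updated vct, leave vct shared instead (concurrent reads are safe) or call doSingleTheadStuff right after the parallel region.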
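Here is a variant that keeps vct shared, since it is only read inside the parallel loop, and runs the single-threaded step after the parallel loop so its update to vct is seen by the next outer iteration. This is only a sketch under assumptions: accumulateInto and mergeSlots are hypothetical stand-ins for doSomeOtherStuff and doSingleTheadStuff, and their arithmetic is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-process work: reads vct, writes only its own slot.
void accumulateInto(std::vector<double> &slot, const std::vector<double> &vct)
{
    assert(slot.size() == vct.size());
    for (std::size_t k = 0; k < slot.size(); ++k)
        slot[k] = vct[k] + 1.0;
}

// Hypothetical stand-in for doSingleTheadStuff: reads every slot, modifies vct.
void mergeSlots(std::vector<double> &vct,
                const std::vector<std::vector<double> > &slots)
{
    for (std::size_t k = 0; k < vct.size(); ++k)
    {
        double sum = 0.0;
        for (std::size_t p = 0; p < slots.size(); ++p)
            sum += slots[p][k];
        vct[k] = sum / static_cast<double>(slots.size());
    }
}

std::vector<double> runIterations(int iterations, int processes, std::size_t length)
{
    std::vector<double> vct(length, 0.0);
    std::vector<std::vector<double> > slots(processes, std::vector<double>(length));

    for (int i = 0; i < iterations; ++i)
    {
        // vct and slots are shared by default. vct is only read here, and
        // each iteration writes a different element of slots.
        #pragma omp parallel for
        for (int process = 0; process < processes; ++process)
            accumulateInto(slots[process], vct);

        // The parallel loop ends with an implicit barrier, so every slot is
        // complete here. Only the thread that entered the loop continues.
        mergeSlots(vct, slots);
    }
    return vct;
}
```

With these stand-ins, runIterations(3, 4, 5) returns five elements equal to 3.0, whether or not OpenMP is enabled at compile time. Compared with single nowait inside the region, this reopens a parallel region on every outer iteration; most runtimes reuse their thread pool, so whether that matters is something to measure.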