MPI programming model without GPUDirect

I am using a GPU cluster without GPUDirect support. According to this briefing, the following steps occur when transferring GPU data across nodes:

  • GPU writes to pinned sysmem1
  • CPU copies from sysmem1 to sysmem2
  • InfiniBand driver copies from sysmem2

Now I am not sure whether the second step happens implicitly when I send data from sysmem1 across InfiniBand using MPI. Assuming it does, my current programming model is something like this (a full sketch follows the list):

  • cudaMemcpy(hostmem, devicemem, size, cudaMemcpyDeviceToHost).
  • MPI_Send(hostmem,...)

Is my assumption above correct, and will this programming model work without causing communication issues?
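
For concreteness, here is a minimal end-to-end sketch of that pattern; the two-rank setup, buffer names, and element count are illustrative assumptions, not part of the pipeline description above:

    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        const int N = 1 << 20;  /* element count, chosen only for illustration */
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        float *d_buf, *h_buf;
        cudaMalloc((void **)&d_buf, N * sizeof(float));
        /* Pinned host memory (the "sysmem1" above); plain malloc also works,
           but then cudaMemcpy stages through an internal pinned buffer. */
        cudaMallocHost((void **)&h_buf, N * sizeof(float));

        if (rank == 0) {
            /* Step 1: explicit device-to-host copy. */
            cudaMemcpy(h_buf, d_buf, N * sizeof(float), cudaMemcpyDeviceToHost);
            /* Step 2: ordinary MPI send of the host buffer; any further copy
               into InfiniBand-registered memory happens inside the MPI library. */
            MPI_Send(h_buf, N, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* Receiver mirrors the pattern: MPI_Recv, then host-to-device copy. */
            MPI_Recv(h_buf, N, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            cudaMemcpy(d_buf, h_buf, N * sizeof(float), cudaMemcpyHostToDevice);
        }

        cudaFreeHost(h_buf);
        cudaFree(d_buf);
        MPI_Finalize();
        return 0;
    }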


    Yes, you can use CUDA and MPI independently (i.e., without GPUDirect), just as you describe:

  • Move the data from device to host
  • Transfer the data as you ordinarily would, using MPI
  • You might be interested in this presentation, which explains CUDA-aware MPI and gives a side-by-side example on slide 11 of regular MPI versus CUDA-aware MPI (echoed in the sketch after this list)
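
    For reference, a minimal sketch of that side-by-side idea, assuming a CUDA-aware MPI build (for example, a suitably configured Open MPI or MVAPICH2); the function and variable names here are illustrative:

        #include <mpi.h>
        #include <cuda_runtime.h>

        /* Regular MPI: stage the device buffer through host memory first. */
        void send_regular(float *d_buf, float *h_buf, int count, int dest)
        {
            cudaMemcpy(h_buf, d_buf, count * sizeof(float), cudaMemcpyDeviceToHost);
            MPI_Send(h_buf, count, MPI_FLOAT, dest, 0, MPI_COMM_WORLD);
        }

        /* CUDA-aware MPI: pass the device pointer straight to MPI; the
           library stages the data internally. */
        void send_cuda_aware(float *d_buf, int count, int dest)
        {
            MPI_Send(d_buf, count, MPI_FLOAT, dest, 0, MPI_COMM_WORLD);
        }

    With CUDA-aware MPI, any staging the hardware still requires is handled inside the library, and with GPUDirect it can be avoided entirely.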
