How to see how many nodes a process is using on a cluster with Sun Grid Engine?

I am (trying to) run R on a multicore computing cluster managed by Sun Grid Engine. I would like to run R in parallel using the MPI environment and the snow / snowfall parLapply() functions. My code works on my laptop, but to make sure it also does what it is supposed to on the cluster, I have the following questions.

If I request a number of slots / nodes, say 4, how can I check whether a running process actually uses the full number of requested CPUs? Is there a command that can show details about the CPU usage on the requested nodes for a process?


In order to verify that the cluster workers really started on the appropriate nodes, I often use the following command right after creating the cluster object:

clusterEvalQ(cl, Sys.info()['nodename'])

This should match the list of allocated nodes reported by the qstat command.
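For concreteness, here is a minimal sketch of that check, assuming the job was submitted to an MPI parallel environment (the PE name is site-specific) and that Rmpi is installed on the nodes:

library(snow)                              # or snowfall: sfInit(parallel = TRUE, cpus = 4, type = "MPI")

cl <- makeCluster(4, type = "MPI")         # one worker per requested slot
clusterEvalQ(cl, Sys.info()["nodename"])   # host name each worker actually started on
# ... parLapply(cl, ...) work goes here ...
stopCluster(cl)

The host names returned should line up with the nodes qstat reports for the job; on many SGE installations, qstat -g t shows the per-host slot allocation.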

To actually get details on the CPU usage, I often ssh to each node and use commands like top and ps, but that can be painful if there are many nodes to check. We have the Ganglia monitoring system set up on our clusters, so I can use Ganglia's web interface to check various node statistics. You might want to check with your system administrators to see if they have set anything up for monitoring.
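As a rough alternative to logging in to every node, you can ask the workers themselves to report what their host is doing. A sketch, assuming the workers are allowed to run shell commands (uptime and ps here are standard Linux tools; output format varies by system, and the worker process name may not be "R" under every MPI launcher):

# Each worker reports its host name, the node's load average, and the R processes it can see.
clusterEvalQ(cl, list(
  node  = Sys.info()["nodename"],                             # which host this worker runs on
  load  = system("uptime", intern = TRUE),                    # node load average
  rproc = system("ps -C R -o pid,pcpu,comm", intern = TRUE)   # R processes on this node
))

If all requested slots are really being used, each allocated node should show R worker processes near 100% CPU while parLapply() is running.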
