How can I directly pass a process from local R to an Amazon EC2 instance?
I've been looking into running R on EC2, but I'm wondering what the deal is with parallel/cluster computing in this setup. I've had a look around but haven't been able to find a tutorial for this.
Basically what I'm looking to do is have R (RStudio) running on my laptop, do most of the work there, but then when I have a big operation to run, explicitly pass it to an AWS slave instance to do all the heavy lifting.
As far as I can see, snow/snowfall packages seem to be the answer... but I'm not really sure how.
I'm using the tutorial on http://bioconductor.org/help/bioconductor-cloud-ami/ (the ssh one) to have R running. This tutorial does mention parallel/cluster computing, but it seems to be between different AWS instances.
Any help would be great. Cheers.
If you need only one slave instance, I've found it's easiest to just run everything in parallel on the instance rather than using your PC as a master.
You can write the script on your PC, push it up to a multicore server with R installed, and then run it there using all cores in parallel.
For example, upload this to a 4-core AWS instance:
library(snowfall)

# start 4 worker processes; worker output is written to log.txt
sfInit(parallel = TRUE, cpus = 4, slaveOutfile = "log.txt")

vars <- 1:100

# send global variables to all worker processes
sfExportAll()

# run this in parallel across the 4 cores
results <- sfLapply(vars, exp)

# stop parallel processing
sfStop()

# save results to disk
save(results, file = "results.RData")
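If you really do want to keep the master session on your laptop and only push the heavy lifting to EC2, as the question asks, snowfall can also start its workers on a remote host over SSH using a SOCK cluster. The sketch below is only an outline under some assumptions: the hostname and worker count are placeholders, R and the snow package must already be installed on the instance, and passwordless SSH to it must already work (e.g. your .pem key configured in ~/.ssh/config or ssh-agent).

library(snowfall)

# placeholder EC2 public hostname; replace with your own instance
host <- "ec2-12-34-56-78.compute-1.amazonaws.com"

# start 4 worker R processes on the remote instance over SSH
sfInit(parallel = TRUE, cpus = 4, type = "SOCK",
       socketHosts = rep(host, 4))

vars <- 1:100
sfExportAll()                   # copy global variables to the remote workers
results <- sfLapply(vars, exp)  # the heavy lifting now runs on the EC2 instance
sfStop()

Bear in mind that with a SOCK cluster the workers open a connection back to the master, so your laptop has to be reachable from the EC2 instance (or the connection tunnelled over SSH), which is one more reason why running everything directly on the instance, as above, is usually the simpler option.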