Process for CPU-intensive task?
So I'm starting to use node.js for a project I'm doing.
When a client makes a request, my Node.js server fetches JSON from another server and reformats it into a new JSON that gets served to the client. However, the JSON the Node server gets from the other server can potentially be pretty big, so that "massaging" of the data is pretty CPU-intensive.
I've been reading for the past few hours about how Node.js isn't great for CPU-bound tasks, and the main suggestion I've seen is to spawn a child process (basically a .js file running in a separate instance of Node) that handles any CPU-intensive work that might block the main event loop.
So let's say I have 20,000 concurrent users: that would mean spawning 20,000 OS-level processes as it runs these child processes.
Does this sound like a good idea? (By comparison, a different web server would just create 20,000 threads within the same process.)
I'm not sure whether I should be running a child process at all, but I do need the CPU-intensive work to be non-blocking. Any ideas of what I should do?
The V8 Javascript Engine that powers Node is actually pretty fast compared to many server-side languages.
The issue is that Node's evented model is very similar to cooperative multitasking: a given request's operations keep running until they cede control back to the JavaScript event loop, so a CPU-heavy task blocks the loop. In practice that means a random subset of users gets perfect performance while another subset gets timeouts, instead of performance degrading gracefully with load.
So, for CPU-intensive tasks, one solution is to call process.nextTick between significant chunks of processing. This reduces the average latency (while increasing the absolute minimum) by being more "cooperative" and not letting any one request hog the CPU for a long time. (People who claim Node can't handle CPU-intensive work simply haven't architected for it.)
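A minimal sketch of that chunking idea, where the segments array and doProcess function are placeholders. setImmediate is used here instead of process.nextTick because recursively queued nextTick callbacks run before pending I/O on current Node versions and can still starve other requests:

```javascript
// Process one segment per event-loop turn, yielding in between so other
// requests can be served while the heavy work is in progress.
function processInChunks(segments, doProcess, done) {
  let i = 0;
  function next() {
    if (i >= segments.length) return done();
    doProcess(segments[i++]); // one CPU-heavy slice per turn
    setImmediate(next);       // yield back to the event loop
  }
  next();
}
```

Each call to next handles exactly one segment, so a long job is interleaved with every other callback waiting on the loop.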
NodeJS is exactly what it says: it is a node, and should be treated as such.
In your example, your Node instance connects to an external API, grabs JSON to process, and sends the result back.
i.e.
1. GET http://server.com/getJSON
2. Process the JSON
3. POST http://server.com/postJSON
So what do you do? Ask yourself: is time an issue? If so, then Node isn't the solution. However, if you are more interested in raw throughput than per-request latency — instead of 1 request done in 4 seconds,
you want 200 requests finishing in 10 seconds, even if each individual one takes about the full 10 seconds — then Node fits.
Diagnose how long your JSON massage takes. If it's less than a second, just run 4 Node instances instead of 1.
However, if it's more complex than that, break the JSON into segments and use asynchronous callbacks to process each segment:

    process.nextTick(function () {
      doProcess(segment1);
      process.nextTick(function () {
        doProcess(segment2);
        // ...each doProcess schedules the next via process.nextTick
      });
    });

Node.js will then trade time between requests.
Now take that solution, scale it to 4 Node instances per server and 2-5 servers,
and suddenly you have an extremely scalable and cost-effective solution.