Multithreading: What is the point of more threads than cores?

2018-06-12 21:50:09

I thought the point of a multi-core computer is that it could run multiple threads simultaneously. In that case, if you have a quad-core machine, what's the point of having more than 4 threads running at a time? Wouldn't they just be stealing time from each other?

Just because a thread exists doesn't always mean it's actively running. Many applications of threads involve some of the threads going to sleep until it's time for them to do something - for instance, user input triggering threads to wake up, do some processing, and go back to sleep.

Essentially, threads are individual tasks that can operate independently of one another, with no need to be aware of the progress of another task. It's quite possible to have more of these than you have ability to run simultaneously; they're still useful for convenience even if they sometimes have to wait in line behind one another.

The answer revolves around the purpose of threads, which is parallelism: to run several separate lines of execution at once. In an 'ideal' system, you would have one thread executing per core: no interruption. In reality this isn't the case. Even if you have four cores and four working threads, your process and it threads will constantly be being switched out for other processes and threads. If you are running any modern OS, every process has at least one thread, and many have more. All these processes are running at once. You probably have several hundred threads all running on your machine right now. You won't ever get a situation where a thread runs without having time 'stolen' from it. (Well, you might if it's running real-time, if you're using a realtime OS or, even on Windows, use a real-time thread priority. But it's rare.)

With that as background, the answer: Yes, more than four threads on a true four-core machine may give you a situation where they 'steal time from each other', but only if each individual thread needs 100% CPU . If a thread is not working 100% (as a UI thread might not be, or a thread doing a small amount of work or waiting on something else) then another thread being scheduled is actually a good situation.

It's actually more complicated than that:

What if you have five bits of work that all need to be done at once? It makes more sense to run them all at once, than to run four of them and then run the fifth later.

It's rare for a thread to genuinely need 100% CPU. The moment it uses disk or network I/O, for example, it may be potentially spend time waiting doing nothing useful. This is a very common situation.

If you have work that needs to be run, one common mechanism is to use a threadpool. It might seem to make sense to have the same number of threads as cores, yet the .Net threadpool has up to 250 threads available per processor. I'm not certain why they do this, but my guess is to do with the size of the tasks that are given to run on the threads.

So: stealing time isn't a bad thing (and isn't really theft, either: it's how the system is supposed to work.) Write your multithreaded programs based on the kind of work the threads will do, which may not be CPU-bound. Figure out the number of threads you need based on profiling and measurement. You may find it more useful to think in terms of tasks or jobs, rather than threads: write objects of work and give them to a pool to be run. Finally, unless your program is truly performance-critical, don't worry too much :)

The point is that, despite not getting any real speedup when thread count exceeds core count, you can use threads to disentangle pieces of logic that should not have to be interdependent.

In even a moderately complex application, using a single thread try to do everything quickly makes hash of the 'flow' of your code. The single thread spends most of its time polling this, checking on that, conditionally calling routines as needed, and it becomes hard to see anything but a morass of minutiae.

Contrast this with the case where you can dedicate threads to tasks so that, looking at any individual thread, you can see what that thread is doing. For instance, one thread might block waiting on input from a socket, parse the stream into messages, filter messages, and when a valid message comes along, pass it off to some other worker thread. The worker thread can work on inputs from a number of other sources. The code for each of these will exhibit a clean, purposeful flow, without having to make explicit checks that there isn't something else to do.

Partitioning the work this way allows your application to rely on the operating system to schedule what to do next with the cpu, so you don't have to make explicit conditional checks everywhere in your application about what might block and what's ready to process.

链接地址: http://www.djcxy.com/p/36868.html

上一篇: 什么是“线程”（真的）？

下一篇: 多线程：与内核相比，多线程的意义何在？