Python multithreading best practices
i just recently read an article about the GIL (Global Interpreter Lock) in python. Which seems to be some big issue when it comes to Python performance. So i was wondering myself what would be the best practice to archive more performance. Would it be threading or either multiprocessing? Because i hear everybody say something different, it would be nice to have one clear answer. Or at least to know the pros and contras of multithreading against multiprocessing.
Kind regards,
Dirk
It depends on the application, and on the python implementation that you are using.
In CPython (the reference implementation) and pypy the GIL only allows one thread at a time to execute Python bytecode. Other threads might be doing I/O or running extensions written in C.
It is worth noting that some other implementations like IronPython and JPython don't have a GIL.
A characteristic of threading is that all threads share the same interpreter and all the live objects. So threads can share global data almost without extra effort. You need to use locking to serialize access to data, though! Imagine what would happen if two threads would try to modifiy the same list.
Multiprocessing actually runs in different processes. That sidesteps the GIL, but if large amounts of data need to be shared between processes that data has to be pickled and transported to another process via IPC where it has to be unpickled again. The multiprocessing module can take care of the messy details for you, but it still adds overhead.
So if your program wants to run Python code in parallel but doesn't need to share huge amounts of data between instances (eg just filenames of files that need to be processed), multiprocessing is a good choice.
Currently multiprocessing is the only way that I'm aware of in the standard library to use all the cores of your CPU at the same time.
On the other hand if your tasks need to share a lot of data and most of the processing is done in extension or is I/O, threading would be a good choice.
链接地址: http://www.djcxy.com/p/55218.html上一篇: .exe中的Python子进程
下一篇: Python多线程最佳实践