python + wsgi on a multi
Suppose that I've written a WSGI application. I run this application on Apache2 on Linux with a multi-threaded mod_wsgi configuration, so that my application runs in several threads per process:
WSGIDaemonProcess mysite processes=3 threads=2 display-name=mod_wsgi
WSGIProcessGroup mysite
WSGIScriptAlias / /some/path/wsgi.py
The application code is:
def application(environ, start_response):
    from foo import racer
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [racer()]  # call to racer creates a race condition?
module foo.py:
a = 1

def racer():
    global a
    a = a + 1
    return str(a)
Did I just create a race condition with variable a? I guess a is a module-level variable that exists in foo.py and is the same (shared) among threads?
More theoretical questions derived from this:
Is a a shared variable, so that my example is not thread-safe?
Under Apache, is each thread of my application on Linux created at the C level with the pthreads API, and is the function that the pthread must execute some kind of Python interpreter main function? Or does Apache protect me somehow from this error?
What about Tornado's HTTPServer? A web server written in Python implements threads as Python-level threading.Thread objects and runs the application function in each thread. So, I suppose it's a race condition? (I also suppose that in this case I can abstract from the underlying C-level pthreads below the threading.Thread implementation and worry only about Python functions, because the interpreter won't allow me to modify C-level shared data and break its functioning. So the only way for me to break thread-safety is to deal with global variables? Is that right?)

Yes, you have a race condition there, but it's not related to the imports. The global state in foo.a is subject to a data race between evaluating a + 1 and assigning a = ..., since two threads can see the same value for a and thus compute the same successor.
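The lost update can be made reproducible with a small sketch. The barrier here is purely an illustrative device, not part of the original code: it forces both threads to read a before either writes it back, which is the unlucky interleaving that can occur by chance under mod_wsgi. The names racy_increment and demo_lost_update are hypothetical.

```python
import threading

a = 1
# Illustrative only: the barrier forces the unlucky interleaving every time.
both_have_read = threading.Barrier(2)


def racy_increment(results, index):
    """Same read-modify-write as racer(), with the two steps pulled apart."""
    global a
    seen = a                # step 1: read the shared value
    both_have_read.wait()   # both threads now hold the same 'seen'
    a = seen + 1            # step 2: write back, clobbering the other thread
    results[index] = str(seen + 1)


def demo_lost_update():
    """Run two increments concurrently and report what happened."""
    global a
    a = 1                   # start from the same state as foo.py
    results = [None, None]
    threads = [
        threading.Thread(target=racy_increment, args=(results, i))
        for i in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a, results
```

Two increments starting from 1 should leave a at 3 and return two different strings; with this interleaving, demo_lost_update() returns (2, ['2', '2']) instead, because one update is lost.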
The import machinery itself does protect against duplicate imports by multiple threads, by means of a process-wide lock (see imp.lock_held(); note that the imp module is deprecated since Python 3.4 and was removed in Python 3.12). Although this could, in theory, lead to a deadlock, that almost never happens, because few Python modules lock other resources at import time.
This also suggests that it's probably safe to modify sys.path at will: this usually happens only at import time (for the purpose of additional imports), so the modifying thread already holds the import lock, and other threads cannot trigger imports that would also modify that state.
Fixing the race in racer() is quite easy, though:
import threading

a = 1
a_lock = threading.Lock()

def racer():
    global a
    with a_lock:
        my_a = a = a + 1
    return str(my_a)
The same kind of locking will be needed for any global, mutable state under your control.
Read the mod_wsgi documentation on the various process/thread configurations, especially what it says about data sharing.
In particular, it says:
Where global data in a module local to a child process is still used, for example as a cache, access to and modification of the global data must be protected by local thread locking mechanisms.
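As a minimal sketch of what that advice could look like for a module-level cache (the names _cache, _cache_lock and get_cached are hypothetical, not taken from mod_wsgi):

```python
import threading

_cache = {}
_cache_lock = threading.Lock()


def get_cached(key, compute):
    """Return the cached value for key, computing it at most once per process.

    'compute' is a caller-supplied function; it runs while the lock is held,
    which keeps the sketch simple but serializes slow computations.
    """
    with _cache_lock:
        if key not in _cache:
            _cache[key] = compute(key)
        return _cache[key]
```

With processes=3 as in the configuration from the question, each daemon process has its own copy of _cache; the lock only coordinates the threads inside one process, so data that must be consistent across processes needs to live outside the module (for example in a database).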