How shall I understand this deep binding example in Python?
From Programming Language Pragmatics by Scott,
Figure 3.14 Deep binding in Python. At right is a conceptual view of the run-time stack. Referencing environments captured in closures are shown as dashed boxes and arrows. When B is called via formal parameter P, two instances of I exist. Because the closure for P was created in the initial invocation of A, B's static link (solid arrow) points to the frame of that earlier invocation. B uses that invocation's instance of I in its print statement, and the output is a 1.
The catch is that a running program may have more than one instance of an object that is declared within a recursive subroutine. A closure in a language with static scoping captures the current instance of every object, at the time the closure is created. When the closure's subroutine is called, it will find these captured instances, even if newer instances have subsequently been created by recursive calls.
So basically, the quote tries to explain that the following program (which is the same as the one in the screenshot) prints out 1:
def A(I, P):
    def B():
        print(I)
    # body of A:
    if I > 1:
        P()
    else:
        A(2, B)

def C():
    pass # do nothing

A(1, C) # main program
I don't quite understand the reasoning: "because the closure for P was created in the initial invocation of A, B's static link (solid arrow) points to the frame of that earlier invocation", and "When the closure's subroutine is called, it will find these captured instances". So I modified the example, and the new example prints out 2 instead of 1:
def A(I, P):
    def B():
        print(I)
    if I > 2:
        P()
    elif I > 1:
        A(3, B)
    else:
        A(2, B)

def C():
    pass

A(1, C)
Another modified example prints 1:
def A(I, P):
    def B():
        print(I)
    if I > 2:
        P()
    elif I > 1:
        A(3, P)
    else:
        A(2, B)

def C():
    pass

A(1, C)
So how can I know which closure matters?
Generally, is a closure created whenever a function is passed as an argument to another function?
Thanks.
This has become too long for a comment, hence I am adding it as an answer.
I should say that I am answering this from the perspective of someone who learned about these ideas in other languages: I mostly write Python now, but it's possible my terminology is 'wrong' (for which read 'right, but latter-day languages like Python got it wrong'...). In particular I have intentionally skated over a bunch of Python-specific detail, and avoided dealing with things like the mutability of bindings and the Python 3 nonlocal hack.
I also think the book's usage of the term 'deep binding' is confusing and perhaps wrong: see the end for a note on this. I have therefore largely ignored it.
Bindings, scope and extent
There are three important concepts:

- a binding is an association between a name and the thing it refers to;
- the scope of a binding is the region of the program over which it is visible;
- the extent of a binding is how long the binding lasts.

Various kinds of things get bound in Python (the names bound by handlers in try: ... except: ... constructs, for instance), but I'll only talk about variable bindings. There are several options for scope and extent. For variable bindings, Python has:

- lexical scope: a reference to a variable refers to a binding which is visible at that point in the source code;
- indefinite extent: a binding lasts for as long as anything can still refer to it.
(Another common scope is dynamic: a binding with dynamic scope is visible to any code for which it is visible in the source code and any code that is 'down the stack' from that code. Another common extent is definite, which means that a binding goes away the moment control leaves the construct which established it. Bindings for exception handlers have dynamic scope and definite extent in Python, for instance.)
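As a small illustration of that aside (my own sketch, not from the book): which except clause handles an exception depends on the call stack at the moment the exception is raised, not on where the raising code sits in the source, and the handler stops applying as soon as control leaves the try construct.

def raiser():
    # raiser's body is not lexically inside any try block
    raise ValueError("boom")

def caller():
    try:
        raiser()                    # but a handler is active 'up the stack'
    except ValueError:
        return "handled in caller"

print(caller())                     # -> handled in caller

try:
    raiser()
except ValueError:
    print("handled at top level")   # a different handler applies here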
What lexical scope means is that you can (almost) tell by reading the source what bindings a bit of code is referring to.
So consider this:
def outer():
    a = 2
    def inner(b):
        return a + b
    return inner(2)
There are two bindings here: a is bound in outer and b is bound in inner (actually, there are three: inner is also bound, to a function, in outer). Each of these two bindings is referenced once: in inner (and the inner binding is referenced once, in outer).

And the important thing is that, by reading the code, you can tell what the reference to a in inner is to: it's to the binding established in outer. That's what 'lexical' means: you (and the compiler, if it is smart enough) can tell what bindings exist by looking at the source.
This is only almost true. Consider this fragment:
def maybe(x):
    return x + y
There is one binding created, in maybe, but two references: x refers to that binding, while y refers to a binding which is not known to exist. But it may exist: there may be a top-level binding of y which would make this code work. So there is a special caveat around lexical bindings: there is a 'top-level' environment which all definitions can 'see' and which can contain bindings. So if the fragment above were enlarged to read

y = 4

def maybe(x):
    return x + y

then this code is fine, because maybe can 'see' the top-level environment (really, in Python, this is the bindings in the module in which it's defined).
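As a quick check of that (my addition, not part of the original fragment): with the module-level binding of y in place the free reference resolves, and removing it makes the same call fail.

print(maybe(2))        # -> 6: x is the parameter binding, y is the module-level one

del y                  # remove the module-level binding again
try:
    maybe(2)
except NameError as e:
    print(e)           # -> name 'y' is not defined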
Indefinite extent & first-class functions
For the above examples, the results would be the same with definite or indefinite extent. This stops being true if you consider first-class functions, which Python has. It stops being the case because functions are objects which, by being called, can reference bindings. Consider this:
def outer(x):
    def inner(y):
        return x + y
    return inner
There are three bindings here: outer binds x and inner, and inner binds y (and can see x and inner). So, now, let add4 = outer(4): what should add4(3) return (or, equivalently, what should outer(4)(3) return)? Well, the answer is 7. And this is the case because the binding of x exists for as long as it can be referenced, or in other words, it exists for as long as any instances of inner exist, because they reference it. This is what 'indefinite extent' means.

(If Python had only definite extent, then outer(4)(3) would be some kind of error, since it would reference a binding which no longer exists. Languages with only definite extent can't really have first-class functions in any useful way.)
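Spelled out as code (my addition, using the outer defined just above):

add4 = outer(4)        # the binding of x to 4 outlives this call to outer
print(add4(3))         # -> 7
print(outer(4)(3))     # -> 7: a fresh binding of x, captured by a fresh inner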
Something that's important to understand is that lexical scope tells you which bindings are visible, but the actual instances of those bindings which are visible are, of course, dynamic. So, if you consider something like this:
def adder(n):
    return lambda e: e + n

a1 = adder(12)
a2 = adder(15)
then a1 and a2 reference different bindings of n: a1(0) is 12 while a2(0) is 15. So by reading the source code you can tell which bindings are captured, but you need to run it to know which instances of them are captured -- what the values of the variables are, in other words.
Compare that with this:
def signaller(initial):
    s = [initial]
    def setter(new):
        s[0] = new
        return new
    def getter():
        return s[0]
    return (setter, getter)

(str, gtr) = signaller(0)
Now, str and gtr capture the same binding of s, so str(1) will cause gtr() to return 1.
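Run it and you can see the shared binding in action (my addition; note that the name str here shadows the built-in str, which is harmless for this demonstration but a poor choice of name):

print(gtr())     # -> 0: the initial value stored in the shared list
print(str(1))    # -> 1: setter writes through the shared binding of s
print(gtr())     # -> 1: getter sees the change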
Closures
So that's actually all there is to know. Except that there is some special terminology which people use, in particular the term 'closure'.
A closure is simply a function which refers to some lexical bindings outside its own definition. Such a function is said to 'close over' these bindings.
It is a fair question to ask why this term is needed at all: all you actually need to understand is the scope rules, from which everything else follows. I think the reason the term exists is partly historical and partly pragmatic.
The example code
The example code was
def A(I, P):
    def B():
        print(I)
    # body of A:
    if I > 1:
        P()
    else:
        A(2, B)

def C():
    pass # do nothing

A(1, C) # main program
So, when A is called, there is a local function, B, which can see the binding of I (and also P & B itself, but it does not refer to these, so we can ignore them). Each call to A creates new bindings for I, P & B, and these bindings are different for each call. This includes recursive calls, which is the trick that is being done here to confuse you.
So, what does A(1, C) do?

1. It binds I to 1, and B to a closure which can see this binding of I. It also binds P to the global (module) value of C, which is a function, but nothing refers to this binding.
2. It then calls A recursively (because I is 1) with arguments of 2 & the value of B, which is the closure that just got created.
3. In this recursive call, I is now bound to 2, and P is bound to the closure from the outer call. A new closure is created and bound to B. Nothing refers to this binding.
4. P is then called (I is now greater than 1). It is the closure created in the outer call, and the binding it can see for I is the binding visible to it, whose value is 1.
5. So it prints 1, and we're done.

You can see what is happening by changing the definition to print some useful information:
from __future__ import print_function # Python 2

def A(I, P):
    def B():
        print(I)
    print("{I} {P} {B}".format(I=I, P=P, B=B))
    if I > 1:
        print("calling {P}".format(P=P))
        P()
    else:
        print("recursing with (2, {B})".format(B=B))
        A(2, B)

def C():
    pass

A(1, C)
This prints (for example):
1 <function C at 0x7f7a03768e60> <function B at 0x7f7a03768d70>
recursing with (2, <function B at 0x7f7a03768d70>)
2 <function B at 0x7f7a03768d70> <function B at 0x7f7a03651a28>
calling <function B at 0x7f7a03768d70>
1
Note that, in the inner call, there are two functions which identify themselves as B: one of them is the same as the function created in the outer call (which will be called), while the other one is the just-created closure, which is never referenced again.
Deep binding
I think the book's usage of the term 'deep binding' is at best confusing and in fact probably outright wrong: it is possible that this term has changed meaning, but it certainly did not originally mean what the book thinks it means.
The terms 'deep binding' and 'shallow binding' refer to implementation strategies for languages with dynamic scope. In a system with dynamic scope, a 'free' variable reference (one which is not bound by a particular bit of code) is looked up by searching, dynamically, up the call stack until a binding for it is found. So in a language with dynamic binding you can't tell by looking at a bit of code what bindings it can see (and neither can the compiler!), because the bindings it can see depend on what the call stack looks like at the moment it is run.
Dynamic binding is great for things like exception handlers but usually terrible for most variable bindings.
One reason it's terrible is that a naïve implementation technique makes it inherently slow, and a clever implementation technique needs to be very clever to work in a system with more than one thread of control.
Deep binding is the naïve implementation technique. In deep binding, variable access works the way you think: the system searches up the stack until it finds the binding for the name it's looking for, and then uses that. If the stack is deep and the binding is a long way up it, this is slow.
Shallow binding is the clever implementation technique. This works by, instead of storing the current binding in the stack, storing the previous value in the stack, and smashing the current value into a slot associated with the variable name which is always in the same place. So now looking up a binding just involves looking up the name: there is no search. But creating and destroying bindings may be slower since the old values need to be stashed away. Additionally, shallow binding is not obviously safe in the presence of multiple threads of control: if all the threads share the slot for the binding then catastrophe will follow. So instead each thread needs to have its own slot, or slots need to be indexed by thread as well as name somehow.
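To make the difference concrete, here is a toy sketch of my own (not from the book, and not Baker's implementation) of a dynamically-scoped variable store implemented both ways; the deep_* and shallow_* names are purely illustrative.

# Deep binding: bindings live on a stack of (name, value) pairs, and a
# lookup searches from the top down until it finds the name.
env_stack = []

def deep_bind(name, value):
    env_stack.append((name, value))

def deep_unbind():
    env_stack.pop()

def deep_lookup(name):
    for n, v in reversed(env_stack):   # this search is what makes it slow
        if n == name:
            return v
    raise NameError(name)

# Shallow binding: the *current* value of each name sits in a fixed slot,
# and what goes on the stack is the value being shadowed, so a lookup is a
# single dictionary access with no search.
_unbound = object()                    # sentinel for 'no previous binding'
slots = {}
shadow_stack = []

def shallow_bind(name, value):
    shadow_stack.append((name, slots.get(name, _unbound)))
    slots[name] = value

def shallow_unbind():
    name, old = shadow_stack.pop()
    if old is _unbound:
        del slots[name]
    else:
        slots[name] = old

def shallow_lookup(name):
    return slots[name]

# Both behave the same way; only the cost model differs.
deep_bind("x", 1); deep_bind("x", 2)
print(deep_lookup("x"))                # -> 2
deep_unbind()
print(deep_lookup("x"))                # -> 1

shallow_bind("x", 1); shallow_bind("x", 2)
print(shallow_lookup("x"))             # -> 2
shallow_unbind()
print(shallow_lookup("x"))             # -> 1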
I suspect that, for cases where dynamic scope is used such as exception handlers, systems use deep binding because it is much simpler to get right in multithreaded systems and performance is not critical.
Here is the classic early paper on deep & shallow binding, by Henry Baker.