20 November 2008

Python "global" weirdness

I'm sure there's a rational explanation for this behaviour, but at the moment it is slightly baffling.

class SomeObj:
 def __init__(self):
  self.prop = 1

class SomeClass:
 global objInst
 def __init__(self):
  self.reference = objInst
  self.reference.prop += 1
  
if __name__ == '__main__':
 global objInst
 objInst = SomeObj()
 clsInst = SomeClass()
 print "original:" + str(objInst.prop)
 print "clsInst:" + str(clsInst.reference.prop)

# output:
# >>> x:2
# >>> y:2

This is all good, and expected: you get a reference to a global object and manipulate it. But what happens if you try to rebind that global name straight in the __init__ method?

[..]
class SomeClass:
 global objInst 
 def __init__(self):  
  self.reference = objInst
  self.reference.prop += 1
  objInst = SomeObj()
[..]  
# output:
Traceback (most recent call last):
  File "test.py", line 16, in <module>
    clsInst = SomeClass()
  File "test.py", line 9, in __init__
    self.reference = objInst
UnboundLocalError: local variable 'objInst' referenced before assignment

And it gets even weirder! If you move the "global objInst" declaration inside the __init__ method, it works as expected: objInst is bound to a new object with a different state from the one in self.reference. But if you keep the declaration at class level and simply move the rebinding out into a new method (separate from __init__), you don't get an error, but Python does not bind the global to a new object.
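The two variants described above can be sketched roughly as follows. The class names GlobalInInit and GlobalAtClassLevel and the rebind method are made up for this illustration, and the print calls are written so that the snippet should run on both Python 2 and Python 3.

```python
class SomeObj(object):
    def __init__(self):
        self.prop = 1

objInst = SomeObj()

class GlobalInInit(object):
    def __init__(self):
        global objInst              # declared inside the function that rebinds
        self.reference = objInst
        self.reference.prop += 1
        objInst = SomeObj()         # rebinds the module-level name

class GlobalAtClassLevel(object):
    global objInst                  # declared in the class body only
    def __init__(self):
        self.reference = objInst    # plain global lookup, no assignment here
        self.reference.prop += 1
    def rebind(self):               # hypothetical helper holding the rebinding
        objInst = SomeObj()         # binds a local name; the global is untouched

a = GlobalInInit()
print(a.reference.prop)             # 2: the old object was incremented
print(objInst.prop)                 # 1: the global now names a fresh object
print(a.reference is objInst)       # False

before = objInst
b = GlobalAtClassLevel()
b.rebind()
print(objInst is before)            # True: no error, but no rebinding either
```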

I guess it's somehow all a matter of context, and it probably makes perfect sense to programming-language scientists; it just doesn't to me :)

2 comments:

Anonymous said...

Sigh. This is yet another Python quirk that really makes me appreciate how much more polished other scripting languages (*cough* Ruby *cough*) are.

I recently proposed a brief (2-hour) seminar on a scripting language for students on their second Fundamentals of Informatics course, and I couldn't really find a reason to pick Python instead of Ruby. The fact that the seminar will, in the end, be on C++ was not my choice. :-)

Ehm, anyway. Indeed, the Ruby equivalent of the same Python code you tested works as expected. Section 6.13 in the Python manual does not clarify anything about that weird global behavior (which, perhaps because I'm just an engineer and not a scientist, does not make any sense to me).

Finally, two nitpicking observations: please use at least 2 spaces for indentation, it's just clearer; also, in the first output text, you wrote an "x" and "y" that do not appear anywhere in the code.

Unknown said...

Man... months later, I finally found the reason for this "weird" behaviour! (Hat tip to "Learning Python, 2nd Edition", which I have just borrowed from the library: oldie but goldie, as the saying goes.)

So, the compiler is really to blame. It works like this: when a function/method (we all know that, in Python, methods are just functions in disguise in a different namespace) is compiled, as soon as an assignment to a name is encountered, that name gets defined as a local name to the scope of the function. When the function is executed, if that name is referenced before it is assigned a value, that UnboundLocalError gets raised.
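A small sketch of that rule, for anyone who wants to see it in isolation. The names counter, read_only and read_then_assign are invented for this example. It peeks at the compiled function's co_varnames (the tuple of local names on the code object, reachable as __code__ from Python 2.6 on) to show that the decision is made before the function ever runs.

```python
counter = 0

def read_only():
    return counter              # never assigned here, so looked up as a global

def read_then_assign():
    value = counter             # fails when called: counter is local but unbound
    counter = value + 1         # this assignment is what makes counter local
    return counter

# The compiler has already classified the names before any call happens.
print("counter" in read_only.__code__.co_varnames)         # False
print("counter" in read_then_assign.__code__.co_varnames)  # True

print(read_only())              # 0

try:
    read_then_assign()
    raised = False
except UnboundLocalError:
    raised = True
print(raised)                   # True
```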

In your last example, when SomeClass.__init__() is compiled, the compiler finds an assignment to objInst on the third line, and says "So, the name objInst represents a variable local to the scope of SomeClass.__init__()." Later, when SomeClass.__init__() is executed, the interpreter finds a reference to objInst on the first line, but it knows that objInst is a local variable (because the compiler has told it so) and it has not yet encountered an assignment/definition for it, thus raising the UnboundLocalError exception.

Now, this also clarifies why moving "global objInst" into SomeClass.__init__() solves the problem: it overrides the "any assigned name is local" rule. It also explains why moving the rebinding into another method does not raise any exception: 1) objInst is no longer resolved as a local variable, because you don't locally bind that name in SomeClass.__init__() anymore, so the interpreter searches for it in the global namespace, finding the class instance definition, and 2) in the new method where you moved the rebinding you don't use objInst before assignment anymore (I assume that the new method contains that line only), so no error has any reason to occur. The assignment in that new method just creates a local variable, because the class-level "global" declaration does not carry over into the methods; that is why the global is never rebound.
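To make the last point concrete, here is a rough sketch of what a class-level "global" does and does not do. Demo, marker and method are hypothetical names used only for this illustration. As far as I can tell, the declaration applies to the class body's own code block and nothing else.

```python
class Demo(object):
    global marker
    marker = "set in class body"    # binds the module-level name, not a class attribute

    def method(self):
        marker = "local value"      # the declaration above does not reach in here
        return marker

print(marker)                       # set in class body
print(hasattr(Demo, "marker"))      # False: nothing was stored on the class
print(Demo().method())              # local value
print(marker)                       # set in class body (unchanged by the method)
```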

Oh, and now that everything is clear, fuck that Ruby praise off. ;-)