Monday, September 14, 2009

Wanda's Wisdom

You may be gone tomorrow, but that doesn't mean that you weren't here today.

Monday, September 7, 2009

Break the Cycle: Local Class Definitions

When analyzing the memory behavior of your Python application, you will sooner or later stumble across reference cycles. It is generally a good idea to avoid creating reference cycles, though not every cycle is worth breaking: using weak methods, for example, avoided some reference cycles but increased the invocation cost.
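To make that trade-off concrete, here is a minimal sketch using only the standard library. I am assuming `weakref.WeakMethod` (Python 3.4 and later) as a stand-in for weak methods; the names `Button`, `WeakButton` and `count_garbage` are made up for illustration.

```python
import gc
import weakref


class Button(object):
    def __init__(self):
        # Storing a bound method on the instance creates a cycle:
        # instance -> attribute -> bound method -> instance.
        self.callback = self.on_click

    def on_click(self):
        return "clicked"


class WeakButton(object):
    def __init__(self):
        # WeakMethod only holds weak references, so no cycle is created.
        self._callback = weakref.WeakMethod(self.on_click)

    def on_click(self):
        return "clicked"

    def click(self):
        # Extra work on every call: the bound method is rebuilt first.
        method = self._callback()
        return method() if method is not None else None


def count_garbage(factory):
    """Create and drop one object, then report what only the GC could free."""
    gc.collect()   # start from a clean slate
    gc.disable()   # keep the automatic collector out of the way
    try:
        obj = factory()
        del obj
        return gc.collect()
    finally:
        gc.enable()


strong_garbage = count_garbage(Button)
weak_garbage = count_garbage(WeakButton)
print(strong_garbage, weak_garbage)
```

The strong variant leaves objects behind for the cycle collector, while the weak variant is freed by reference counting alone. The price is the extra dereferencing step in `click()`.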

Debugging reference cycles can be simplified by creating a graphical representation of the reference graph, e.g. using graphviz. Marius Gedminas provides a set of tools to facilitate building graphs on his homepage. Similar facilities exist in Pympler. The latter has improved considerably since the official 0.1 release, so be sure to grab the version from the svn trunk.
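If you just want a rough idea of what such a graph contains, the following sketch builds graphviz DOT text with nothing but the `gc` module. It is not how the tools above work internally; `garbage_to_dot` and `Node` are hypothetical names chosen for this example.

```python
import gc


def garbage_to_dot(objects):
    """Return DOT text with one node per object and one edge per reference."""
    ids = set(id(obj) for obj in objects)
    lines = ["digraph garbage {"]
    for obj in objects:
        lines.append('  n%d [label="%s"];' % (id(obj), type(obj).__name__))
        for referent in gc.get_referents(obj):
            # Only draw edges that stay inside the collected garbage.
            if id(referent) in ids:
                lines.append("  n%d -> n%d;" % (id(obj), id(referent)))
    lines.append("}")
    return "\n".join(lines)


class Node(object):
    pass


gc.collect()
gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage

a, b = Node(), Node()
a.other, b.other = b, a         # a <-> b form a reference cycle
del a, b
gc.collect()

dot_text = garbage_to_dot(list(gc.garbage))

gc.set_debug(0)                 # restore normal behavior
del gc.garbage[:]
print(dot_text)
```

The resulting text can be saved to a file and rendered with the graphviz `dot` command line tool.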

Once you know which objects are involved in cyclic dependencies, you will want to know why the cycles occurred in the first place, which is not always trivial to figure out. While working on the integration of Bottle (which is really great BTW) in Pympler, I stumbled across an interesting case:

import gc
gc.disable()

def f():
    class Foo(object):
        pass
f()

from pympler.gui.garbage import GarbageGraph
GarbageGraph(reduce=True).render('cycle1.png', format='png')


This snippet creates a reference cycle, which GarbageGraph renders to cycle1.png.



Apparently, defining a class in the local scope of a function or method creates a reference cycle. Lifting the class definition to the module level avoids the reference cycle. It is even more interesting that class objects create reference cycles by design when they go out of scope:

>>> import gc
>>> gc.disable()
>>> class Foo(object):
...     pass
...
>>> del Foo
>>> gc.collect()
6


So what? Well, it is evidently beneficial to define classes in modules or other classes, and not in functions or methods.
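You can check this without Pympler. The sketch below counts what the garbage collector has to clean up after calling a function that uses a module-level class and one that defines its class locally. The names `ModuleLevel`, `use_module_level`, `define_local` and `garbage_after` are invented for this example, and the exact number of collected objects depends on the Python version.

```python
import gc


class ModuleLevel(object):
    pass


def use_module_level():
    # The class lives in the module, only the instance is temporary.
    return ModuleLevel()


def define_local():
    # A new class object is created (and dropped) on every call.
    class Local(object):
        pass
    return Local()


def garbage_after(func):
    """Call func once and report how many objects only the GC could free."""
    gc.collect()   # start from a clean slate
    gc.disable()   # no automatic collections while measuring
    try:
        func()
        return gc.collect()
    finally:
        gc.enable()


module_level_garbage = garbage_after(use_module_level)
local_garbage = garbage_after(define_local)
print(module_level_garbage, local_garbage)
```

The first number should be zero, since the instance is freed by reference counting. The second is greater than zero because the local class object refers back to itself and has to wait for the cycle collector.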

Tuesday, September 1, 2009

Substitute assert statements with unittest methods using vim

In the Python community, it's not generally agreed upon whether to use the assert statement or the assert* methods from the unittest module. As some commentators pointed out in a recent discussion, there are (a few) good reasons to prefer the assertion methods, e.g. better error messages.
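The difference in error messages is easy to see in a small sketch. It assumes Python 3.2 or later, where `unittest.TestCase()` can be instantiated without a test method name; the variable names are just for illustration.

```python
import unittest

case = unittest.TestCase()

# Failure raised by the unittest method: the message shows both values.
try:
    case.assertEqual(1 + 1, 3)
except AssertionError as exc:
    method_message = str(exc)

# Failure raised by the plain statement: no message unless you write one.
try:
    assert 1 + 1 == 3
except AssertionError as exc:
    statement_message = str(exc)

print(repr(method_message))
print(repr(statement_message))
```

With the method, the failure tells you which values were compared. With the bare statement, you only learn that the assertion failed.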

Here are some vim substitution commands that make the transition from assert statements to the appropriate methods easier:

:%s/assert \(.\+\) == \(.\+\)/self.assertEqual(\1, \2)/gc
:%s/assert \(.\+\) != \(.\+\)/self.assertNotEqual(\1, \2)/gc
:%s/assert \(.\+\)/self.assertTrue(\1)/gc
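If you prefer a script over an interactive vim session, the same rules can be sketched with Python's `re` module. This is a rough equivalent, not a drop-in replacement: `convert_line` is a made-up helper, it does not ask for confirmation the way the `c` flag does, and the last rule emits `assertTrue` rather than the older `assert_` alias, which newer Python versions no longer provide.

```python
import re

# Same order as the vim commands: specific comparisons first,
# the catch-all rule last.
RULES = [
    (re.compile(r"assert (.+) == (.+)"), r"self.assertEqual(\1, \2)"),
    (re.compile(r"assert (.+) != (.+)"), r"self.assertNotEqual(\1, \2)"),
    (re.compile(r"assert (.+)"), r"self.assertTrue(\1)"),
]


def convert_line(line):
    """Rewrite one assert statement; leave all other lines untouched."""
    for pattern, replacement in RULES:
        new_line, count = pattern.subn(replacement, line)
        if count:
            return new_line
    return line


source = [
    "    assert total == 3",
    "    assert name != 'x'",
    "    assert ready",
    "    total = 3",
]
converted = [convert_line(line) for line in source]
print("\n".join(converted))
```

As with the vim commands, the patterns are greedy and purely textual, so review the result before committing it. Asserts with a message argument or expressions containing several comparisons will need manual attention.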