Monday, September 7, 2009

Break the Cycle: Local Class Definitions

In the process of analyzing the memory behavior of your Python application, you will sooner or later stumble across reference cycles. It is generally a good idea to avoid creating reference cycles, though not every cycle is worth breaking: using weak methods, for example, avoided some reference cycles but increased the invocation cost.
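To make that trade-off concrete, here is a minimal sketch using `weakref.WeakMethod` from the standard library (Python 3.4+). The `Button` class and the `freed_without_gc` helper are hypothetical names made up for this illustration, and the behavior shown relies on CPython's reference counting.

```python
import gc
import weakref


class Button:
    """Hypothetical widget that stores one of its own methods as a callback."""

    def __init__(self, use_weak):
        if use_weak:
            # WeakMethod does not keep `self` alive, so no cycle is formed.
            self._callback = weakref.WeakMethod(self.on_click)
        else:
            # The bound method references `self` and `self` references the
            # bound method: a reference cycle.
            self._callback = self.on_click

    def on_click(self):
        return "clicked"

    def click(self):
        if isinstance(self._callback, weakref.WeakMethod):
            method = self._callback()  # extra dereference on every call
            return method() if method is not None else None
        return self._callback()


def freed_without_gc(use_weak):
    """Return True if a Button is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        button = Button(use_weak)
        probe = weakref.ref(button)
        del button
        return probe() is None
    finally:
        gc.collect()  # clean up the cycle left by the strong variant
        if was_enabled:
            gc.enable()
```

The weak variant is freed as soon as the last name is deleted, while the strong variant has to wait for the cycle collector. The price is the additional dereference in `click()`, which is the invocation cost mentioned above.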

Debugging reference cycles can be simplified by drawing the reference graph, e.g. with graphviz. Marius Gedminas provides a set of tools for building such graphs on his homepage. Similar facilities exist in Pympler. The latter has improved considerably since the official 0.1 release, so be sure to grab the version from the svn trunk.
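If you just want to see the idea without installing anything, the following sketch builds graphviz DOT text with nothing but the standard `gc` module. It is not how the tools mentioned above are implemented; `referents_to_dot` is a hypothetical helper written for this post.

```python
import gc


def referents_to_dot(objects):
    """Return Graphviz DOT text for the references among `objects`."""
    ids = {id(obj) for obj in objects}
    lines = ["digraph refs {"]
    # One node per object, labelled with its type name.
    for obj in objects:
        lines.append('  n%d [label="%s"];' % (id(obj), type(obj).__name__))
    # One edge per direct reference that stays inside the given set.
    for obj in objects:
        for target in gc.get_referents(obj):
            if id(target) in ids:
                lines.append("  n%d -> n%d;" % (id(obj), id(target)))
    lines.append("}")
    return "\n".join(lines)


# Two lists that reference each other form a very small cycle.
a = []
b = [a]
a.append(b)
dot = referents_to_dot([a, b])
```

Write `dot` to a file such as `refs.dot` and render it with `dot -Tpng refs.dot -o refs.png` to get a picture of the cycle.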

Once you know which objects are involved in a cycle, you will want to know why the cycle was created in the first place, which is not always trivial to figure out. While working on the integration of Bottle (which is really great BTW) in Pympler, I stumbled across an interesting case:

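import gc

def f():
    # A class defined in the local scope of a function.
    class Foo(object):
        pass

gc.disable()  # keep the collector from running before the graph is built
f()           # Foo goes out of scope when f returns

from pympler.gui.garbage import GarbageGraph
GarbageGraph(reduce=True).render('cycle1.png', format='png')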

This snippet creates a reference cycle, which the graph rendered to cycle1.png shows.

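Apparently, defining a class in the local scope of a function or method creates a reference cycle. Lifting the class definition to the module level avoids the problem, because the class is then created only once and stays reachable for the lifetime of the module. More interesting still, class objects contain reference cycles by design (a class is, for instance, the first entry of its own __mro__ tuple), so they become cyclic garbage when they go out of scope: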

>>> import gc
>>> gc.disable()
>>> class Foo(object):
...     pass
...
>>> del Foo
>>> gc.collect()  # returns a non-zero count: the class was cyclic garbage
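The same experiment can be written as a script that also shows where one of the self-references comes from. This is a sketch that assumes CPython; the variable names `mro_cycle`, `alive_after_del` and `freed_after_collect` are only there for illustration.

```python
import gc
import weakref


class Foo(object):
    pass


# A class is the first entry of its own method resolution order, so the
# class references a tuple that references the class again.
mro_cycle = Foo.__mro__[0] is Foo

gc.disable()
try:
    probe = weakref.ref(Foo)
    del Foo
    # Reference counting alone cannot free the class ...
    alive_after_del = probe() is not None
    gc.collect()
    # ... but the cycle collector can.
    freed_after_collect = probe() is None
finally:
    gc.enable()
```

All three flags end up true: the class is part of a cycle, it survives `del`, and it only disappears once the collector has run.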

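So what? Every call to a function that defines a local class creates a new class object, which can only be reclaimed by the cycle collector. It is therefore beneficial to define classes at module level or inside other classes, and not in functions or methods.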
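A small sketch to compare the two styles, again assuming CPython. `ModuleLevel`, `use_module_level`, `use_local` and `garbage_after` are hypothetical names for this example; `gc.collect()` returns the number of unreachable objects it found.

```python
import gc


class ModuleLevel(object):
    """Created once at import time and kept alive by the module."""


def use_module_level():
    return ModuleLevel()


def use_local():
    class Local(object):  # a new class object is built on every call
        pass
    return Local()


def garbage_after(func, calls=10):
    """Count unreachable objects the collector finds after calling `func`."""
    gc.collect()  # start from a clean slate
    gc.disable()
    try:
        for _ in range(calls):
            func()
        return gc.collect()
    finally:
        gc.enable()


local_garbage = garbage_after(use_local)
module_garbage = garbage_after(use_module_level)
```

The local variant leaves cyclic garbage behind on every call, so `local_garbage` is greater than `module_garbage`. The exact numbers depend on the Python version.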
