Friday, April 16, 2010
SQL for removing invalid foreign keys and typecasting in Postgres
DELETE FROM session_data WHERE session_data.name = 'user_id'
AND NOT EXISTS
(SELECT * FROM users WHERE CAST(session_data.value AS integer) = users.id);
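To try the statement without a Postgres server, here is a small sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in, and the table layout (users.id, session_data.name, session_data.value) is an assumption inferred from the query. One difference worth knowing: Postgres raises an error when the cast hits a non-numeric string, whereas SQLite silently yields 0.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE session_data (name TEXT, value TEXT);
    INSERT INTO users (id) VALUES (1), (2);
    INSERT INTO session_data (name, value) VALUES
        ('user_id', '1'), ('user_id', '99'), ('theme', 'dark');
""")

# Same statement as above: drop user_id entries that point at no existing user.
conn.execute("""
    DELETE FROM session_data WHERE session_data.name = 'user_id'
    AND NOT EXISTS
    (SELECT * FROM users WHERE CAST(session_data.value AS integer) = users.id)
""")

remaining = conn.execute(
    "SELECT name, value FROM session_data ORDER BY name"
).fetchall()
print(remaining)
```

Only the row pointing at the non-existent user 99 is deleted; rows with other names are left alone.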
Friday, January 8, 2010
Debugging nosetests with ipdb

Ever wanted to debug Python with tab completion and syntax highlighting? Then you'll love ipdb:
sudo easy_install ipdb
The only thing you have to do is use ipdb instead of pdb:
import ipdb; ipdb.set_trace()
With a little trick it will even work with nosetests:
import sys; sys.stdout = sys.__stdout__; import ipdb; ipdb.set_trace()
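The trick presumably works because nose's output capture replaces sys.stdout with a buffer, while sys.__stdout__ still refers to the original stream. A minimal sketch of that idea, where call_with_real_stdout is a hypothetical helper and not part of nose or ipdb:

```python
import io
import sys


def call_with_real_stdout(func):
    """Call func with sys.stdout pointing at the original stream, then undo it."""
    captured = sys.stdout
    sys.stdout = sys.__stdout__
    try:
        return func()
    finally:
        # Put the capture buffer back so the test runner keeps working.
        sys.stdout = captured


# Simulate a test runner that captures output by swapping sys.stdout.
original = sys.stdout
fake_capture = io.StringIO()
sys.stdout = fake_capture
try:
    saw_real_stdout = call_with_real_stdout(lambda: sys.stdout is sys.__stdout__)
    restored_capture = sys.stdout is fake_capture
finally:
    sys.stdout = original
```

Alternatively, running nosetests with -s (--nocapture) disables the capture altogether, so a plain ipdb.set_trace() is enough.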
Monday, September 14, 2009
Monday, September 7, 2009
Break the Cycle: Local Class Definitions
In the process of analyzing the memory behavior of your Python application, you will sooner or later stumble across reference cycles. It is always a good idea to avoid creating reference cycles, though not every cycle is worth breaking (using weak methods avoided some reference cycles but increased the invocation cost).
Debugging reference cycles can be simplified by creating a graphical representation of the reference graph, e.g. using graphviz. Marius Gedminas provides a set of tools to facilitate building graphs at his homepage. Similar facilities exist in Pympler. The latter has improved considerably since the official 0.1 release, so be sure to grab the version from the svn trunk.
When you know which objects are involved in cyclic dependencies, you will want to know why the cycles occurred in the first place, which is not always trivial to figure out. While working on the integration of Bottle (which is really great BTW) in Pympler, I stumbled across an interesting case:
import gc
gc.disable()

def f():
    class Foo(object):
        pass

f()

from pympler.gui.garbage import GarbageGraph
GarbageGraph(reduce=True).render('cycle1.png', format='png')
This snippet creates a reference cycle, which the last line renders as a graph to cycle1.png.

Apparently, defining a class in the local scope of a function or method creates a reference cycle. Lifting the class definition to the module level avoids the reference cycle. It is even more interesting that class objects create reference cycles by design when they go out of scope:
>>> import gc
>>> gc.disable()
>>> class Foo(object):
...     pass
...
>>> del Foo
>>> gc.collect()
6
So what? Well, it is evidently beneficial to define classes in modules or other classes, and not in functions or methods.
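The effect can also be checked without Pympler by watching a weak reference to the class. This is a sketch that assumes CPython; make_local_class is a name made up for the illustration.

```python
import gc
import weakref


def make_local_class():
    """Define a class in local scope and return only a weak reference to it."""
    class Foo(object):
        pass
    return weakref.ref(Foo)


gc_was_enabled = gc.isenabled()
gc.disable()
try:
    ref = make_local_class()
    # No strong reference is left, yet the class is still there:
    # it is kept alive by the reference cycle it is part of.
    alive_before_collect = ref() is not None
    gc.collect()
    # Only the cycle collector reclaims it.
    alive_after_collect = ref() is not None
finally:
    if gc_was_enabled:
        gc.enable()

print(alive_before_collect, alive_after_collect)
```

With the collector enabled the class would be reclaimed eventually, but only whenever the next collection happens to run.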
Tuesday, September 1, 2009
Substitute assert statements with unittest methods using vim
In the Python community, it's not generally agreed upon whether to use the assert statement or the assert* methods from the unittest module. As some commentators pointed out in a recent discussion, there are (a few) good reasons to prefer the assertion methods, e.g. better error messages.
Here are some vim substitution commands that make the transition from assert statements to the appropriate methods easier:
:%s/assert \(.\+\) == \(.\+\)/self.assertEqual(\1, \2)/gc
:%s/assert \(.\+\) != \(.\+\)/self.assertNotEqual(\1, \2)/gc
:%s/assert \(.\+\)/self.assertTrue(\1)/gc
Wednesday, August 19, 2009
Convert Images to A4 PDF
Converting raster images to PDF in a printable format can be achieved using the ImageMagick convert utility with the page parameter:
convert -page a4 *.png images.pdf
The converter, however, does not quite do what I expected. Images are resized to fill the A4 page, but the aspect ratio is preserved and no margin is added. This leads to differently sized pages for images with different aspect ratios (which is common for scanned documents, for example).
In order to create equal-sized PDF pages from a bunch of images, a margin or border needs to be added to the images. Doing this manually is a cumbersome process. Therefore, I've written a little Python script which adds a (white) border to the individual images to enforce an aspect ratio compatible with A4 pages. The script creates a PDF file from a bunch of image files with uniform A4 page size:
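import sys
from subprocess import Popen, PIPE

# A4 page dimensions in millimetres; only their ratio matters here.
PAGE_WIDTH = 210.0
PAGE_HEIGHT = 297.0

# All arguments but the last are input images; the last is the output PDF.
files = sys.argv[1:-1]
output = sys.argv[-1]
tmp = ["a4%s" % f for f in files]

for f, t in zip(files, tmp):
    # The third field of the "identify" output is the size, e.g. 800x600.
    # universal_newlines=True makes the output text instead of bytes.
    p = Popen(["identify", f], stdout=PIPE, universal_newlines=True)
    dim = p.communicate()[0].split()[2]
    w, h = [float(d) for d in dim.split('x')]
    bw, bh = 0, 0
    if w / h < PAGE_WIDTH / PAGE_HEIGHT:
        # Narrower than A4: pad left and right.
        nw = PAGE_WIDTH * h / PAGE_HEIGHT
        bw = int((nw - w) / 2)
    else:
        # Wider than A4 (or equal): pad top and bottom.
        nh = PAGE_HEIGHT * w / PAGE_WIDTH
        bh = int((nh - h) / 2)
    # Set the border colour before applying the border operator.
    Popen(["convert", f, "-bordercolor", "white",
           "-border", "%dx%d" % (bw, bh), t]).communicate()

# Combine the padded images into a single PDF with A4 pages.
Popen(["convert", "-page", "a4"] + tmp + [output]).communicate()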
Save the script to img2a4pdf.py and invoke it like this:
python img2a4pdf.py *.png output.pdf
Maybe someone will find it useful.
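The border arithmetic is easier to check in isolation. Here is the same calculation pulled out into a function; a4_border is a name introduced for this sketch and does not appear in the script itself.

```python
def a4_border(width, height, page_width=210.0, page_height=297.0):
    """Return the (horizontal, vertical) border in pixels that pads an
    image of the given pixel size to the A4 aspect ratio."""
    if width / height < page_width / page_height:
        # Narrower than A4: widen the canvas, split the extra left and right.
        padded_width = page_width * height / page_height
        return int((padded_width - width) / 2), 0
    # Wider than A4 (or equal): heighten the canvas, split top and bottom.
    padded_height = page_height * width / page_width
    return 0, int((padded_height - height) / 2)


print(a4_border(210, 297))  # already A4-shaped
print(a4_border(100, 297))  # too narrow
print(a4_border(210, 100))  # too wide
```

An image that already has the A4 ratio gets no border, a 100x297 image gets 55 pixels on each side, and a 210x100 image gets 98 pixels at the top and bottom.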