Saturday, October 15

The Power to do it Wrong

I believe I am not alone in finding myself, more often than I might like, having to make less than ideal decisions in the world of software development. I find myself lamenting inconsistencies and warts in tools, runtimes, and the applications themselves. It is a universe of our own making, sometimes my own personal making, and often it seems to tend toward a proverbial hell rather than heaven.

The process of realizing something, manifesting it from the ideal abstract notion to concrete reality, is a process of compromise. The ideal world cares nothing for constraints, contradictions, complications, differing opinions, or the arbitrary decisions of the past. The ideal world does not care that the human mind and body have physical limitations, and that we cannot go long between breaths, meals, and checking Twitter. We may presume otherwise, but the ideal world is not lenient; it knows no deviation from ideal elegance. It tolerates nothing that is not perfect; in fact, it flatly refuses the existence of imperfection.

There are software tools and systems, quite popular ones at that, that seem bent on imposing an ideal model, a "one true way" if you will. I am tolerant of many things; I am not a man of overbearing opinion. I like to think that everyone is largely always wrong, myself included, and I find some pleasure in life watching that prove to be true. But there is one philosophy that I simply cannot abide as a result: the one true way. This is not something I learned; as Lady Gaga would say, I was just born this way. But my personal inability to deal with dogma is not the point here.

Perfection is assumed to be a great and wonderful thing, how could it not be? What about all those quaint stories of falling from grace, the eternal struggle to regain that which we lost through our own greed and arrogance? The great temptation, the human condition, transcending evil, and all that? Wasn't software supposed to be "easy"? Some sort of mathematical paradise devoid of human blight? Yeah well, only the kind of software to which we deridingly refer, you know, the kind with "no users."

Ok, so I'm saying perfection is bad? No, I say perfection is tyrannical; it is the ultimate one true way. However, I needn't let that throw me into an existential funk. Because if there is one truth that I can hold dear, it is that perfection is impossible. And even if it were possible, I could not exist there, thus any resulting self-loathing is self-prevented.

But, crucially, the notion of perfection, of the ideal, of the simple, of the absence of compromise is not bad. In fact, it is an essential human quality. It's a strange notion that chasing something impossible, something dubious, is noble, but that's what we do. Every developer I know who's worth their salt wants to find the elusive ideal, or at minimum the least worst solution for the problem at hand. The mind of the master knows that perfection of the whole is impossible, yet will still strive to simplify, to idealize, to remove compromise where it can. Mastery itself is largely the ability to direct that effort most effectively, and to know when it is futile.

So, we must embrace the fact that the universe, the software, the hardware, and the wetware are imperfect, and deal with it. And the tools and technologies that best deal with that reality are the ones we should embrace. Often it is not even a matter of embracing, but more of natural selection. Tools that over-idealize, that preach the one true way, will eventually lose. Yes, worse is better.

So the next time you are working on a web page, and you find that no amount of dogmatic adherence to CSS best practices will fix your layout, be glad that the tools are bigger, dare I say better, than the ideal. Sure, you may say a few choice words when you have to use a <br/> or *gasp* a <table> tag. But think about what you would say if there were no workaround. If the tools assumed perfection, we'd all be screwed.

Great technologies are imperfect, they are flawed, they break their own rules. Great people are no different, though often their faces are more symmetrical than yours, but I digress. The point is that there will never be perfect software tools, and you should be wary of those preaching the one true way. Besides, perfection is boring, and imposed perfection is stifling. We need tools with abstractions that leak, with models that don't always map, and with escape valves that let us do it wrong. Because sometimes doing it wrong is the best way, the only way it can be done.

Tuesday, August 17

Running Tests for Multiple Pythons

For my Planar project, I'm committed to compatibility with Python 2.6–3.1 and beyond. Since this project contains C extensions, that means I need to be able to build and test for each Python conveniently. And yes, different Pythons do behave differently, have different compilation gripes, and sometimes even expose different bugs due to internal differences in GC, etc.

This shell script I wrote is a simple but effective solution. It requires that all of the versions of Python you are testing are installed (of course), and also have nose installed. It is also very easy to extend to more Python versions by modifying the for loop at the top. You run it from the top-level source directory, alongside your setup.py. I imagine other folks have come up with more elegant solutions than this, but I figured it might be useful to someone, so here it is:
#!/bin/bash
# Build from scratch and run unit tests
# under Python 2.6, 2.7, and 3.1.
# Then run doctests to verify doc examples.
# Note: requires nose installed in each Python instance.

error=0

rm -rf build

for ver in 2.6 2.7 3.1; do
    echo "************"
    echo " Python $ver"
    echo "************"
    echo
    if which python${ver} > /dev/null; then
        python${ver} setup.py build && \
        python${ver} -m nose.core \
            -d -w build/lib.*${ver}/ --with-coverage "$@" || error=1
    else
        echo >&2 "!!! Python ${ver} not found !!!"
        error=1
    fi
done

echo
echo -n "Doctests... "
srcdir=$(pwd)
cd build/lib.*3.?/ && \
    python3 -m doctest ${srcdir}/doc/source/*.rst && \
    echo "OK" || error=1

exit $error

The bottom part runs doctests you might have in doc/source, which is where I keep the Sphinx doc sources.
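
If you'd rather drive the whole thing from Python itself, here's a rough equivalent using only the standard library. This is just a sketch of the same idea, not part of Planar; it skips the doctest step, and shutil.which requires Python 3.3+:

#!/usr/bin/env python3
"""Sketch: build and test under several Pythons, like the shell script above."""
import glob
import shutil
import subprocess
import sys

VERSIONS = ('2.6', '2.7', '3.1')

def main():
    error = 0
    shutil.rmtree('build', ignore_errors=True)
    for ver in VERSIONS:
        print('************\n Python %s\n************\n' % ver)
        exe = shutil.which('python' + ver)
        if exe is None:
            sys.stderr.write('!!! Python %s not found !!!\n' % ver)
            error = 1
            continue
        if subprocess.call([exe, 'setup.py', 'build']) != 0:
            error = 1
            continue
        # Find the version-specific build dir, e.g. build/lib.linux-x86_64-2.6
        libdirs = glob.glob('build/lib.*%s' % ver)
        if not libdirs or subprocess.call(
                [exe, '-m', 'nose.core', '-d', '-w', libdirs[0],
                 '--with-coverage'] + sys.argv[1:]) != 0:
            error = 1
    return error

if __name__ == '__main__':
    sys.exit(main())

Extra command line arguments pass through to nose, just like $@ does in the shell version.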

Monday, April 26

Planar 0.1

Planar is a 2D geometry library I'm working on for use by my Grease game engine. For maximum usefulness, however, I'm distributing planar as a separate library.

Since it doesn't have any external dependencies of its own that dictate otherwise, I took the liberty of making it compatible with both Python 2.6+ and 3.1+. Planar implements everything in both Python and C, so it was a good opportunity to finally jump into 3.x with both feet. I was pleased with how seamlessly 2to3 took care of the necessary Python code mungification. I was pleased with the C side of the world as well. Conceptually things haven't changed too much in 3.x, and a few #ifdefs here and there and some macros took care of the API differences.
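
For anyone curious how that looks in practice, the common distribute/setuptools pattern for running 2to3 automatically at build time is something like the following. This is only a sketch; Planar's actual setup.py and source paths may differ:

from setuptools import setup, Extension

setup(
    name='planar',
    version='0.1',
    packages=['planar'],
    # path to the C implementation is hypothetical
    ext_modules=[Extension('planar.cvector', ['lib/cvector.c'])],
    # run 2to3 over the Python sources when building under Python 3
    use_2to3=True,
)

The C extension needs no 2to3 at all, of course; that's what the #ifdefs and macros are for.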

Once I got it initially working in Python 3, I found myself coding against it and fixing 2.x incompatibilities afterwards. I think that's a good sign.

The functionality of 0.1 is very basic, but it's fully documented and tested, so I'm satisfied that I'm ready to move on and implement more. Now that I've learned a bunch of the Python 3 ropes, I'm hopeful that I can make fast progress toward 0.2.

Head over to the planar docs site to learn more. Feedback is encouraged and appreciated.

Wednesday, April 14

Arg Parsing Micro-Benchmark

I'm working on a vector class for my shiny new Planar geometry Python library. When I was working on Lepton (which is mostly in C), I was keenly aware of the overhead of the generic arg parsing C-APIs (PyArg_ParseTupleAndKeywords and friends). Since this vector class largely consists of methods that do very little work, but may be called fairly often, I figured this overhead should be considered.

I implemented a class method to instantiate a vector with polar coordinates, which are converted to Cartesian. Here it is in Python:
import math

class Vec2(tuple):

    def __new__(cls, x, y):
        # multiplying by 1.0 coerces ints to floats
        return tuple.__new__(cls,
            (x * 1.0, y * 1.0))

    @classmethod
    def polar(cls, angle, length=1.0):
        """Create a vector from polar coordinates"""
        radians = math.radians(angle)
        vec = tuple.__new__(cls,
            (math.cos(radians) * length,
             math.sin(radians) * length))
        return vec

Note: using tuple.__new__ in polar is an optimization that saves a layer of method calls and skips converting the x and y values to floats, since I know they already are. Running this through timeit with Python 3.1.2, I get:
>>> timeit.timeit('Vec2.polar(20, 10)', 
... 'from planar.vector import Vec2')
2.3068268299102783
>>> timeit.timeit('Vec2.polar(angle=20, length=10)', 
... 'from planar.vector import Vec2')
2.3426671028137207

Notice there's not much difference in timing between positional and keyword arguments.

Now let's implement the polar class method in C using generic arg parsing. Here's the method's code:
static PyObject *
Vec2_polar(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
    PlanarVec2Object *v;
    double angle;
    double length = 1.0;

    static char *kwlist[] = {"angle", "length", NULL};

    assert(PyType_IsSubtype(type, &PlanarVec2Type));
    if (!PyArg_ParseTupleAndKeywords(
        args, kwargs, "f|f:Vec2.polar()", 
        kwlist, &angle, &length)) {
        return NULL;
    }

    v = (PlanarVec2Object *)type->tp_alloc(type, 0);
    if (v != NULL) {
        angle = radians(angle);
        v->x = cos(angle) * length;
        v->y = sin(angle) * length;
    }
    return (PyObject *)v;
}

Here's the timing for the above:
>>> timeit.timeit('Vec2.polar(20, 10)', 
... 'from planar.cvector import Vec2')
1.045346975326538
>>> timeit.timeit('Vec2.polar(angle=20, length=10)', 
... 'from planar.cvector import Vec2')
1.578913927078247

This is certainly faster than the Python code, but not tremendously so, though that's not too surprising given the simplicity of this method. However, what is surprising is the performance difference between positional and keyword arguments; presumably the keyword path pays for per-call dictionary lookups and string comparisons against kwlist. But, there it is.

Since this method takes two arguments, and they are both floats, maybe we can speed up the positional case, which should be common, by doing the arg parsing ourselves. We know that PyObject *args is a tuple, so let's take it apart manually and extract the angle and length. A slight complication is that the second argument, length, is optional, but we can handle it. Here's the beautiful result:
static PyObject *
Vec2_new_polar(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
    PyObject *angle_arg;
    PyObject *length_arg;
    PlanarVec2Object *v;
    Py_ssize_t arg_count;
    double angle;
    double length = 1.0;

    static char *kwlist[] = {"angle", "length", NULL};

    assert(PyType_IsSubtype(type, &PlanarVec2Type));
    if (kwargs == NULL) {
        /* No kwargs, do fast manual arg handling */
        arg_count = PyTuple_GET_SIZE(args);
        if (arg_count != 1 && arg_count != 2) {
            PyErr_SetString(PyExc_TypeError, 
                "Vec2.polar(): wrong number of arguments");
            return NULL;
        }
        angle_arg = PyTuple_GET_ITEM(args, 0);
        if (!PyNumber_Check(angle_arg)) {
            PyErr_SetString(PyExc_TypeError, 
                "Vec2.polar(): expected number for argument angle");
            return NULL;
        }
        angle_arg = PyNumber_Float(angle_arg);
        if (angle_arg == NULL) {
            return NULL;
        }
        angle = PyFloat_AS_DOUBLE(angle_arg);
        Py_CLEAR(angle_arg);
        if (arg_count == 2) {
            length_arg = PyTuple_GET_ITEM(args, 1);
            if (!PyNumber_Check(length_arg)) {
                PyErr_SetString(PyExc_TypeError, 
                    "Vec2.polar(): expected number for argument length");
                return NULL;
            }
            length_arg = PyNumber_Float(length_arg);
            if (length_arg == NULL) {
                /* angle_arg was already released via Py_CLEAR above */
                return NULL;
            }
            length = PyFloat_AS_DOUBLE(length_arg);
            Py_CLEAR(length_arg);
        }
    } else if (!PyArg_ParseTupleAndKeywords(
        args, kwargs, "f|f:Vec2.polar()", 
        kwlist, &angle, &length)) {
        return NULL;
    }

    v = (PlanarVec2Object *)type->tp_alloc(type, 0);
    if (v != NULL) {
        angle = radians(angle);
        v->x = cos(angle) * length;
        v->y = sin(angle) * length;
    }
    return (PyObject *)v;
}

So, if kwargs is NULL, we extract the angle, and the length if present, from the args tuple. We check that each is a number, convert it to a float object (so you can pass in ints without complaint), then extract the C double from each to store in the vector object's struct. Sure looks like a lot of code; let's see if it buys us anything:

>>> timeit.timeit('Vec2.polar(20, 10)', 
... 'from planar.cvector import Vec2')
0.46842408180236816

Bam! Not bad at all! Of course, if you do pass keyword args, we fall back to generic parsing, so that should perform the same as the last one:

>>> timeit.timeit('Vec2.polar(angle=20, length=10)', 
... 'from planar.cvector import Vec2')
1.5973320007324219

If I were feeling ambitious, I could extract the args from the PyObject *kwargs dict manually as well, and see some speedup. But at some point the added code and maintenance cost isn't worth it. I know the opportunity is there if needed, but also that passing args positionally is already a lot faster (a fact that is likely worth documenting).

I'm writing this planar library such that every class is implemented in both Python and C. The library is also compatible with both Python 2.6+ and Python 3.x. If you're interested in an example of how to write a package that's compatible with Python 2 and 3, C extensions included, check it out.

Monday, March 22

Frameworks, Libraries, Namespaces and Distributions

I've been busy not working on the code for my game engine framework Grease recently. I've actually been working on the documentation, but since I can't help myself, I've been thinking a good bit about the code, and specifically what I want to work on next.

The functionality that I plan to work on next is rather general: 2D geometry and vector graphics. The important thing is that although Grease will depend on these things, they are not part of the framework per se. They should be useful for folks who don't buy into Grease's framework dogma, or don't need its features. Simply put, they will be libraries that could be perfectly happy outside of Grease or any framework.

So, humble denizens of the Python planet, what is the best approach to naming and packaging these libraries for distribution? I see two possible options, both with pros and cons:
  1. Put them under the grease "brand", packaging them as something like grease.geometry and grease.vg. They would be distributed separately, and probably also bundled together with grease the framework.
  2. Give them their own independent name (I'm leaning toward flatly) and package them separately from grease, though grease would depend on them.
#1 has some advantages:
    • Less "brand" complexity. It's obvious that grease.vg is part of the grease project, and when used inside of grease there's one less arbitrary brand name to remember.
    • It's easy to envision these libraries fitting snugly with the rest of the framework in terms of documentation and whatnot.
    • It's pretty easy to justify folding the release cycle of the libraries into that of the framework.
And some disadvantages:
    • The independence of these subpackages is unclear; folks may be reluctant to use them if they feel that they carry too much framework baggage or cognitive complexity.
    • I'm not sure of the best way to actually make these separate from a distutils perspective, or whatever distribution flavor of the minute is in fashion. I suspect they should just live in their own directory trees and get stitched into the grease top-level package at installation time (see the sketch after this list). I'm wary of drawbacks and unintended side-effects of that, however.
    • They're nested rather than flat (sorry Tim).
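
For what it's worth, the usual mechanism for that kind of install-time stitching is setuptools "namespace packages". Here's a sketch of how option #1 might be wired up; the distribution name is hypothetical:

# grease/__init__.py in each separately-distributed piece contains
# nothing but the namespace declaration:
__import__('pkg_resources').declare_namespace(__name__)

# setup.py for a hypothetical stand-alone grease.vg distribution:
from setuptools import setup

setup(
    name='grease-vg',
    version='0.1',
    packages=['grease', 'grease.vg'],
    namespace_packages=['grease'],
)

The catch is that every distribution sharing the namespace has to cooperate: each ships an identical stub __init__.py with no real code in it, which is part of the baggage alluded to above.
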
#2 also has some advantages:
    • The independence of the library package(s) is obvious.
    • It's rather obvious how to organize them from a distribution perspective.
    • They aren't weighed down by any framework connotations.
    • The namespace is flatter.
And of course disadvantages:
    • They don't promote the framework "brand" in any way and may be perceived as less well integrated with the framework.
    • They would feel like more independent creatures to me, thus I think I would tend to fuss about them individually more. That's a subjective thing, of course, and not necessarily bad for users.
    • It's not as clear to me how to deal with them as grease dependencies. Bundle them or fetch them at installation time?

Honestly, I'm able to justify both ways to myself, and of course it's not a matter of life-and-death. But it does feel like an important api decision and I value any opinion or ideas you may have on the matter. So please comment if you feel inclined.

Also, I'm interested in people's opinions on bundling vs. fetching dependencies at installation. In the bad old days, bundling was basically a no-brainer with Python, but given things like setuptools/distribute and buildout, the latter has gained appeal. Which do you prefer?
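
For the fetch-at-installation route, the declarative piece is pretty small with setuptools/distribute. A sketch, using the hypothetical flatly package from option #2:

# setup.py for grease (sketch)
from setuptools import setup

setup(
    name='grease',
    version='0.2',
    packages=['grease'],
    # fetched from pypi at installation time if not already present
    install_requires=['flatly>=0.1'],
)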

Sunday, February 28

Grease 0.1 Released

I released the inaugural 0.1 release of my game engine Grease today. This is basically a throw-it-over-the-wall release so that I can use the code and push it further during the upcoming pyweek compo at the end of March. That said, it is complete enough to implement an entire game, an example of which is included in the distribution. Obligatory screenshots below:

[Screenshots of the blasteroids example game]

The game uses Grease features such as its awesomely-retro vector polygon renderer, collision detection, decorator-based key binding, and mode management for hot-seat multiplayer. The entire game is a single script that weighs in at about 550 lines, including whitespace and docstrings. Even so, no real effort was made to make the code short; it was written with clarity in mind, not brevity. Most of the code savings comes from the abstractions available in Grease.

At the moment, Grease is implemented entirely in Python and sits on top of Pyglet. My upcoming efforts on this project will be to make the blasteroids example into a full tutorial and get everything fully documented. After that I will be creating more complex example games and adding native-code parts to Grease where needed. The intention is to always have pure-Python versions of everything available, though, and the Python versions will be developed first.

Of course you are welcome and encouraged to try it for yourself. You can download it from pypi here:

http://pypi.python.org/pypi/grease

The code of the example blasteroids game above can be scrutinized here:

blasteroids.py

Enjoy.

Saturday, January 16

Decorate Thy Keyboard Controls

As part of a game engine called Grease that I'm working on (more on that sometime soon), I was thinking about ways to create a clean and efficient api for handling keyboard events. Too often this ubiquitous section of game code consists of a tangle of if/elif statements, a construct that is a personal pet peeve of mine. So to avoid that mess, a typical strategy I've employed is to map keys to methods using a dictionary for easy dispatch. Doing this for one-off applications is simple enough, if not particularly clean and tidy.

Anyway, taking a step back for a sec, let's examine some goals. Basically what I'm after is a way to define some methods (or even functions) that get executed in response to key events. Specifically, there are three types of key events that I'm interested in: key press, key release, and key hold. The first two get dispatched once per key "stroke" as you'd expect. The key hold event fires every game "tick" that a key remains down; useful for continuous functions like thrust, etc.

So I need a clean way to map methods to specific keys and key event types. Sounds like a job for: decorators! Truth be told, I haven't felt the need to write many decorators, but after prototyping a non-decorator version that came out less than clean even though it only supported one type of key event, I thought I'd give decorators a go. Below is an example of how to implement key controls using what I've cooked up:

from pyglet.window import key  # pyglet's key symbol constants
# KeyControls comes from the module linked at the end of this post

class PlayerControls(KeyControls):

    @KeyControls.key_press(key.LEFT)
    def start_turn_left(self):
        ship.rotation = -ship.turn

    @KeyControls.key_release(key.LEFT)
    def stop_turn_left(self):
        if ship.rotation < 0:
            ship.rotation = 0

    @KeyControls.key_press(key.RIGHT)
    def start_turn_right(self):
        ship.rotation = ship.turn

    @KeyControls.key_release(key.RIGHT)
    def stop_turn_right(self):
        if ship.rotation > 0:
            ship.rotation = 0

    @KeyControls.key_hold(key.UP)
    def thrust(self, dt):
        ship.body.apply_local_force(
            ship.player.thrust * dt)

    @KeyControls.key_press(key.P)
    def pause(self):
        global paused
        paused = not paused
    
Here's the code needed to wire this into pyglet:

import pyglet

window = pyglet.window.Window()
controls = PlayerControls(window)
pyglet.clock.schedule_interval(controls.run, 1.0/60.0)
pyglet.app.run()
    
Though using the decorators binds the keys to specific methods of the class at class-definition time, KeyControls also contains additional methods for changing the key bindings at run-time. I may also add support to load and store key bindings from a configuration file if that feature is needed.
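
For the curious, here's a minimal sketch of how decorators like these can record bindings at class-definition time and dispatch them later. This is my own illustration of the general pattern, not the actual Grease module, whose internals differ:

class KeyControls:
    """Sketch: each decorator stashes an (event, key) tag on the method;
    the constructor scans the class and builds dispatch dictionaries."""

    @staticmethod
    def key_press(symbol):
        def decorator(method):
            method._binding = ('press', symbol)
            return method
        return decorator

    @staticmethod
    def key_release(symbol):
        def decorator(method):
            method._binding = ('release', symbol)
            return method
        return decorator

    @staticmethod
    def key_hold(symbol):
        def decorator(method):
            method._binding = ('hold', symbol)
            return method
        return decorator

    def __init__(self, window):
        self._press = {}
        self._release = {}
        self._hold = {}
        tables = {'press': self._press, 'release': self._release,
                  'hold': self._hold}
        for name in dir(self):
            method = getattr(self, name)
            binding = getattr(method, '_binding', None)
            if binding is not None:
                event, symbol = binding
                tables[event][symbol] = method
        self._held = set()
        window.push_handlers(self)  # receive pyglet key events

    def on_key_press(self, symbol, modifiers):
        if symbol in self._hold:
            self._held.add(symbol)
        if symbol in self._press:
            self._press[symbol]()

    def on_key_release(self, symbol, modifiers):
        self._held.discard(symbol)
        if symbol in self._release:
            self._release[symbol]()

    def run(self, dt):
        # Scheduled every tick; fire hold handlers for keys still down
        for symbol in self._held:
            self._hold[symbol](dt)

Since the bindings live in plain dictionaries, run-time rebinding amounts to mutating those dictionaries, which is roughly what the extra rebinding methods mentioned above would do.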

This implementation is designed to work with pyglet, though I think the same general approach could be used with pygame. The module for this, although part of a larger project I am working on, doesn't depend on anything else. Feel free to give it a try and see what you think.

[Edit: a new version of this module is now available]

Get the module here.