Monday, April 26

Planar 0.1



Planar is a 2D geometry library I'm working on for use by my Grease game engine. For maximum usefulness, however, I'm distributing planar as a separate library.

Since it doesn't have any external dependancies of its own that dictate otherwise, I took the liberty of making it compatible with both Python 2.6+ and 3.1+. Planar implements everything in both Python and C, so it was a good opportunity to finally jump into 3.x with both feet. I was pleased with how easily 2to3 worked to seamlessly take care of the necessary Python code mungification. I was also pleased with the C side of the world as well. Conceptually things haven't changed too much in 3.x, and a few #ifdefs here and there and some macros took care of the API differences there.

Once I got it initially working in Python 3, I found myself coding against it and fixing 2.x incompatibilities afterwards. I think that's a good sign.

The functionality of 0.1 is very basic, but it's fully documented and tested, so I'm satisfied that I'm ready to move on and implement more. Now that I've learned a bunch of the Python 3 ropes, I'm hopeful that I can make fast progress toward 0.2.

Head over the the planar docs site to learn more. Feedback is encouraged and appreciated.

Wednesday, April 14

Arg Parsing Micro-Benchmark

I'm working on a vector class for my shiny new Planar geometry Python library. When I was working on Lepton (which is mostly in C), I was keenly aware of the overhead of the generic arg parsing C-APIs (PyArg_ParseTupleAndKeywords and friends). Since this vector class largely consists of methods that do very little work, but may be called fairly often, I figured this overhead should be considered.

I implemented a class method to instantiate a vector with polar coordinates, which are converted to cartesian. Here it is in Python:
class Vec2(tuple):

    def __new__(self, x, y):
        return tuple.__new__(Vec2, 
            ((x * 1.0, y * 1.0)))

    @classmethod
    def polar(cls, angle, length=1.0):
        """Create a vector from polar coordinates"""
        radians = math.radians(angle)
        vec = tuple.__new__(cls, 
            (math.cos(radians) * length, 
             math.sin(radians) * length))
        return vec

Note, using tuple.__new__ in polar is an optimization that saves a layer of method calls and skips converting the x and y values to floats, since I know they are already. Running this through timeit with Python 3.1.2, I get:
>>> timeit.timeit('Vec2.polar(20, 10)', 
... 'from planar.vector import Vec2')
2.3068268299102783
>>> timeit.timeit('Vec2.polar(angle=20, length=10)', 
... 'from planar.vector import Vec2')
2.3426671028137207

Notice there's not much difference in timing between positional and keyword arguments.

Now let's implement the polar class method in C using generic arg parsing. Here's the method's code:
static PyObject *
Vec2_polar(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
    PlanarVec2Object *v;
    double angle;
    double length = 1.0;

    static char *kwlist[] = {"angle", "length", NULL};

    assert(PyType_IsSubtype(type, &PlanarVec2Type));
    if (!PyArg_ParseTupleAndKeywords(
        args, kwargs, "f|f:Vec2.polar()", 
        kwlist, &angle, &length)) {
        return NULL;
    }

    v = (PlanarVec2Object *)type->tp_alloc(type, 0);
    if (v != NULL) {
        angle = radians(angle);
        v->x = cos(angle) * length;
        v->y = sin(angle) * length;
    }
    return (PyObject *)v;
}

Here's the timing for the above:
>>> timeit.timeit('Vec2.polar(20, 10)', 
... 'from planar.cvector import Vec2')
1.045346975326538
>>> timeit.timeit('Vec2.polar(angle=20, length=10)', 
... 'from planar.cvector import Vec2')
1.578913927078247

This is certainly faster than the Python code, but not tremendously so, though that's not too surprising given the simplicity of this method. However, what is surprising is the performance difference between positional and keyword arguments. But, there it is.

Since this method takes two arguments, and they are both floats, maybe we can speed up the positional case, which should be common, by doing the arg parsing ourselves. We know that PyObject *args is a tuple, so let's take it apart manually and extract the angle and length. A slight complication is that the second length argument is optional, but we can handle it. Here's the beautiful result:
static PyObject *
Vec2_new_polar(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
    PyObject *angle_arg;
    PyObject *length_arg;
    PlanarVec2Object *v;
    int arg_count;
    double angle;
    double length = 1.0;

    static char *kwlist[] = {"angle", "length", NULL};

    assert(PyType_IsSubtype(type, &PlanarVec2Type));
    if (kwargs == NULL) {
        /* No kwargs, do fast manual arg handling */
        arg_count = PyTuple_GET_SIZE(args);
        if (arg_count != 1 && arg_count != 2) {
            PyErr_SetString(PyExc_TypeError, 
                "Vec2.polar(): wrong number of arguments");
            return NULL;
        }
        angle_arg = PyTuple_GET_ITEM(args, 0);
        if (!PyNumber_Check(angle_arg)) {
            PyErr_SetString(PyExc_TypeError, 
                "Vec2.polar(): expected number for argument angle");
            return NULL;
        }
        angle_arg = PyNumber_Float(angle_arg);
        if (angle_arg == NULL) {
            return NULL;
        }
        angle = PyFloat_AS_DOUBLE(angle_arg);
        Py_CLEAR(angle_arg);
        if (arg_count == 2) {
            length_arg = PyTuple_GET_ITEM(args, 1);
            if (!PyNumber_Check(length_arg)) {
                PyErr_SetString(PyExc_TypeError, 
                    "Vec2.polar(): expected number for argument length");
                return NULL;
            }
            length_arg = PyNumber_Float(length_arg);
            if (length_arg == NULL) {
                Py_DECREF(angle_arg);
                return NULL;
            }
            length = PyFloat_AS_DOUBLE(length_arg);
            Py_CLEAR(length_arg);
        }
    } else if (!PyArg_ParseTupleAndKeywords(
        args, kwargs, "f|f:Vec2.polar()", 
        kwlist, &angle, &length)) {
        return NULL;
    }

    v = (PlanarVec2Object *)type->tp_alloc(type, 0);
    if (v != NULL) {
        angle = radians(angle);
        v->x = cos(angle) * length;
        v->y = sin(angle) * length;
    }
    return (PyObject *)v;
}

So, if kwargs is NULL, then we extract the angle, and the length if present from the args tuple. We check if each are numbers, convert them to float objects (so you can pass in ints without complaint), then we extract the C double from each to store in the vector object's struct. Sure looks like a lot of code, let's see if it buys us anything:

>>> timeit.timeit('Vec2.polar(20, 10)', 
... 'from planar.cvector import Vec2')
0.46842408180236816 
Bam! Not bad at all! Of course if you do pass keyword args, we fall back to generic parsing, so that should perform the same as the last one:

>>> timeit.timeit('Vec2.polar(angle=20, length=10)', 
... 'from planar.cvector import Vec2')
1.5973320007324219
If I were feeling ambitious, I could extract the args from the PyObject *kwargs dict manually as well, and see some speedup. But at some point the added code and maintenance cost isn't worth it. I know the opportunity is there if needed, but also that passing args positionally is already a lot faster (a fact that is likely worth documenting).

I'm writing this planar library such that every class is coded in Python and C. Also the library is compatible with both Python 2.6+ and Python 3.x. If you're interested in an example of how to write a package that's compatible with Python 2 and 3 with C-extensions, check it out.