I\'m trying to speed up a piece of code that generates all possible splits of a string.
splits(\'foo\') -> [(\'f\', \'oo\'), (\'fo\', \'o\'), (\'foo\', \'
This isn't the sort of problem that Cython tends to help with much. It's using slicing, which ends up largely the same speed as pure Python (i.e. actually pretty good).
Using a 100 character long byte string (b'0'*100
) and 10000 iterations in timeit
I get:
cdef int i
and compiled in Cython - 0.20s (this is reproducably a small improvement. It's more significant with longer strings)cdef int i
and the parameter typed to bytes text
- 0.28s (i.e. worse).Best speed is got by using the Python C API directly (see code below) - 0.11s. I've chosen to do this mostly in Cython (but calling the API functions myself) for convenience, but you could write very similar code in C directly with a little more manual error checking. I've written this for the Python 3 API assuming you're using bytes objects (i.e. PyBytes
instead of PyString
) so if you're using Python 2, or Unicode and Python 3 you'll have to change it a little.
from cpython cimport *
cdef extern from "Python.h":
# This isn't included in the cpython definitions
# using PyObject* rather than object lets us control refcounting
PyObject* Py_BuildValue(const char*,...) except NULL
def split(text):
cdef Py_ssize_t l,i
cdef char* s
# Cython automatically checks the return value and raises an error if
# these fail. This provides a type-check on text
PyBytes_AsStringAndSize(text,&s,&l)
output = PyList_New(l)
for i in range(l):
# PyList_SET_ITEM steals a reference
# the casting is necessary to ensure that Cython doesn't
# decref the result of Py_BuildValue
PyList_SET_ITEM(output,i,
<object>Py_BuildValue('y#y#',s,i+1,s+i+1,l-(i+1)))
return output
If you don't want to go all the way with using the C API then a version that preallocates the list output = [None]*len(text)
and does a for-loop rather than a list comprehension is marginally more efficient than your original version - 0.18s
In summary, just compiling it in Cython gives you a decent speed up (a bit less than 2x) and setting the type of i
helps a little. This is all you can really achieve with Cython conventionally. To get full speed you basically need to resort to using the Python C API directly. That gets you a little under a 4x speed up which I think is pretty decent.