Jianshu.com does not support MathJax, to view the math equations, go to https://www.zybuluo.com/kailaix/note/486830.
In this short article, I'd like to introduce the minimal effort to write C extension for Python.
Why do we have to write C extension for Python? In most times, Python does well on the calculations we want to make. However, sometimes we'd like to speed up our program by improving the most time-consuming and CPU-consuming parts. Switching to C programming might be a remedy but coding in pure C will be miserable. In such a situation, writing C extension is definitely a good idea.
There are might ways to write C extension. For example, swig
, Cython
and so on are great tools for such a purpose. But we will focus on the primitive "Python.h way" in this article to achieve the goal.
The Problem
We will introduce the method by solving the following problem:
Consider the following differential equation:
$$\frac{dy}{dt} = y$$
with initial condition $y(0)=1$. Calculate $y(1)$ and $y(2)$.
The answer is obvious as we can solve the problem analyticall: $y(t)=e^t$. But to illustrate, we will use backward Euler's method to solve it numerically. The method reads:
$$y_0=1, y_{n+1}=\frac{1}{1-\Delta t}y_n$$
that is, the step size is $\Delta t$. Let $\Delta t=\frac{1}{N}$, then $y_N \approx y(1)$; also when $\Delta t=\frac{2}{N}$, $y_N \approx y(2)$.
The Struction of C Source Code
Like MATLAB mex function, Python.h
provides us with power APIs to write C extension. In a typical C source code to be integrated into Python program, there are four parts:
- C functions you want to integrate into Python program.
- A converter to coordinate the data structures of C and Python.
- A table mapping the names of your functions as Python developers see them to C functions inside the extension module.
- An initialization function.
C functions
The first part consists of primitive C functions like
double _value_at_one(int n){
double dt = 1.0/n;
double f = 1;
for(int i=0;i<n;i++)
{
f = 1.0/(1-dt)*f;
}
return f;
}
double _value_at_two(int n){
double dt = 2.0/n;
double f = 1;
for(int i=0;i<n;i++)
{
f = 1.0/(1-dt)*f;
}
return f;
}
This is pure C and we do not have to worry about interations between C and Python.
converter
static PyObject *
value_at_one(PyObject *self, PyObject *args)
{
int n;
if (!PyArg_ParseTuple(args,"i",&n)){
return NULL;
}
return Py_BuildValue("d",_value_at_one(n));
}
static PyObject *
value_at_two(PyObject *self, PyObject *args)
{
int n;
if (!PyArg_ParseTuple(args,"i",&n)){
return NULL;
}
return Py_BuildValue("d",_value_at_two(n));
}
In this part, we wrap the C functions in a "converter". There are three kinds of converters, which differs only in the arguments they take: besides self arguments, functions with no arguments,functions with tuple arguments and functions with key word arguments.
For the second case, the standard API to parse the arguments is PyArg_ParseTuple
function.
Py_BuildValue
constructs a Python data structure from C data structure.
mapping table
Mapping table is a array of structures that assemble all the functions in a single container.
static PyMethodDef calculate_methods[] = {
/* The cast of the function is necessary since PyCFunction values
* only take two PyObject* parameters, and keywdarg_parrot() takes
* three.
*/
{"value_at_one", (PyCFunction)value_at_one, METH_VARARGS,
"This is a demo"},
{"value_at_two", (PyCFunction)value_at_two, METH_VARARGS,
"value_at_two demo"},
{NULL, NULL, 0, NULL} /* sentinel */
};
Every struction consists of the function name you want to expose to the Python developers, a pointer to the function constructed in the last step, a flag to identify the type of the function and a help string. The last structure must be something like the example above, marking the end of the table.
initialization function
It is a routine to initialize the function at the end of the source code.
PyMODINIT_FUNC
initdiff_calulate(void)
{
/* Create the module and add the functions */
(void)Py_InitModule("diff_calulate", calculate_methods);
}
diff_calculate
will then be the package name. The initialization function name is init<PACKAGE_NAME>
.
one more thing: header files
Python.h
should be included.
If you run gcc example.c -o examle
in the command line, you will get a fatal error saying that Python.h
is not found. This is because the .h
file is not in the default search path of clang. You should look for the right version of Python.h
. For me, it is /usr/include/python2.7
. But by using the setup.py
method, we do not have to worry about it.
Glue C Source Code to Python
The Python package distutils
provides us with a great way to setup the Python module with nearly no effort.
Continuing the example above, we can create a file setup.py
in the same directory:
from distutils.core import setup, Extension
module = Extension('diff_calculate',sources=['example.c'])
setup(
ext_modules = [module]
)
Note that the first arguments of Extension
class must be the same as the package name. You can also add name=...,version=...
arguments to setup(...)
as you like. Then in the command line, run python setup.py build
. You will obtain a build
directory, in which you can find diff_calculate.so
-- this is the package you can directly put into use.
Use the package
To use the package, import
will do:
>>> import diff_calculate
>>> diff_calculate.value_at_one(100)
2.7319990264290435
>>> diff_calculate.value_at_two(100)
7.540366073866224
Summary
Writing C extension for Python is easy. In practice, to rewrite the pure C code to a Python extension, we can just modify an example code because most of the steps are very similar. A great tutorial on Python C extension is hosted on tutorialspoint.
Enjoy Python and use Python to do scientific computing!