Several ways of mixing Python with C

Keywords: Python C Big Data Linux

Python has been in the limelight in recent years and has established itself in many fields. The Web, big data, artificial intelligence, and operations and maintenance all make heavy use of it, it does well even for graphical interfaces, and the term "full-stack" almost seems to have been coined to describe it.

Python's GIL prevents multithreaded code from making full use of multiple cores, but the later multiprocessing module works around this by using several processes, and CPU affinity can even bind a process to specific cores, so in practice this problem is largely solved. Even though Python can cover almost the whole stack, efficiency sometimes makes it worth mixing in C. Mixed-language programming is an unavoidable topic in computing and touches many things: technology, architecture, the state of the team, management, customers, and other factors can all influence it. I would like to discuss mixed-language programming in general in a separate article. This one only covers how Python and C can be mixed, which can be done in roughly the following ways (the examples assume Linux; other platforms are analogous):

  

Shared Libraries

The C code is compiled into a shared library, and Python opens that library through cdll in the ctypes library.

For example, the C language code is

/* func.c */
int
func(int a) { return a*a; }

The python code is

#!/usr/bin/env python
#test_so.py
from ctypes import cdll
import os

# Load libfunc.so from the current working directory
p = os.getcwd() + '/libfunc.so'
f = cdll.LoadLibrary(p)
print f.func(99)

The tests are as follows

$ gcc -fPIC -shared func.c -o libfunc.so
$ ./test_so.py
9801
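The call above works without any declarations because ctypes treats arguments and return values as C int by default, which happens to match func. For other signatures the types have to be declared through argtypes and restype. The following is a minimal sketch of that step, assuming a Linux or other POSIX system; since libfunc.so is not available in a standalone snippet, the C library functions abs and strlen are used as stand-ins.

```python
import ctypes

# CDLL(None) exposes the symbols already loaded into this process,
# including the C library (assumption: Linux or another POSIX system).
libc = ctypes.CDLL(None)

# int abs(int): matches the ctypes defaults, declared here for clarity.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

# size_t strlen(const char *): the int defaults would be wrong here.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.abs(-99))
print(libc.strlen(b"hello"))
```

With the declarations in place, ctypes rejects arguments of the wrong type instead of silently passing them to C.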

  

Subprocess

The C code is built into a complete executable, and Python runs that executable through subprocess, which is essentially fork+execve.

For example, the C language code is

/* test.c */
#include <stdio.h>
int func(int a)
{
        return a*a;
}

int main(int argc, char **argv)
{
        int x;

        /* Refuse to run without a valid integer argument */
        if (argc < 2 || sscanf(argv[1], "%d", &x) != 1)
                return 1;
        printf("%d\n", func(x));
        return 0;
}

The Python code is

#!/usr/bin/env python
# test_subprocess.py
import os
import subprocess

subprocess.call([os.getcwd()+'/a.out', '99'])

The tests are as follows

$ gcc test.c -o a.out
$ ./test_subprocess.py
9801
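subprocess.call only returns the exit status; the child's output goes straight to the terminal. To use the result inside Python, the output has to be captured. The following is a minimal sketch using subprocess.check_output. To keep it standalone, the Python interpreter itself is used as a hypothetical stand-in for ./a.out; in the real setup the command list would be [os.getcwd()+'/a.out', '99'].

```python
import subprocess
import sys

# Stand-in for ./a.out: a child process that squares its first argument.
child = [sys.executable, "-c",
         "import sys; x = int(sys.argv[1]); print(x * x)", "99"]

# check_output returns the child's stdout as bytes and raises
# CalledProcessError if the exit status is non-zero.
out = subprocess.check_output(child)
result = int(out.decode().strip())

# The value is now an ordinary Python integer.
print(result + 1)
```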

  

Running a Python program from C

Having C run a Python program, either through popen/system or directly through fork+exec at the system-call level, is also a way of mixing the two.

For example, the Python code is as follows

#!/usr/bin/env python
# test.py
import sys
x = int(sys.argv[1])
print x*x

The C language code is as follows

/* test.c */
#include <stdio.h>
#include <stdlib.h>
int main()
{
        FILE *f;
        char s[1024];
        int ret;

        f = popen("./test.py 99", "r");
        if (f == NULL)
                return 1;
        while((ret=fread(s,1,1024,f))>0) {
                fwrite(s,1,ret,stdout);
        }
        /* A stream opened with popen must be closed with pclose, not fclose */
        pclose(f);
        return 0;
}

The tests are as follows

$ gcc test.c
$ ./a.out
9801
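popen hides the fork+exec step mentioned at the start of this section. The sequence it performs can be sketched as pipe, fork, dup2, exec, read, and wait. The sketch below is written in Python rather than C only to stay in one language with the other added examples; the os functions map one-to-one onto the C system calls. The function name popen_read is hypothetical, and the child is the Python interpreter itself, standing in for ./test.py. It assumes a POSIX system.

```python
import os
import sys

def popen_read(argv):
    """Run argv in a child process and return everything it writes to stdout."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: route stdout into the pipe, then replace the process image.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            os.execv(argv[0], argv)
        finally:
            os._exit(127)  # only reached if a call above failed
    # Parent: close the write end so that read() sees EOF when the child exits.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child, as pclose does
    return b"".join(chunks)

output = popen_read([sys.executable, "-c", "print(99 * 99)"])
print(output.decode().strip())
```

The final waitpid is the step that pclose performs and fclose does not, which is why a popen stream should be closed with pclose.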

  

Python's support for C extensions

Many programming languages support C extensions, for two reasons: (1) when a language is first designed, extensions let it make full use of existing C libraries; (2) C is efficient.

Python is no exception. Since its birth, many of its libraries have been written in C. Writing a C extension for Python involves mapping Python's data structures to C ones. In practice, an extension is a shared library written in C whose interface follows a standard convention that Python recognizes.

To illustrate how to write an extension, let me first assume a Python function func with the following code

def func(*a):
    res=1
    for i in range(len(a)):
        res *= sum(a[i])
    return res

That is, the function takes any number of lists of numbers (other data structures are not considered) and returns the product of the sums of the elements of each list.
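To make the definition concrete, here is a small worked sketch that prints the per-list sums and then the product for the five lists used in the test script later in this article. The variable names lists and sums are only for illustration.

```python
def func(*a):
    res = 1
    for values in a:
        res *= sum(values)
    return res

lists = ([1, 2, 3], [4, 5, 6], [7, 8], [9], [10, 11, 12, 13, 14])

# Sum of each list: 6, 15, 15, 9 and 60.
sums = [sum(values) for values in lists]
print(sums)

# Product of the sums: 6 * 15 * 15 * 9 * 60 = 729000.
print(func(*lists))
```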

Let's write the Python test code first, as follows

#!/usr/bin/env python
# test.py
import colin

def func(*a):
    res=1
    for i in range(len(a)):
        res *= sum(a[i])
    return res

a = [1,2,3]
b = [4,5,6]
c = [7,8]
d = [9]
e = [10,11,12,13,14]

f = colin.func2(99)
g = colin.func3(a,b,c,d,e)
h = func(a,b,c,d,e)
print "f = ",f
print "g = ",g
print "h = ",h

func2 is the square function that has already been tested above, so its implementation is relatively simple. I expect func, written in Python, to give the same result as func3, implemented in the C extension.

First, these functions are implemented in C. func3 uses a data structure y_t to represent any number of arrays of arbitrary length, and x_t to represent a single array.

/* colin.h */
#ifndef Colin_h
#define Colin_h
typedef struct {
        int *a;
        int len;
} x_t;
typedef struct {
        x_t *ax;
        int len;
} y_t;
int func2(int a);
int func3(y_t *p);
void free_y_t(y_t *p);
#endif

  

/* colin.c */
#include "colin.h"
#include <stdlib.h>

int func2(int a)
{
        return a*a;
}

int func3(y_t *p)
{
        int result;
        int sum;
        int i, j;

        result = 1;
        for(i=0;i<p->len;i++) {
                sum = 0;
                for(j=0;j<p->ax[i].len;j++)
                        sum += p->ax[i].a[j];
                result *= sum;
        }

        return result;
}

void free_y_t(y_t *p)
{
        int i;
        for(i=0;i<p->len;i++) {
                free(p->ax[i].a);
        }
        free(p->ax);
}

Three functions are defined above: func2 computes a square, func3 implements the function described earlier, and, because the contents of a y_t are dynamically allocated, free_y_t releases that memory.
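The layout of x_t and y_t can be explored from Python without compiling anything. The following is a minimal sketch using ctypes.Structure, not the approach taken in this article. The names X, Y, build_y and func3_py are hypothetical, and func3_py is a pure-Python walk over the structures that mirrors the two loops of func3.

```python
import ctypes

class X(ctypes.Structure):  # mirrors x_t: one array plus its length
    _fields_ = [("a", ctypes.POINTER(ctypes.c_int)), ("len", ctypes.c_int)]

class Y(ctypes.Structure):  # mirrors y_t: an array of x_t plus its length
    _fields_ = [("ax", ctypes.POINTER(X)), ("len", ctypes.c_int)]

def build_y(*lists):
    """Pack Python lists into a Y; also return the buffers to keep them alive."""
    keepalive = []
    xs = (X * len(lists))()
    for i, values in enumerate(lists):
        buf = (ctypes.c_int * len(values))(*values)
        keepalive.append(buf)
        xs[i].a = ctypes.cast(buf, ctypes.POINTER(ctypes.c_int))
        xs[i].len = len(values)
    keepalive.append(xs)
    return Y(ctypes.cast(xs, ctypes.POINTER(X)), len(lists)), keepalive

def func3_py(y):
    """Same traversal as func3 in colin.c, reading through the C layout."""
    result = 1
    for i in range(y.len):
        total = 0
        for j in range(y.ax[i].len):
            total += y.ax[i].a[j]
        result *= total
    return result

y, buffers = build_y([1, 2, 3], [4, 5, 6], [7, 8], [9], [10, 11, 12, 13, 14])
print(func3_py(y))
```

Here the memory belongs to ctypes, so free_y_t must not be called on it. If colin.c were built as a plain shared library, passing ctypes.byref(y) to its func3 would presumably work the same way, but that combination is not tested in this article.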

As mentioned above, a Python extension needs a "standardized" interface in the shared library. So we wrap the functions and provide an entry point that Python calls when it loads the module.

/* wrap.c */
#include <Python.h>
#include <stdlib.h>
#include "colin.h"
PyObject* wrap_func2(PyObject* self, PyObject* args)
{
        int n, result;
        /* Extract one integer from the argument tuple, using format "i" */
        if (!PyArg_ParseTuple(args, "i", &n))
                return NULL;

        /* Compute using the C library implementation */
        result = func2(n);
        /* The result must be converted into a type that Python recognizes */
        return Py_BuildValue("i", result);
}

PyObject* wrap_func3(PyObject* self, PyObject* args)
{
        int result;
        int i, j;
        int size, size2;
        PyObject *p,*q;
        y_t *y;

        y = malloc(sizeof(y_t));
        /* Number of arguments, that is, the number of lists */
        size = PyTuple_Size(args);
        /* Allocate the array of x_t descriptors first */
        y->len = size;
        y->ax = malloc(sizeof(x_t)*size);
        /* Walk through the lists (arguments) passed from Python */
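        for(i=0;i<size;i++) {
                /* Get the i-th argument, which must be a list */
                p = PyTuple_GetItem(args, i);
                if (!PyList_Check(p)) {
                        PyErr_SetString(PyExc_TypeError, "func3 expects lists of integers");
                        y->len = i;     /* only the first i arrays were allocated */
                        free_y_t(y);
                        free(y);
                        return NULL;
                }
                /* Get the length of the list */
                size2 = PyList_Size(p);
                /* Allocate space for the array */
                y->ax[i].len = size2;
                y->ax[i].a = malloc(sizeof(int)*size2);
                /* Traverse the list and copy each number into the array in turn */
                for(j=0;j<size2;j++) {
                        q = PyList_GetItem(p, j);
                        if (!PyArg_Parse(q,"i",&y->ax[i].a[j])) {
                                y->len = i + 1; /* arrays 0..i were allocated */
                                free_y_t(y);
                                free(y);
                                return NULL;
                        }
                }
        }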

        /* Compute using the C library implementation */
        result = func3(y);
        free_y_t(y);
        free(y);
        /* Convert the result into a type that Python recognizes */
        return Py_BuildValue("i", result);
}

/* This is the method table. Only its address is recorded when the module is loaded, so it must not live on the stack (as a local variable), where it would be destroyed when the function returns. */
static PyMethodDef colinMethods[] =
{
        {"func2", wrap_func2, METH_VARARGS, "Just a test"},
        {"func3", wrap_func3, METH_VARARGS, "Just a test"},
        {NULL, NULL, METH_NOARGS, NULL}
};

/* Entry point called when Python loads the module */
/* Note: because the module is named colin, this function must be named initcolin */
void initcolin()
{
        PyObject *m;
        m = Py_InitModule("colin", colinMethods);
}

Along the way I guessed that PyArg_VaParse would be more powerful, but my repeated attempts with it failed, and I did not check the documentation.

Testing

$ gcc -I /usr/include/python2.7/ -fPIC -shared colin.c wrap.c -o colin.so
$ ./test.py
f =  9801
g =  729000
h =  729000

As you can see, the function written in C and the function written in Python give the same result.
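The two agree only while every intermediate value fits in a C int, because Python integers have arbitrary precision and a C int does not. The sketch below illustrates the difference by squeezing an exact Python result into ctypes.c_int, which performs no overflow checking. It only imitates typical wraparound: signed overflow in C is undefined behavior, so the real func3 is not guaranteed to produce this value. The names exact and wrapped are only for illustration.

```python
import ctypes

def func(*a):
    res = 1
    for values in a:
        res *= sum(values)
    return res

# Exact result in Python: 100000 * 100000 = 10**10.
exact = func([100000], [100000])

# The same value truncated to the width of a C int.
wrapped = ctypes.c_int(exact).value

print(exact)
print(wrapped)
```

On a platform where int is 32 bits wide, the second line prints 1410065408 instead of 10000000000.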

Posted by Xager on Tue, 21 May 2019 12:06:04 -0700