How to Handle C++ Return Type std::vector<int> in Python ctypes

How to handle C++ return type std::vector<int> in Python ctypes?

The particular reason is that speed is important. I'm creating an
application that should be able to handle big data. On 200,000 rows,
the missing values have to be counted across 300 columns (a 200k by
300 matrix). I believe, but correct me if I'm wrong, that C++ will be
significantly faster.

Well, if you're reading from a large file, your process is going to be mostly IO-bound, so the timings between Python and C probably won't be significantly different.

The following code...

result = []
for line in open('test.txt'):
    result.append(line.count('NA'))

...seems to run just as fast as anything I can hack together in C, although it's using some optimized algorithm I'm not really familiar with.

It takes less than a second to process 200,000 lines, although I'd be interested to see if you can write a C function which is significantly faster.
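
For reference, the counting loop above can be wrapped in a small function and tried on in-memory data. This is only a minimal sketch: count_na_per_line and the sample rows are made up for illustration. Note that str.count counts non-overlapping substring matches, so a field such as 'NAME' would be counted too.

```python
import io

def count_na_per_line(lines):
    """Return how many times the substring 'NA' occurs on each line."""
    return [line.count('NA') for line in lines]

# Hypothetical three-row sample standing in for test.txt.
sample = io.StringIO("1,NA,3\nNA,NA,NA\n4,5,6\n")
counts = count_na_per_line(sample)
print(counts)  # [1, 3, 0]
```

If exact field matching matters, splitting each line on the delimiter and comparing whole fields would be the safer variant, at some cost in speed.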


Update

If you want to do it in C, and end up with a Python list, it's probably more efficient to use the Python/C API to build the list yourself, rather than building a std::vector then converting to a Python list later on.

An example which just returns a list of integers from 0 to 99...

// hack.c

#include <python2.7/Python.h>

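PyObject* foo(const char* filename)
{
    PyObject* result = PyList_New(0);
    int i;

    for (i = 0; i < 100; ++i)
    {
        /* PyList_Append takes its own reference to the item, so the
           reference created by PyInt_FromLong must be released here. */
        PyObject* item = PyInt_FromLong(i);
        PyList_Append(result, item);
        Py_DECREF(item);
    }

    return result;
}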

Compiled with...

$ gcc -c hack.c -fPIC
$ ld -o hack.so -shared hack.o -lpython2.7

Example of usage...

>>> from ctypes import *
>>> dll = PyDLL('./hack.so')  # PyDLL keeps the GIL held, which Python C API calls need
>>> dll.foo.restype = py_object
>>> dll.foo('foo')
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...]
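
The py_object return type can be tried without compiling anything, because ctypes.pythonapi exposes the interpreter's own C API and is a PyDLL instance (the GIL stays held during calls). The sketch below is not the hack.so code; it just calls PyList_New and PyList_Append from Python 3 to show that ctypes hands the returned PyObject* back as an ordinary list.

```python
import ctypes

# ctypes.pythonapi is a PyDLL, so the GIL is not released around calls.
api = ctypes.pythonapi
api.PyList_New.argtypes = [ctypes.c_ssize_t]
api.PyList_New.restype = ctypes.py_object
api.PyList_Append.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyList_Append.restype = ctypes.c_int

result = api.PyList_New(0)
for i in range(5):
    status = api.PyList_Append(result, i)  # returns 0 on success
print(result)  # [0, 1, 2, 3, 4]
```

A library that calls the Python C API internally would normally be loaded with PyDLL for the same reason.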

ctypes return array of strings std::vector<std::string>

The function is returning a char**, and you've told Python that it is returning a char*[3] (an array of 3 char* pointers, not a pointer itself), so the returned value isn't being interpreted properly by ctypes.

Change the return type to ctypes.POINTER(ctypes.c_char_p). Alternatively, change your program to return something that has the same size as char*[3], such as std::array<char*, 3> or struct otp_data { char *one, *two, *three; }; (which saves one malloc, since you can return it by value).
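
The difference between the two types can be seen in pure ctypes. In this sketch a ctypes array stands in for the memory a C function would return as char**; the three strings are made up for illustration.

```python
import ctypes

# Stand-in for the three strings a C function would hand back as char**.
strings = (ctypes.c_char_p * 3)(b"one", b"two", b"three")

# A char** is a single pointer; char*[3] is three pointers stored inline.
pointer_type = ctypes.POINTER(ctypes.c_char_p)
array_type = ctypes.c_char_p * 3
print(ctypes.sizeof(pointer_type) == ctypes.sizeof(ctypes.c_void_p))    # True
print(ctypes.sizeof(array_type) == 3 * ctypes.sizeof(ctypes.c_char_p))  # True

# Reading through the pointer, as Python would with the corrected restype.
ptr = ctypes.cast(strings, pointer_type)
values = [ptr[i] for i in range(3)]
print(values)  # [b'one', b'two', b'three']
```

Because a pointer carries no length, the count (3 here) has to be known from elsewhere, for example a second return value or a documented convention.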

Passing std::vector from C++ to Python via ctypes: getting nonsensical values

As mentioned in comments, your vector is a local variable and destroyed after return from the function. One way that works is to let Python manage the memory and copy the data into it.

test.cpp

#include <vector>
#include <cstring>

#define API __declspec(dllexport) // Windows-specific export

// Must pass double[4] array...
extern "C" API void returnQ(double* data) {
    std::vector<double> v = {7.5, 5.5, 16.5, 8.5};
    // Of course, you could write directly to "data" without the vector...
    std::memcpy(data, v.data(), v.size() * sizeof v[0]);
}

Usage:

>>> from ctypes import *
>>> dll = CDLL('test')
>>> dll.returnQ.argtypes = POINTER(c_double),
>>> dll.returnQ.restype = None
>>> data = (c_double * 4)() # equivalent to C++ double[4]
>>> dll.returnQ(data)
>>> list(data)
[7.5, 5.5, 16.5, 8.5]
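
The caller-allocates pattern is safer when the buffer length travels with the pointer. The sketch below needs no DLL: a ctypes callback named fill (hypothetical, written in Python) stands in for a C function with the assumed signature void fill(double* data, size_t capacity), and never writes past the capacity it is given.

```python
import ctypes

# Assumed C signature: void fill(double* data, size_t capacity)
FillFunc = ctypes.CFUNCTYPE(None, ctypes.POINTER(ctypes.c_double), ctypes.c_size_t)

@FillFunc
def fill(data, capacity):
    # Stand-in for the C++ side: copy at most `capacity` values out.
    source = [7.5, 5.5, 16.5, 8.5]
    for i, value in enumerate(source[:capacity]):
        data[i] = value

buf = (ctypes.c_double * 4)()   # Python owns this memory
fill(buf, len(buf))
print(list(buf))  # [7.5, 5.5, 16.5, 8.5]
```

With a real library the call site looks the same; only the function object comes from the loaded DLL instead of a callback.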

calling c from python with ctypes: passing vectors

Based on @Sven Marnach's answer:

#!/usr/bin/env python
import ctypes
import numpy as np
from numpy.ctypeslib import ndpointer

libf = ctypes.cdll.LoadLibrary('/path/to/lib.so')
libf.f.restype = ctypes.c_double
libf.f.argtypes = [ctypes.c_int, ndpointer(ctypes.c_double)]

def f(a):
    return libf.f(a.size, np.ascontiguousarray(a, np.float64))

if __name__=="__main__":
    # slice to create non-contiguous array
    a = np.arange(1, 7, dtype=np.float64)[::2]
    assert not a.flags['C_CONTIGUOUS']
    print(a)
    print(np.multiply.reduce(a))
    print(f(a))

Output

[ 1.  3.  5.]
15.0
15.0

Removing np.ascontiguousarray() call produces the wrong result (6.0 on my machine).
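
A likely explanation for the 6.0, sketched in pure ctypes: the sliced view still points at the start of the original six-element buffer, and C reads consecutive doubles from that address without knowing about strides. This assumes that f multiplies its inputs (as the matching 15.0 suggests) and that the view's data pointer is passed through without a copy.

```python
import ctypes
from functools import reduce
from operator import mul

# The six doubles that back np.arange(1, 7, dtype=np.float64).
memory = (ctypes.c_double * 6)(1, 2, 3, 4, 5, 6)
size = 3  # a.size for the [::2] view

# What the view means: every second element.
intended = [memory[i] for i in range(0, 6, 2)]
# What C reads from the raw pointer: `size` consecutive doubles.
seen_by_c = [memory[i] for i in range(size)]

print(reduce(mul, intended))   # 15.0
print(reduce(mul, seen_by_c))  # 6.0
```

np.ascontiguousarray copies the strided view into a fresh, densely packed buffer, so both readings agree.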

passing vector from a c++ dll in python ctypes

in fact due the extern "C" I believe that this vector turns into a pointer

This is wrong. There is no mechanism that makes this possible by default.
Your BOLHA() function needs to receive a double* and then convert it to vector<double>, or just use the raw pointer.

If you want this to work with the vector<double> signature, you need something to do the work of converting the pointer to a vector. boost::python can do that, but that would require the DLL you're working with to be a Python module and not just any DLL.

If you have the function:

extern "C" myDLL_API double BOLHA(vector<double> OI);

you'll need to declare a new function:

extern "C" myDLL_API double BOLHA_raw(double* ptr, int size) {
    vector<double> v(ptr, ptr+size);
    return BOLHA(v);
}
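
On the Python side, the arguments for such a pointer-plus-size function can be prepared as follows. This is a sketch: as_double_array is a hypothetical helper, and the commented-out lines assume a loaded library object named mydll, which is not shown in the original.

```python
import ctypes

def as_double_array(values):
    """Copy a Python sequence into a C double[] and return (array, length)."""
    arr = (ctypes.c_double * len(values))(*values)
    return arr, len(arr)

arr, n = as_double_array([1.0, 2.5, 4.0])
print(list(arr), n)  # [1.0, 2.5, 4.0] 3

# With the DLL loaded as `mydll` (assumed name), the call would look like:
# mydll.BOLHA_raw.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_int]
# mydll.BOLHA_raw.restype = ctypes.c_double
# result = mydll.BOLHA_raw(arr, n)
```

Setting restype to c_double matters here, since ctypes otherwise assumes an int return value.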

How to return array from C++ function to Python using ctypes

function.cpp returns an int array, while wrapper.py tries to interpret them as doubles. Change ArrayType to ctypes.c_int * 10 and it should work.


It's probably easier to just use np.ctypeslib instead of frombuffer yourself. This should look something like

import ctypes
from numpy.ctypeslib import ndpointer

lib = ctypes.CDLL('./library.so')
lib.function.restype = ndpointer(dtype=ctypes.c_int, shape=(10,))

res = lib.function()
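
The effect of the type mismatch can be reproduced without the library. In this sketch a ctypes int array stands in for the memory returned by the C++ function, and the same bytes are then read once as ints and once as doubles.

```python
import ctypes

# Stand-in for the ten ints the C++ function returns.
ints = (ctypes.c_int * 10)(*range(10))

# Correct interpretation: same element type as the C++ side.
as_ints = list((ctypes.c_int * 10).from_buffer(ints))

# Wrong interpretation: the same bytes viewed as doubles.
n_doubles = ctypes.sizeof(ints) // ctypes.sizeof(ctypes.c_double)
as_doubles = list((ctypes.c_double * n_doubles).from_buffer(ints))

print(as_ints)     # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(as_doubles)  # meaningless values, and fewer of them
```

Reading ten doubles from a ten-int buffer would additionally run past the end of the allocation, which is why the sketch only views the doubles that fit.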

ctypes - get output from c++ function into python object

@Paul Mu Guire's suggestion above might have helped, but what I needed was something really simple that took a string and returned a string, so the whole object-oriented paradigm was overkill. I changed to a plain C function:

extern "C" {
    char* linked(char * in){
        return in;
    }
}

and it worked quite well after doing the lib.linked.restype = c_char_p.
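
The effect of restype = c_char_p can be tried on a function that ships with the interpreter. In this sketch Py_GetVersion (a real C API function returning const char*) stands in for linked; it is reached through ctypes.pythonapi, so no extra library is needed.

```python
import ctypes

get_version = ctypes.pythonapi.Py_GetVersion
get_version.argtypes = []
# Without this line ctypes assumes an int return and the pointer is mangled.
get_version.restype = ctypes.c_char_p

version = get_version()   # bytes, copied from the NUL-terminated C string
print(version.decode())
```

For a function that takes a string as well, argtypes = [c_char_p] and a bytes argument (not str, in Python 3) complete the picture.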

Import C++ function in Python through ctypes: why segmentation fault?

As mentioned in the chat we had, the code posted in the question segfaults because the following code returns a zero-length vector when first called:

std::vector<int> to_xy(int k, int nside) {
    int x = k%nside;
    int y = floor(k / nside);
    vector<int> res(x, y); // if k==0, this is length 0
    return res;
}

then this code faults here:

    for (int h = 0; h < n_matrix; h++){
        std::vector<int> xy_vec = to_xy(h, nside); // length 0
        int modneg1 = modNegOperator(xy_vec[0] + 1,nside); // xy_vec[0] faults.

The code we discussed in chat didn't fail because:

    vector<int> res(x, y);    // if k==0, this is length 0

had been changed to:

    vector<int> res{x, y};    // curly braces, length 2

After resolving that, the Python code just needed .restype defined, and technically c_uint for the first parameter:

handle.create2Darray.argtypes = ctypes.c_uint, ctypes.c_double, ctypes.c_double
handle.create2Darray.restype = ctypes.POINTER(ctypes.POINTER(ctypes.c_double))

Then the code would return the double** correctly.
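
Once the double** arrives, it is indexed like a list of rows. The sketch below builds an equivalent structure in pure ctypes so it runs without the library; the 2 by 3 shape and the values are made up, and with a real call the dimensions must be known from the arguments, since the pointer carries no size.

```python
import ctypes

rows, cols = 2, 3
RowPtr = ctypes.POINTER(ctypes.c_double)

# Stand-in for what the C++ side allocates: one double[] per row...
row_storage = [
    (ctypes.c_double * cols)(*(r * cols + c for c in range(cols)))
    for r in range(rows)
]
# ...plus an array of pointers to those rows.
table = (RowPtr * rows)(*(ctypes.cast(row, RowPtr) for row in row_storage))

# This is the type the restype above produces.
pp = ctypes.cast(table, ctypes.POINTER(RowPtr))
matrix = [[pp[r][c] for c in range(cols)] for r in range(rows)]
print(matrix)  # [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
```

Memory allocated on the C++ side still has to be released there, typically through a matching free function exported by the library.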

DLL conversion with ctypes on byref vector argument in python

POINTER(…) constructs a new pointer type, not a value of that type. So, when you do this:

 mydll.API_GetOrder(POINTER(ApiOrder()))

… you’re passing a Python type object, not a ctypes wrapper around a C pointer object.


To get a pointer to a ctypes wrapper object, you want to call either pointer or byref. The former constructs a POINTER(…) instance, sets it to point to your object, and passes the wrapped pointer; the latter just directly passes a pointer to your object without constructing a pointer wrapper object, and usually that’s all you need. See Passing pointers in the docs for further details.
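
The three spellings can be compared side by side. In this sketch the ApiOrder fields are hypothetical, since the real struct layout is not shown here.

```python
import ctypes

class ApiOrder(ctypes.Structure):
    # Hypothetical field; the real struct comes from the library's header.
    _fields_ = [("id", ctypes.c_int)]

order = ApiOrder()

ptr_type = ctypes.POINTER(ApiOrder)   # a type, not a value
p = ctypes.pointer(order)             # a pointer instance aimed at `order`
ref = ctypes.byref(order)             # lightweight, only usable as an argument

print(isinstance(ptr_type, type))     # True
print(isinstance(p, ptr_type))        # True

# Writing through the pointer changes the original object.
p.contents.id = 42
print(order.id)  # 42
```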


However, I don’t think this is going to do much good, for two reasons.


First, most functions that take a pointer to some struct and return an int are doing it so they can fill in that struct with useful values. Constructing a new empty struct and passing a pointer to it and not holding onto a reference to it means you have no way to look at whatever values got filled in.

Also, you probably want to check the return value.

In general, you need to do something like this:

order = ApiOrder()
ret = mydll.API_GetOrder(byref(order))
if ret:
    pass  # do some error handling with either ret or errno
else:
    pass  # do something with order

While we’re at it, you almost certainly want to set the argtypes and restype of the function, so ctypes knows how to convert things properly, and can give you an exception if you do something that makes no sense, instead of making it guess how to convert and pass things and segfault if it guesses wrong.

Also, for the case of functions that return a success-or-error int, it's usually better to assign a function to the restype, which looks up the error and raises an appropriate exception. (Use an errcheck if you need anything more flexible than just checking that an int return is nonzero or a pointer return is zero.)
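
An errcheck hook might look like the sketch below. To keep it runnable without the DLL, a ctypes callback written in Python stands in for API_GetOrder; the ApiOrder field, the fake functions, and the choice of OSError are all assumptions made for illustration.

```python
import ctypes

class ApiOrder(ctypes.Structure):
    _fields_ = [("id", ctypes.c_int)]  # hypothetical field

# Assumed C signature: int get_order(ApiOrder* out)
GetOrder = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.POINTER(ApiOrder))

def check_status(result, func, args):
    """errcheck hook: turn a nonzero status code into an exception."""
    if result != 0:
        raise OSError(f"call failed with status {result}")
    return args

def _fake_get_order(order_ptr):
    order_ptr.contents.id = 7   # fill in the caller's struct
    return 0                    # 0 means success

def _fake_failing(order_ptr):
    return 1                    # nonzero means failure

api_get_order = GetOrder(_fake_get_order)
api_get_order.errcheck = check_status
api_failing = GetOrder(_fake_failing)
api_failing.errcheck = check_status

order = ApiOrder()
api_get_order(ctypes.byref(order))
print(order.id)  # 7
```

With this in place, call sites no longer need their own if ret: branches; a failing call raises instead.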


But even this isn’t going to help here, because the function you’re trying to call doesn’t take a pointer to an ApiOrder in the first place, it takes a reference to a std::vector of them. So you need to call into the C++ stdlib to construct an object of that type, then you can byref that as the argument.

But usually, it’s easier to write some C++ code that provides a C API to the library, then use ctypes to call that C API, instead of trying to build and use C++ objects from Python.

Your C++ code would look something like this:

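#include <algorithm>
#include <vector>

extern "C" int call_getorder(p_API_GetOrder func, ApiOrder *apiOrderArray, size_t apiOrderCount) {
    // Copy the caller's array into a vector (iterator-range constructor).
    std::vector<ApiOrder> vec(apiOrderArray, apiOrderArray + apiOrderCount);
    int ret = func(vec);
    if (ret) return ret;
    // Copy the results back; this assumes func did not grow the vector.
    std::copy(std::begin(vec), std::end(vec), apiOrderArray);
    return 0;
}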

Now, you can call this from Python by creating an array of 1 ApiOrder (or creating a POINTER to an ApiOrder and passing it directly, if you prefer):

orders = (ApiOrder*1)()
ret = mywrapperdll.call_getorder(mydll.API_GetOrder, orders, 1)
if ret:
    pass  # do some error handling with either ret or errno
else:
    pass  # do something with orders[0]

Of course you're still going to want argtypes and restype.


