Interfacing C++11 Array with Cython

As discussed in the comments, the issue you're having is because Cython doesn't really support non-type template arguments. A workaround (hacky and probably fragile) is to trick Cython into thinking it's providing a type template argument:

cdef extern from "<array>" namespace "std" nogil :
    cdef cppclass two "2":
        pass

    cdef cppclass array[T, U]:
        T& operator[](size_t)
        # this is obviously very very cut down

def f():
    cdef array[int,two] x
    return x[0]+x[1]

The trick is that when you write cdef cppclass name "somestring", Cython blindly emits somestring wherever name is used, so array[int,two] becomes std::array<int,2> in the generated C++. There are obviously some limitations with this approach, but for simple usage it should be fine.
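As a rough illustration, the C++ that this declaration boils down to looks like the sketch below. The real Cython output carries far more boilerplate, and the function name sum_first_two is made up for this example. Note that f() above reads elements it never wrote; the sketch assigns them first so the result is well defined.

```cpp
#include <array>
#include <cassert>

// Roughly what `cdef array[int,two] x` turns into: the cname "2" is pasted
// in place of `two`, giving an ordinary non-type template argument.
int sum_first_two() {
    std::array<int, 2> x{};   // value-initialised (an addition in this sketch)
    assert(x.size() == 2);    // the size is part of the type
    x[0] = 3;
    x[1] = 4;
    return x[0] + x[1];
}
```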

Wrapping std::array in Cython and Exposing it to memory views

After much fiddling, I found the answer to my question.

Definition of array and class that uses array:

cdef extern from "<array>" namespace "std" nogil:
  cdef cppclass array4 "std::array<int, 4>":
    array4() except+
    int& operator[](size_t)

cdef extern from "Rectangle.h" namespace "shapes":
  cdef cppclass ArrayFun:
    ArrayFun(array4&)
    array4 getArray()

Python implementation

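cdef class PyArrayFun:
    cdef ArrayFun *thisptr      # hold a C++ instance which we're wrapping
    def __cinit__(self, int[::1] mem):
      # int[::1] guarantees contiguous storage; the cast below reads 4 ints
      if mem.shape[0] < 4:
        raise ValueError("expected at least 4 elements")
      #
      # Conversion from memoryview to std::array<int,4>
      #
      cdef array4 *arr = <array4 *>(&mem[0])
      self.thisptr = new ArrayFun(arr[0])

    def __dealloc__(self):
      del self.thisptr          # release the wrapped C++ instance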

    def getArray(self):
      cdef array4 arr = self.thisptr.getArray()
      #
      # Conversion from std::array<int, 4> to memoryview
      #
      cdef int[::1] mview = <int[:4]>(&arr[0])
      cdef int[::1] new_view = mview.copy()
      for i in range(0,4):
        print ("arr is ", arr[i])
        print("new_view is ", new_view[i])

      # A COPY MUST be returned because arr goes out of scope and is
      # destroyed when this function exits, which would leave mview
      # pointing at dead memory. new_view is already such a copy, so the
      # data is copied twice in total: once out of the C++ object into
      # arr, and once out of arr into the returned memoryview.
      return new_view
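The pointer cast in __cinit__ relies on std::array<int, 4> being laid out exactly like int[4]. That holds on common implementations, but as far as I know the standard does not promise it, so it may be worth checking at compile time on the C++ side. A minimal sketch, where array4_from_buffer is a hypothetical helper that copies instead of reinterpreting the buffer:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

using array4 = std::array<int, 4>;

// Compile-time checks for the layout assumptions behind the cast.
static_assert(std::is_trivially_copyable<array4>::value,
              "array4 must be trivially copyable");
static_assert(sizeof(array4) == 4 * sizeof(int),
              "array4 must have no padding");

// Hypothetical helper: build an array4 from a raw buffer by copying,
// which avoids treating the buffer as an object of a different type.
array4 array4_from_buffer(const int* buf, std::size_t len) {
    assert(buf != nullptr && len >= 4);
    array4 out;
    std::memcpy(out.data(), buf, sizeof(out));
    return out;
}
```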

Cython to interface between python and c-library with unknown size char array

You have two basic options:

1. Work out how to calculate the size in advance.

 size = calculateSize(...)  # for example, by pre-reading the file
 line_output = <char*>malloc(size)
 return_val = readWrapper(options, line_output)

2. Have readWrapper be responsible for allocating memory. There are two commonly used patterns in C:

a. return a pointer (perhaps using NULL to indicate an error):

char* readWrapper(options opt)

b. Pass a pointer-to-a-pointer and change it

// C 
int readWrapper(options opt, char** str_out) {
    // work out the length
    *str_out = malloc(length);
    // etc
}

# Cython
cdef char* line_out
return_value = readWrapper(options, &line_out)
# line_out now points at memory allocated in C; free() it when done
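Pattern (b) can be sketched more completely as follows. This is written in C-style C++ and is only an illustration: read_wrapper, its source parameter, and the 0 / -1 return convention are assumptions, not the real readWrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for readWrapper: it allocates the output itself and
// hands it back through str_out. Returns 0 on success, -1 on failure.
int read_wrapper(const char* source, char** str_out) {
    assert(str_out != nullptr);
    *str_out = nullptr;                 // defined value on every failure path
    if (source == nullptr) {
        return -1;
    }
    const std::size_t length = std::strlen(source) + 1;  // include terminator
    char* buffer = static_cast<char*>(std::malloc(length));
    if (buffer == nullptr) {
        return -1;
    }
    std::memcpy(buffer, source, length);
    *str_out = buffer;  // ownership passes to the caller, who must free() it
    return 0;
}
```

The caller checks the return value before touching the string and calls free() on it exactly once.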

You need to ensure that all the strings you allocate are cleaned up. You still have a memory leak for options.filename. For options.filename you're probably better off just getting a pointer to the contents of file through Cython. This is valid as long as file exists, so no allocation is needed on your part:

options.filename = file

Just make sure that options doesn't outlive file (i.e. it doesn't get stored for later use anywhere in C).

In general

something = malloc(...)
try:
    # code
finally:
    free(something)

is a good pattern for ensuring clean-up.

Handling C++ arrays in Cython (with numpy and pytorch)

I can think of three sensible ways of doing it. I'll outline them below (i.e. none of the code will be complete but hopefully it will be clear how to finish it).

1. C++ owns the memory; Cython/Python holds a shared pointer to the C++ class

(This looks to be the lines you're already thinking along.)

Start by creating a Cython class that holds a shared pointer

from libcpp.memory cimport shared_ptr

cdef class Holder:
    cdef shared_ptr[cpp_class] ptr

    @staticmethod
    cdef make_holder(shared_ptr[cpp_class] ptr):
       cdef Holder holder = Holder() # empty instance; typed so that .ptr is accessible
       holder.ptr = ptr
       return holder

You then need to define the buffer protocol for Holder. This allows direct access to the memory allocated by cpp_class in a way that both numpy arrays and Cython memoryviews can understand. Thus they hold a reference to a Holder instance, which in turn keeps a cpp_class alive. (Use np.asarray(holder_instance) to create a numpy array that uses the instance's memory)

The buffer protocol is a little involved but Cython has fairly extensive documentation and you should largely be able to copy and paste their examples. The two methods you need to add to Holder are __getbuffer__ and __releasebuffer__.

2. Python owns the memory; Your C++ class holds a pointer to the Python object

In this version you allocate the memory as a numpy array (using the Python C API interface). When your C++ class is destructed it decrements the reference count of the array; however, if Python holds references to that array then the array can outlive the C++ class.

#include <Python.h>  // Python.h should be included first
#include <numpy/arrayobject.h>

class cpp_class {
   private:
     PyObject* arr;
     double* data;
   public:
     cpp_class() {
       // import_array() must have been called once before this runs
       arr = PyArray_SimpleNew(...); // details left to be filled in
       data = static_cast<double*>(PyArray_DATA(reinterpret_cast<PyArrayObject*>(arr)));
       // fill in the data
     }

     ~cpp_class() {
         Py_DECREF(arr); // release our reference to it
     }

     PyObject* get_np_array() {
         Py_INCREF(arr); // Cython expects this to be done before it receives a PyObject
         return arr;
     }
};

See the numpy documentation for details of how to allocate numpy arrays from C/C++. Be careful of reference counting if you define copy/move constructors.

The Cython wrapper then looks like:

cdef extern from "some_header.hpp":
    cdef cppclass cpp_class:
       # whatever constructors you want to allow
       object get_np_array()

3. C++ transfers ownership of the data to Python/Cython

In this scheme C++ allocates the array, but Cython/Python is responsible for deallocating it. Once ownership is transferred C++ no longer has access to the data.

class cpp_class {
   public:
     double* data; // for simplicity this is public - you may want to use accessors
     cpp_class() :
     data(new double[50])
     {/* fill the array as needed */}

     ~cpp_class() {
       delete [] data;
     }
};

// helper function for Cython
inline void del_cpp_array(double* a) {
   delete [] a;
}

You then use the cython.view.array class to capture the allocated memory. This has a callback function which is used on destruction:

from cython cimport view

cdef extern from "some_header.hpp":
   cdef cppclass cpp_class:
      double* data
      # whatever constructors and other functions
   void del_cpp_array(double*)

# later
cdef cpp_class cpp_instance # create this however you like
# ...
# modify line below to match your data
cdef view.array arr = view.array(shape=(10, 2), itemsize=sizeof(double), format="d",
                                 mode="C", allocate_buffer=False)
arr.data = <char*>cpp_instance.data
cpp_instance.data = NULL # reset to NULL pointer so C++ does not delete it
arr.callback_free_data = del_cpp_array

arr can then be used with a memoryview or a numpy array.

You may have to mess about a bit with casting from void* or char* with del_cpp_array - I'm not sure exactly what types the Cython interface requires.
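If you would rather not expose data as a public member, the hand-over for the third option can live in an accessor. The sketch below is one way to do it under that assumption. The names release_data and demo_transfer are invented for illustration, and copying is disabled because a copied instance would delete the same buffer twice.

```cpp
#include <cassert>
#include <cstddef>

class cpp_class {
   public:
     cpp_class() : data_(new double[50]()) {}  // zero-filled for the sketch
     ~cpp_class() { delete [] data_; }          // deleting nullptr is a no-op

     // Copying would lead to a double delete, so forbid it.
     cpp_class(const cpp_class&) = delete;
     cpp_class& operator=(const cpp_class&) = delete;

     // Hypothetical accessor: hand the buffer to the caller and forget it.
     double* release_data() {
         double* released = data_;
         data_ = nullptr;
         return released;
     }

   private:
     double* data_;
};

// helper function for Cython, as above
inline void del_cpp_array(double* a) {
   delete [] a;
}

// Usage sketch: after the release the instance owns nothing.
inline void demo_transfer() {
    cpp_class instance;
    double* buffer = instance.release_data();
    assert(buffer != nullptr);
    assert(instance.release_data() == nullptr);
    del_cpp_array(buffer);
}
```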

The first option is probably the most work to implement but requires few changes to the C++ code. The second option may require changes to your C++ code that you don't want to make. The third option is simple but means that C++ no longer has access to the data, which might be a disadvantage.

Canonical way to convert an array of strings in C to a Python list using Cython

To complete the answer of @alexis: in terms of performance, using append is relatively slow (because the list grows a resizable array internally), and it can be replaced by direct indexing. The idea is to walk the array twice, the first time only to count the strings. While two walks seem expensive, this should not be the case, since the counting loop is very simple and the compiler should optimize it well. If the code is compiled with the highest optimization level (-O3), the first loop may even use fast SIMD instructions. Once the length is known, the list can be allocated and filled much faster. String decoding is likely to take a significant part of the time. UTF-8 decoding is used by default. This is a bit expensive, and using ASCII decoding instead should be a bit faster, assuming the strings are known to contain only ASCII characters.

Here is an example of untested code:

from cython.operator cimport dereference

def results_from_c():
    cdef char** cstringsptr = my_c_function()
    cdef char* string   # typed, so that it can be compared against NULL
    cdef int length = 0
    cdef int i

    # First walk: count the strings up to the terminating NULL
    string = dereference(cstringsptr)
    while string != NULL:
        cstringsptr += 1
        length += 1
        string = dereference(cstringsptr)

    cstringsptr -= length

    # Preallocate a list of the right length, filled with None placeholders
    strings = [None] * length

    # Second walk: decode each string straight into its slot
    for i in range(length):
        string = dereference(cstringsptr + i)
        strings[i] = string.decode("utf-8")

    return strings

This makes the code more complex though.
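The same count-then-fill idea can be shown on the C++ side, with a std::vector standing in for the Python list. This is only a sketch of the technique; count_strings and collect_strings are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// First walk: count the entries of a NULL-terminated array of C strings.
std::size_t count_strings(const char* const* strings) {
    assert(strings != nullptr);
    std::size_t length = 0;
    while (strings[length] != nullptr) {
        ++length;
    }
    return length;
}

// Second walk: allocate once, then fill by index instead of appending.
std::vector<std::string> collect_strings(const char* const* strings) {
    const std::size_t length = count_strings(strings);
    std::vector<std::string> result(length);
    for (std::size_t i = 0; i < length; ++i) {
        result[i] = strings[i];
    }
    return result;
}
```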

Cython interfaced with C++: segmentation fault for large arrays

The memory is being managed by your numpy arrays. As soon as they go out of scope (most likely at the end of the PySparse constructor) the arrays cease to exist, and all your pointers are invalid. This applies to both large and small arrays, but presumably you just get lucky with small arrays.

You need to hold a reference to all the numpy arrays you use for the lifetime of your PySparse object:

cdef class PySparse:

  # ----------------------------------------------------------------------------

  cdef Sparse *ptr
  cdef object _held_reference # added

  # ----------------------------------------------------------------------------

  def __cinit__(self,**kwargs):
      # ....
      # your constructor code goes here, unchanged...
      # ....

      self._held_reference = [data] # add any other numpy arrays you use to this list

As a rule you need to be thinking quite hard about who owns what whenever you're dealing with C/C++ pointers, which is a big change from the normal Python approach. Getting a pointer from a numpy array does not copy the data and it does not give numpy any indication that you're still using the data.

Edit note: In my original version I tried to use locals() as a quick way of gathering a collection of all the arrays I wanted to keep. Unfortunately, that doesn't seem to include the cdef-ed arrays, so it didn't manage to keep the ones you were actually using. (Note here that astype() makes a copy unless you tell it otherwise, so you need to hold the reference to the copy, rather than to the original passed in as an argument.)

Passing and returning numpy arrays to C++ methods via Cython

You've basically got it right. First, optimization hopefully shouldn't be a big deal. Ideally, most of the time is spent inside your C++ kernel, not in the Cython wrapper code.

There are a few stylistic changes you can make that will simplify your code. (1) Reshaping between 1D and 2D arrays is not necessary. When you know the memory layout of your data (C-order vs. fortran order, striding, etc), you can see the array as just a chunk of memory that you're going to index yourself in C++, so numpy's ndim doesn't matter on the C++ side -- it's just seeing that pointer. (2) Using cython's address-of operator &, you can get the pointer to the start of the array in a little cleaner way -- no explicit cast necessary -- using &X[0,0].

So this is my edited version of your original snippet:

cimport numpy as np
import numpy as np

cdef extern from "myclass.h":
    cdef cppclass MyClass:
        MyClass() except +
        void run(double* X, int N, int D, double* Y)

def run(np.ndarray[np.double_t, ndim=2] X):
    X = np.ascontiguousarray(X)
    cdef np.ndarray[np.double_t, ndim=2, mode="c"] Y = np.zeros_like(X)

    cdef MyClass myclass
    myclass = MyClass()
    myclass.run(&X[0,0], X.shape[0], X.shape[1], &Y[0,0])

    return Y
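On the C++ side, run then only sees two pointers and the two dimensions, and does the indexing itself. A minimal sketch of what such a kernel might look like, assuming C-ordered (row-major) data: the doubling is a placeholder computation, and the const on X is an addition of this sketch that is not in the declaration above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical body for MyClass::run: X and Y are N x D and C-ordered,
// so element (i, j) lives at offset i * D + j.
class MyClass {
   public:
     void run(const double* X, int N, int D, double* Y) {
         assert(X != nullptr && Y != nullptr && N >= 0 && D >= 0);
         for (int i = 0; i < N; ++i) {
             for (int j = 0; j < D; ++j) {
                 const std::size_t offset =
                     static_cast<std::size_t>(i) * D + j;
                 Y[offset] = 2.0 * X[offset];  // placeholder computation
             }
         }
     }
};
```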
