Force NumPy ndarray to take ownership of its memory in Cython

You just have some minor errors in the interface definition. The following worked for me:

from libc.stdlib cimport malloc
import numpy as np
cimport numpy as np

np.import_array()

ctypedef np.int32_t DTYPE_t

cdef extern from "numpy/arrayobject.h":
    void PyArray_ENABLEFLAGS(np.ndarray arr, int flags)

cdef data_to_numpy_array_with_spec(void * ptr, np.npy_intp N, int t):
    cdef np.ndarray[DTYPE_t, ndim=1] arr = np.PyArray_SimpleNewFromData(1, &N, t, ptr)
    PyArray_ENABLEFLAGS(arr, np.NPY_OWNDATA)
    return arr

def test():
    N = 1000

    cdef DTYPE_t *data = <DTYPE_t *>malloc(N * sizeof(DTYPE_t))
    arr = data_to_numpy_array_with_spec(data, N, np.NPY_INT32)
    return arr

This is my setup.py file:

from distutils.core import setup, Extension
from Cython.Distutils import build_ext
import numpy as np

# include_dirs is needed so the compiler can find the NumPy headers
# pulled in by "cimport numpy"
ext_modules = [Extension("_owndata", ["owndata.pyx"],
                         include_dirs=[np.get_include()])]
setup(cmdclass={'build_ext': build_ext}, ext_modules=ext_modules)

Build with python setup.py build_ext --inplace. Then verify that the data is actually owned:

import _owndata
arr = _owndata.test()
print(arr.flags)

Among others, you should see OWNDATA : True.

And yes, this is definitely the right way to deal with this, since numpy.pxd does exactly the same thing to export all the other functions to Cython.

cython: create ndarray object without allocating memory for data

Efficiency aside, does this sort of assignment compile?

np.empty does not zero fill. np.zeros does that, and even that is done 'on the fly'.

Why the performance difference between numpy.zeros and numpy.zeros_like? explores how empty, zeros and zeros_like are implemented.
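As a quick illustration of the difference (plain NumPy, nothing Cython-specific): np.zeros guarantees zero-filled memory, while np.empty hands back an allocation whose contents are arbitrary.

```python
import numpy as np

z = np.zeros(5, dtype=np.int32)   # guaranteed to be all zeros
e = np.empty(5, dtype=np.int32)   # uninitialized: contents are undefined

print(z)          # [0 0 0 0 0]
print(e.shape)    # (5,) -- same shape, but don't rely on the values
```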


I'm just a beginner with cython, but I have to use:

tmp_buffer.data = <char *>my_buffer

How about going the other way, making my_buffer the allocated data of tmp_buffer?

array1 = np.empty(bsize, dtype=int)
cdef int *data
data = <int *> array1.data
for i in range(bsize):
    data[i] = bsize - data[i]

http://gael-varoquaux.info/programming/cython-example-of-exposing-c-computed-arrays-in-python-without-data-copies.html suggests using np.PyArray_SimpleNewFromData to create an array from an existing data buffer.
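At the pure-Python level, np.frombuffer is the analogous way to wrap an existing buffer without copying; the resulting array does not own its data, which is the mirror image of the OWNDATA trick above. A minimal sketch:

```python
import numpy as np

buf = bytearray(8 * 4)                    # an existing writable buffer (32 bytes)
arr = np.frombuffer(buf, dtype=np.int32)  # wraps buf without copying

arr[:] = 7                                # writes go straight through to buf
print(arr.flags['OWNDATA'])               # False -- buf still owns the memory
```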

Regarding memoryviews
http://docs.cython.org/src/userguide/memoryviews.html

cython: memory view of ndarray of strings (or direct ndarray indexing)

The issue is that NumPy array dtypes have to have a fixed size. When you make an array of "strings" you're actually making an array of fixed-length char arrays, and anything longer is silently truncated. Try this (byte strings give the fixed-width S dtype under Python 3):

import numpy as np

array = np.array([b"cat", b"in", b"a", b"hat"])
array[2] = b"Seuss"
print(array)
# [b'cat' b'in' b'Seu' b'hat']
print(array.dtype)
# |S3
print(array.dtype.itemsize)
# 3

With that in mind, you could do something like this:

from cython.operator cimport address

cdef void abc(char[:, ::1] in_buffer):
    cdef char * element
    element = address(in_buffer[1, 0])

Then when you pass your arrays to abc you'll need to do something like:

a = np.array([b'ABC', b'D', b'EFGHI'])
array_view = a.view('uint8').reshape(a.size, a.dtype.itemsize)
abc(array_view)

This is only one approach, but it's the one I would recommend without knowing more about what you're trying to do.
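To see what abc actually receives, here is the same view trick in plain NumPy (byte strings assumed, as above): each row of the view holds one element's fixed-width, zero-padded bytes.

```python
import numpy as np

a = np.array([b'ABC', b'D', b'EFGHI'])   # dtype |S5, itemsize 5
view = a.view('uint8').reshape(a.size, a.dtype.itemsize)

print(view.shape)                        # (3, 5) -- one row per string
print(bytes(view[0]))                    # b'ABC\x00\x00' -- zero-padded to 5 bytes
print(bytes(view[1]).rstrip(b'\x00'))    # b'D' -- strip the padding to recover it
```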


