Numpy: Formal Definition of "Array_Like" Objects

numpy: formal definition of array_like objects?

It turns out almost anything is technically an array-like. "Array-like" is more of a statement of how the input will be interpreted than a restriction on what the input can be; if a parameter is documented as array-like, NumPy will try to interpret it as an array.

There is no formal definition of array-like beyond the nearly tautological one -- an array-like is any Python object that np.array can convert to an ndarray. To go beyond this, you'd need to study the source code.

NPY_NO_EXPORT PyObject *
PyArray_FromAny(PyObject *op, PyArray_Descr *newtype, int min_depth,
                int max_depth, int flags, PyObject *context)
{
    /*
     * This is the main code to make a NumPy array from a Python
     * Object. It is called from many different places.
     */
    PyArrayObject *arr = NULL, *ret;
    PyArray_Descr *dtype = NULL;
    int ndim = 0;
    npy_intp dims[NPY_MAXDIMS];

    /* Get either the array or its parameters if it isn't an array */
    if (PyArray_GetArrayParamsFromObject(op, newtype,
                                         0, &dtype,
                                         &ndim, dims, &arr, context) < 0) {
        Py_XDECREF(newtype);
        return NULL;
    }
    ...

Particularly interesting is PyArray_GetArrayParamsFromObject, whose comments enumerate the types of objects np.array expects:

NPY_NO_EXPORT int
PyArray_GetArrayParamsFromObject(PyObject *op,
                                 PyArray_Descr *requested_dtype,
                                 npy_bool writeable,
                                 PyArray_Descr **out_dtype,
                                 int *out_ndim, npy_intp *out_dims,
                                 PyArrayObject **out_arr, PyObject *context)
{
    PyObject *tmp;

    /* If op is an array */

    /* If op is a NumPy scalar */

    /* If op is a Python scalar */

    /* If op supports the PEP 3118 buffer interface */

    /* If op supports the __array_struct__ or __array_interface__ interface */

    /*
     * If op supplies the __array__ function.
     * The documentation says this should produce a copy, so
     * we skip this method if writeable is true, because the intent
     * of writeable is to modify the operand.
     * XXX: If the implementation is wrong, and/or if actual
     * usage requires this behave differently,
     * this should be changed!
     */

    /* Try to treat op as a list of lists */

    /* Anything can be viewed as an object, unless it needs to be writeable */

}

So, by studying the source code, we can conclude that an array-like is one of the following (a short demonstration follows the list):

  • a NumPy array, or
  • a NumPy scalar, or
  • a Python scalar, or
  • any object which supports the PEP 3118 buffer interface, or
  • any object that supports the __array_struct__ or __array_interface__ interface, or
  • any object that supplies the __array__ function, or
  • any object that can be treated as a list of lists, or
  • anything! If it doesn't fall under one of the other cases, it'll be treated as a 0-dimensional array of object dtype.
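As a rough illustration (not part of the original answer), the snippet below feeds a few of these cases to np.array(); the class SuppliesArray is made up purely for the demonstration:

import numpy as np

class SuppliesArray:
    # Any object with an __array__ method counts as array-like.
    def __array__(self, dtype=None, copy=None):
        return np.arange(3.0)

print(np.array(2.5))                        # Python scalar   -> 0-d float array
print(np.array(memoryview(b"abc")))         # PEP 3118 buffer -> [97 98 99], uint8
print(np.array(SuppliesArray()))            # __array__       -> [0. 1. 2.]
print(np.array([[1, 2], [3, 4]]).shape)     # list of lists   -> (2, 2)
print(np.array(object()).dtype)             # fallback        -> 0-d array, object dtype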

Terminology: Python and Numpy - `iterable` versus `array_like`

The term "array-like" is indeed only used in NumPy and refers to anything that can be passed as first parameter to numpy.array() to create an array.

The term "iterable" is standard python terminology and refers to anything that can be iterated over (for example using for x in iterable).

Most array-like objects are iterable, with the exception of scalar types.

Many iterables are not array-like -- for example you can't construct a NumPy array from a generator expression using numpy.array(). (You would have to use numpy.fromiter() instead. Nonetheless, a generator expression isn't an "array-like" in the terminology of the NumPy documentation.)
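A small sketch of the distinction (added for illustration; the exact fallback behaviour of np.array() on a generator can differ slightly across NumPy versions):

import numpy as np

# A scalar is array-like but not iterable:
print(np.array(3.5).shape)        # () -- a 0-d array
# iter(3.5) would raise TypeError: 'float' object is not iterable

# A generator expression is iterable but not array-like:
gen = (x * x for x in range(5))
print(np.array(gen).dtype)        # object -- the generator is wrapped, not consumed
print(np.fromiter((x * x for x in range(5)), dtype=np.int64))   # [ 0  1  4  9 16]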

Why numpy can save and load objects other than numpy arrays

numpy.save() documents its argument as "array-like".

As explained in "numpy: formal definition of array_like objects?" above, the underlying numpy/core/src/multiarray/ctors.c:PyArray_FromAny() accepts:

/* op is an array */
/* op is a NumPy scalar */
/* op is a Python scalar */
/* op supports the PEP 3118 buffer interface */
/* op supports the __array_struct__ or __array_interface__ interface */
/* op supplies the __array__ function. */
/* Try to treat op as a list of lists */

Specifically, for a dict the execution path goes like this:

numpy/npyio.py ->
numpy/core/numeric.py:asanyarray() ->
numpy/core/src/multiarray/multiarraymodule.c:_array_fromobject() ->
numpy/core/src/multiarray/ctors.c:PyArray_CheckFromAny() ->
the aforementioned PyArray_FromAny. There:

<...>
PyArray_GetArrayParamsFromObject(op, newtype,
                                 0, &dtype,
                                 &ndim, dims, &arr, context)
<...>
else {
    if (newtype == NULL) {
        newtype = dtype;    /* object dtype */
<...>
ret = (PyArrayObject *)PyArray_NewFromDescr(&PyArray_Type, newtype,
                                            ndim, dims,
                                            NULL, NULL,
                                            flags & NPY_ARRAY_F_CONTIGUOUS, NULL);
return (PyObject *)ret;
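To illustrate the consequence (a hedged sketch, not from the original answer): saving a dict wraps it in a 0-d object array, and loading it back requires allow_pickle=True plus unwrapping with .item():

import numpy as np

d = {"a": 1, "b": [2, 3]}
np.save("data.npy", d)                    # d is wrapped in a 0-d array of object dtype

loaded = np.load("data.npy", allow_pickle=True)
print(loaded.shape, loaded.dtype)         # () object
print(loaded.item())                      # {'a': 1, 'b': [2, 3]}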

Why is this numpy array still a generator?

Native Python max, and many other built-in functions, take an iterable as their parameter, meaning an object that has either a __getitem__ method or an __iter__ method returning an iterator. This includes lists, tuples and generators.

NumPy functions such as np.amax take a different approach: they usually accept an "array-like" argument, which generally has __getitem__ and a defined length. Generators have no defined length, so when np.amax receives one, it treats it as a 0-dimensional array containing only that object. The maximum of such an array is simply the generator object it contains.

This is why m in your code is a generator rather than a number, and why value/m raises an error.

As already answered, you can get around this using a list comprehension. You could also use np.fromiter.
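A short sketch of both workarounds (added here for illustration; the generator expressions are made up):

import numpy as np

values = (x * x for x in range(5))

m = np.amax([v for v in values])                                  # list comprehension
print(m)                                                          # 16

m = np.amax(np.fromiter((x * x for x in range(5)), dtype=np.int64))
print(m)                                                          # 16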

Creating an array_like QImage subclass for numpy.array()

It's easier to just use the buffer object returned from QImage.bits() and np.frombuffer().

import numpy as np

def qimage2array(q_image):
    width = q_image.width()
    height = q_image.height()
    buf = q_image.bits()
    # With PyQt (unlike PySide), bits() returns a sip.voidptr whose size must be
    # set first, e.g. buf.setsize(q_image.sizeInBytes()), before frombuffer can read it.
    arr = np.frombuffer(buf, dtype=np.uint8).reshape([height, width, -1])
    return arr
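A possible usage sketch, assuming PySide6 and a hypothetical image file; converting to a 4-bytes-per-pixel format avoids row-padding issues with the flat reshape above, since QImage scanlines are 32-bit aligned:

from PySide6.QtGui import QImage

img = QImage("picture.png").convertToFormat(QImage.Format.Format_RGBA8888)
arr = qimage2array(img)
print(arr.shape)          # (height, width, 4)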

