What Are the Differences Between Numpy Arrays and Matrices? Which One Should I Use?

What are the differences between numpy arrays and matrices? Which one should I use?

According to the official documentation, using the matrix class is no longer recommended, since it may be removed in the future.


As other answers already state, you can achieve all of the same operations with NumPy arrays.
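As a rough illustration of that point, the sketch below pairs some common np.matrix idioms with their plain-array equivalents. The 2x2 values are hypothetical and chosen only for the example.

```python
import numpy as np

# Hypothetical 2x2 inputs, used only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [4.0, 1.0]])

product = A @ B                        # matrix product (np.matrix: A * B)
elementwise = A * B                    # element-wise product (np.matrix: np.multiply(A, B))
inverse = np.linalg.inv(A)             # inverse (np.matrix: A.I)
transpose = A.T                        # transpose (same spelling for both)
conj_transpose = A.conj().T            # conjugate transpose (np.matrix: A.H)
power = np.linalg.matrix_power(A, 3)   # matrix power (np.matrix: A ** 3)

print(product)
print(inverse @ A)  # approximately the identity matrix
```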

numpy np.array versus np.matrix (performance)

I added some more tests, and it appears that an array is considerably faster than a matrix when the arrays/matrices are small, but the difference gets smaller for larger data structures:

Small (2x4):

In [11]: a = [[1,2,3,4],[5,6,7,8]]

In [12]: aa = np.array(a)

In [13]: ma = np.matrix(a)

In [14]: %timeit aa.sum()
1000000 loops, best of 3: 1.77 us per loop

In [15]: %timeit ma.sum()
100000 loops, best of 3: 15.1 us per loop

In [16]: %timeit np.dot(aa, aa.T)
1000000 loops, best of 3: 1.72 us per loop

In [17]: %timeit ma * ma.T
100000 loops, best of 3: 7.46 us per loop

Larger (100x100):

In [19]: aa = np.arange(10000).reshape(100,100)

In [20]: ma = np.matrix(aa)

In [21]: %timeit aa.sum()
100000 loops, best of 3: 9.18 us per loop

In [22]: %timeit ma.sum()
10000 loops, best of 3: 22.9 us per loop

In [23]: %timeit np.dot(aa, aa.T)
1000 loops, best of 3: 1.26 ms per loop

In [24]: %timeit ma * ma.T
1000 loops, best of 3: 1.24 ms per loop

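Notice that for the larger case, multiplication is actually slightly faster with matrices (1.24 ms vs. 1.26 ms), although a gap that small may just be measurement noise.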

I believe that what I am getting here is consistent with what @Jaime is explaining in the comment.
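To rerun this comparison outside IPython, a rough sketch with the standard-library timeit module might look like the following. The helper name time_call and the repeat/number settings are assumptions made for illustration, and absolute timings will vary with hardware and NumPy version.

```python
import timeit
import warnings

import numpy as np


def time_call(func, repeat=3, number=200):
    """Return the best per-call time in microseconds over `repeat` runs."""
    best = min(timeit.repeat(func, repeat=repeat, number=number))
    return best / number * 1e6


results = {}
with warnings.catch_warnings():
    # np.matrix may emit a PendingDeprecationWarning; silence it while timing.
    warnings.simplefilter("ignore")
    cases = [
        ("small", np.array([[1, 2, 3, 4], [5, 6, 7, 8]])),
        ("large", np.arange(10000).reshape(100, 100)),
    ]
    for label, aa in cases:
        ma = np.matrix(aa)
        results[label] = {
            "array sum": time_call(aa.sum),
            "matrix sum": time_call(ma.sum),
            "array dot": time_call(lambda: np.dot(aa, aa.T)),
            "matrix mul": time_call(lambda: ma * ma.T),
        }

for label, timings in results.items():
    for name, usec in timings.items():
        print(f"{label:5s} {name:10s} {usec:10.2f} us per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports in the session above.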

Difference between array and matrix numpy for solving linear equations

You have a transpose issue. When you go to matrix land, column vectors and row vectors are no longer interchangeable:

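import numpy as np

A = np.array([[1, -1,  2],
              [0,  1, -1],
              [0,  0,  1]])
b = np.array([5, -1, 3])
x = np.linalg.solve(A, b)
print('arrays:')
print(x)

A = np.matrix(A)
b = np.matrix(b)  # b is now a 1x3 row vector, not a 3x1 column
# Newer NumPy versions may raise an error here because of the shape mismatch.
x = np.linalg.solve(A, b)
print('matrix, wrong set up:')
print(x)

b = b.T  # transpose to a 3x1 column vector
x = np.linalg.solve(A, b)
print('matrix, right set up:')
print(x)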


arrays:
[ 1.  2.  3.]
matrix, wrong set up:
[[  5.  -1.   3.]
 [ 10.  -2.   6.]
 [  5.  -1.   3.]]
matrix, right set up:
[[ 1.]
 [ 2.]
 [ 3.]]
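With plain arrays the same system can be solved with either a 1-D right-hand side or an explicit column, and the result simply follows the shape of b. A minimal sketch, reusing A and b from the answer (the names x_flat, b_col and x_col are introduced here only for illustration):

```python
import numpy as np

A = np.array([[1, -1,  2],
              [0,  1, -1],
              [0,  0,  1]])
b = np.array([5, -1, 3])

# 1-D right-hand side: the solution comes back 1-D as well.
x_flat = np.linalg.solve(A, b)

# Explicit column vector of shape (3, 1): the solution is a column too.
b_col = b.reshape(-1, 1)
x_col = np.linalg.solve(A, b_col)

print(x_flat.shape, x_col.shape)
print(np.allclose(A @ x_flat, b))  # check that the solution satisfies A x = b
```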

Numpy calculate difference of matrices against all rows in matrix

The numpy solution suggested by blorgon is likely faster, but you can also use scipy.spatial.distance.cdist:

>>> from scipy.spatial.distance import cdist
>>> cdist(a, b)**2
array([[ 18.29  , 112.45  , 308.6765],
       [  7.49  ,  79.65  , 251.0165]])

The problem with this approach is that it takes a square root and then undoes it. The advantage is that it does not use a large intermediate array. You can avoid some intermediates in numpy like this:

>>> diff = b - a[:, np.newaxis]
>>> np.power(diff, 2, out=diff).sum(axis=2)
array([[ 18.29  , 112.45  , 308.6765],
       [  7.49  ,  79.65  , 251.0165]])
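A different way to skip the three-dimensional intermediate, not taken from the answer above, is the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y. The sketch below compares it against the broadcasting version; the inputs a and b are hypothetical stand-ins, since the original arrays are not shown. For nearly identical points this expansion can lose precision to floating-point cancellation.

```python
import numpy as np

# Hypothetical inputs: 2 points in `a`, 3 points in `b`, each 2-D.
a = np.array([[0.0, 0.0],
              [1.0, 1.0]])
b = np.array([[3.0, 4.0],
              [1.0, 1.0],
              [0.0, 2.0]])

# Broadcasting version from the text: builds a (2, 3, 2) intermediate.
diff = b - a[:, np.newaxis]
d2_broadcast = np.power(diff, 2, out=diff).sum(axis=2)

# Expansion: only arrays of shape (2,), (3,) and (2, 3) are created.
a_sq = np.einsum("ij,ij->i", a, a)   # squared norm of each row of a
b_sq = np.einsum("ij,ij->i", b, b)   # squared norm of each row of b
d2_expanded = a_sq[:, np.newaxis] + b_sq - 2.0 * (a @ b.T)

print(d2_expanded)
print(np.allclose(d2_broadcast, d2_expanded))
```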

What is the difference between using matrix multiplication with np.matrix arrays, and dot()/tensor() with np.arrays?

The main advantage of working with matrices is that the * symbol performs matrix multiplication, whereas with arrays it performs element-wise multiplication. With arrays you need to use dot. See:
What are the differences between numpy arrays and matrices? Which one should I use?
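A small sketch of that difference, using hypothetical 2x2 values; note that np.matrix may emit a deprecation warning on recent NumPy versions, which is silenced here:

```python
import warnings

import numpy as np

values = [[1, 2], [3, 4]]
arr = np.array(values)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # np.matrix may warn about pending deprecation
    mat = np.matrix(values)

elementwise_arr = arr * arr            # element-wise: equals [[1, 4], [9, 16]]
matmul_mat = mat * mat                 # matrix product: equals [[7, 10], [15, 22]]
matmul_arr = np.dot(arr, arr)          # matrix product with arrays
elementwise_mat = np.multiply(mat, mat)  # element-wise with matrices

print(elementwise_arr)
print(matmul_mat)
```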

If m is a one-dimensional array, you don't need to transpose anything, because transposing a 1D array doesn't change it:

In [28]: m.T.shape, m.shape
Out[28]: ((3,), (3,))
In [29]: m.dot(C)
Out[29]: array([15, 18, 21])

In [30]: C.dot(m)
Out[30]: array([ 5, 14, 23])

This is different if you add another dimension to m:

In [31]: mm = m[:, np.newaxis]

In [32]: mm.dot(C)
ValueError Traceback (most recent call last)
<ipython-input-32-28253c9b8898> in <module>()
----> 1 mm.dot(C)

ValueError: objects are not aligned

In [33]: (mm.T).dot(C)
Out[33]: array([[15, 18, 21]])

In [34]: C.dot(mm)
Out[34]:
array([[ 5],
       [14],
       [23]])
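The session above does not show how m and C were defined. A minimal sketch, assuming m = np.arange(3) and C = np.arange(9).reshape(3, 3) (values that reproduce the quoted m.dot(C) and C.dot(m) outputs), runs the same steps end to end:

```python
import numpy as np

# Assumed definitions: not shown in the original session.
m = np.arange(3)                 # shape (3,)
C = np.arange(9).reshape(3, 3)   # rows [0 1 2], [3 4 5], [6 7 8]

print(m.T.shape, m.shape)  # transposing a 1-D array leaves its shape unchanged
print(m.dot(C))
print(C.dot(m))

mm = m[:, np.newaxis]      # explicit column, shape (3, 1)
try:
    mm.dot(C)              # (3, 1) times (3, 3): inner dimensions 1 and 3 differ
except ValueError as exc:
    print("ValueError:", exc)

row_result = mm.T.dot(C)   # (1, 3) times (3, 3) gives shape (1, 3)
col_result = C.dot(mm)     # (3, 3) times (3, 1) gives shape (3, 1)
print(row_result)
print(col_result)
```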

How does multiplication differ for NumPy Matrix vs Array classes?

Python 3.5 finally added a matrix multiplication operator. The syntax is a @ b, and it works on plain NumPy arrays.
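A short sketch of the operator on ordinary arrays, with hypothetical values, showing that * stays element-wise while @ follows matrix-product rules, including for 1-D operands:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])
v = np.array([1, 1])

matrix_product = a @ b   # same result as np.matmul(a, b)
elementwise = a * b      # * is still element-wise for arrays
mat_vec = a @ v          # 2-D @ 1-D gives a 1-D result
inner = v @ v            # 1-D @ 1-D gives the scalar inner product

print(matrix_product)
print(elementwise)
print(mat_vec, inner)
```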

