What are the differences between numpy arrays and matrices? Which one should I use?
According to the official documentation, using the matrix class is no longer recommended, even for linear algebra, and the class may be removed in the future.
https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
As other answers already state, you can perform all of these operations with regular NumPy arrays.
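As a minimal sketch of what moving from matrix to array can look like (the 2x2 values below are hypothetical, chosen only for illustration), the matrix-specific conveniences have plain array equivalents:

```python
import numpy as np

# Hypothetical legacy object; np.matrix still works but is discouraged.
legacy = np.matrix([[1.0, 2.0],
                    [3.0, 4.0]])

# Convert once at the boundary and work with a regular ndarray afterwards.
A = np.asarray(legacy)

inverse = np.linalg.inv(A)   # replaces legacy.I
transpose = A.T              # same spelling as legacy.T
conj_transpose = A.conj().T  # replaces legacy.H
product = A @ inverse        # replaces legacy * legacy.I
```

Here product is the identity matrix up to floating-point rounding, and every result is an ordinary ndarray rather than a matrix.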
numpy np.array versus np.matrix (performance)
I added some more tests, and it appears that an array is considerably faster than a matrix when the arrays/matrices are small, but the difference shrinks for larger data structures:
Small (2x4):
In [11]: a = [[1,2,3,4],[5,6,7,8]]
In [12]: aa = np.array(a)
In [13]: ma = np.matrix(a)
In [14]: %timeit aa.sum()
1000000 loops, best of 3: 1.77 us per loop
In [15]: %timeit ma.sum()
100000 loops, best of 3: 15.1 us per loop
In [16]: %timeit np.dot(aa, aa.T)
1000000 loops, best of 3: 1.72 us per loop
In [17]: %timeit ma * ma.T
100000 loops, best of 3: 7.46 us per loop
Larger (100x100):
In [19]: aa = np.arange(10000).reshape(100,100)
In [20]: ma = np.matrix(aa)
In [21]: %timeit aa.sum()
100000 loops, best of 3: 9.18 us per loop
In [22]: %timeit ma.sum()
10000 loops, best of 3: 22.9 us per loop
In [23]: %timeit np.dot(aa, aa.T)
1000 loops, best of 3: 1.26 ms per loop
In [24]: %timeit ma * ma.T
1000 loops, best of 3: 1.24 ms per loop
Notice that for the larger multiplication the two are essentially tied; the matrix version is even marginally faster in this run.
I believe that what I am getting here is consistent with what @Jaime explains in the comments.
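To rerun this kind of comparison outside IPython, a rough sketch with the standard timeit module could look like the following. The helper name time_us is hypothetical, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

aa = np.arange(10000).reshape(100, 100)
ma = np.matrix(aa)  # same data wrapped in the discouraged matrix class

def time_us(func, number=1000):
    """Best per-call time in microseconds over 3 repeats (hypothetical helper)."""
    best = min(timeit.repeat(func, repeat=3, number=number))
    return best / number * 1e6

sum_array = time_us(lambda: aa.sum())
sum_matrix = time_us(lambda: ma.sum())
print(f"array.sum():  {sum_array:.2f} us")
print(f"matrix.sum(): {sum_matrix:.2f} us")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reported in the session above.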
Difference between array and matrix numpy for solving linear equations
You have a transpose issue: once you go to matrix land, column vectors and row vectors are no longer interchangeable:
import numpy as np

A = np.array([[ 1, -1,  2],
              [ 0,  1, -1],
              [ 0,  0,  1]])
b = np.array([5, -1, 3])
x = np.linalg.solve(A, b)
print('arrays:')
print(x)

A = np.matrix(A)
b = np.matrix(b)  # b is now a 1x3 row vector, not a column
x = np.linalg.solve(A, b)  # recent NumPy versions may reject this shape instead
print('matrix, wrong set up:')
print(x)

b = b.T  # transpose to a 3x1 column vector
x = np.linalg.solve(A, b)
print('matrix, right set up:')
print(x)
yields:
arrays:
[ 1. 2. 3.]
matrix, wrong set up:
[[ 5. -1. 3.]
[ 10. -2. 6.]
[ 5. -1. 3.]]
matrix, right set up:
[[ 1.]
[ 2.]
[ 3.]]
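For comparison, here is a small sketch that stays with plain arrays and reuses the same A and b. With arrays, the shape of the right-hand side decides the shape of the solution, so there is no hidden row/column conversion to trip over.

```python
import numpy as np

A = np.array([[1, -1,  2],
              [0,  1, -1],
              [0,  0,  1]])
b = np.array([5, -1, 3])

# 1-D right-hand side in, 1-D solution out: shape (3,)
x_vec = np.linalg.solve(A, b)

# Explicit column right-hand side in, column solution out: shape (3, 1)
x_col = np.linalg.solve(A, b[:, np.newaxis])

print(x_vec.shape, x_col.shape)
```

Both calls solve the same system; only the layout of the result differs.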
Numpy calculate difference of matrices against all rows in matrix
The numpy solution suggested by blorgon is likely faster, but you can also use scipy.spatial.distance.cdist:
>>> from scipy.spatial.distance import cdist
>>> cdist(a, b)**2
array([[ 18.29 , 112.45 , 308.6765],
[ 7.49 , 79.65 , 251.0165]])
The problem with this approach is that it takes a square root and then undoes it. The advantage is that it does not use a large intermediate array. You can avoid some intermediates in numpy like this:
>>> diff = b - a[:, np.newaxis]
>>> np.power(diff, 2, out=diff).sum(axis=2)
array([[ 18.29 , 112.45 , 308.6765],
[ 7.49 , 79.65 , 251.0165]])
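Another way to avoid the large intermediate, sketched here with NumPy only, is to expand the squared norm as ||a_i||^2 + ||b_j||^2 - 2 a_i . b_j. The sample points below are hypothetical, since the original a and b are not shown.

```python
import numpy as np

# Hypothetical sample data: 2 points in a, 3 points in b, 2 coordinates each.
a = np.array([[0.0, 1.0], [2.0, 3.0]])
b = np.array([[1.0, 1.0], [4.0, 0.0], [-1.0, 5.0]])

# Row-wise squared norms, then the cross term via one matrix product.
a_sq = np.einsum("ij,ij->i", a, a)
b_sq = np.einsum("ij,ij->i", b, b)
sq_dist = a_sq[:, np.newaxis] + b_sq[np.newaxis, :] - 2.0 * (a @ b.T)

# Reference result using the broadcasting approach shown above.
diff = b - a[:, np.newaxis]
reference = np.power(diff, 2, out=diff).sum(axis=2)
```

The expansion never builds the (len(a), len(b), n_dims) array, but it can lose precision when points are nearly identical, so tiny negative values are possible. If SciPy is available, cdist(a, b, 'sqeuclidean') also returns squared distances directly without the square-root round trip.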
What is the difference between using matrix multiplication with np.matrix arrays, and dot()/tensor() with np.arrays?
The main advantage of working with matrices is that the * symbol performs matrix multiplication, whereas with arrays it performs element-wise multiplication. With arrays you need to use dot. See:
http://wiki.scipy.org/NumPy_for_Matlab_Users
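A short sketch of that difference, using hypothetical 2x2 values:

```python
import numpy as np

data = [[1, 2], [3, 4]]  # hypothetical values
arr = np.array(data)
mat = np.matrix(data)

arr_star = arr * arr        # element-wise product: [[1, 4], [9, 16]]
arr_dot = np.dot(arr, arr)  # matrix product: [[7, 10], [15, 22]]
mat_star = mat * mat        # matrix product: [[7, 10], [15, 22]]
```

The same * spelling gives two different results depending on the class, which is one reason mixing arrays and matrices in a single code base tends to be error-prone.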
What are the differences between numpy arrays and matrices? Which one should I use?
If m is a one-dimensional array, you don't need to transpose anything, because transposing a 1D array doesn't change it:
In [28]: m.T.shape, m.shape
Out[28]: ((3,), (3,))
In [29]: m.dot(C)
Out[29]: array([15, 18, 21])
In [30]: C.dot(m)
Out[30]: array([ 5, 14, 23])
This is different if you add another dimension to m:
In [31]: mm = m[:, np.newaxis]
In [32]: mm.dot(C)
--------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-32-28253c9b8898> in <module>()
----> 1 mm.dot(C)
ValueError: objects are not aligned
In [33]: (mm.T).dot(C)
Out[33]: array([[15, 18, 21]])
In [34]: C.dot(mm)
Out[34]:
array([[ 5],
[14],
[23]])
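The session above does not show how m and C were created. As an assumption, m = np.arange(3) and C = np.arange(9).reshape(3, 3) reproduce the outputs shown, so a self-contained sketch of the same behaviour might be:

```python
import numpy as np

# Assumed inputs: not shown in the answer, but consistent with its outputs.
m = np.arange(3)                # [0, 1, 2], shape (3,)
C = np.arange(9).reshape(3, 3)

row_result = m.dot(C)           # array([15, 18, 21])
col_result = C.dot(m)           # array([ 5, 14, 23])

mm = m[:, np.newaxis]           # explicit column, shape (3, 1)
try:
    mm.dot(C)                   # (3, 1) . (3, 3): inner dimensions 1 and 3 differ
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

row_2d = mm.T.dot(C)            # shape (1, 3)
col_2d = C.dot(mm)              # shape (3, 1)
```

A 1-D array acts as a row or a column depending on which side of the product it sits on, while a 2-D array keeps the orientation you gave it.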
how does multiplication differ for NumPy Matrix vs Array classes?
Python 3.5 finally added a matrix multiplication operator. The syntax is a @ b.
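A minimal illustration with hypothetical values, showing that for 2-D arrays the operator agrees with np.matmul and np.dot:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])  # hypothetical values
b = np.array([[5, 6], [7, 8]])

via_operator = a @ b            # [[19, 22], [43, 50]]
via_matmul = np.matmul(a, b)
via_dot = np.dot(a, b)
```

With @ available on plain arrays, the matrix class is no longer needed just to get a readable multiplication syntax.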