Improving speed of the code when using numpy.apply_along_axis
np.sin and * are vectorized operations, so you can apply them over whole arrays:
np.sin(data[:, 0]) * np.cos(data[:, 1])
Here data[:, 0] is the first column and data[:, 1] is the second. This should run really fast :)
Here is a notebook that tests the speed of each method: notebook.
Average run time:
- Method 1 (using numpy.apply_along_axis): 2.08 s
- Method 2 (loop applying the function to each row): 1.14 s
- Method 3 (this answer): 17.3 ms
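A minimal sketch of the three methods side by side (the data array here is made up; the notebook's exact setup is not shown):

```python
import numpy as np

data = np.random.rand(10000, 2)  # hypothetical (N, 2) input

def f(row):
    return np.sin(row[0]) * np.cos(row[1])

# Method 1: apply_along_axis -- a pure-Python loop under the hood
r1 = np.apply_along_axis(f, 1, data)

# Method 2: explicit Python loop over rows
r2 = np.array([f(row) for row in data])

# Method 3: fully vectorized -- one C-level pass per column
r3 = np.sin(data[:, 0]) * np.cos(data[:, 1])
```

All three produce the same result; only the last avoids a per-row Python function call.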
numpy apply_along_axis vectorisation
Here's one vectorized approach: set the zeros to NaN, which lets us use np.nanmax and np.nanstd to compute the max and std values while ignoring the zeros, like so -
imgn = np.where(img==0, np.nan, img)
mx = np.nanmax(imgn,0) # np.max(img,0) if all are positive numbers
st = np.nanstd(imgn,0)
mask = img > mx - 1.5*st
out = np.arange(mask.shape[0]).dot(mask)/mask.sum(0)
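A tiny sanity check of the zeros-as-NaN step, with hand-picked values so the column statistics are exact:

```python
import numpy as np

# Toy image with a zero in each column.
img = np.array([[5, 0],
                [0, 3],
                [9, 7]])

imgn = np.where(img == 0, np.nan, img)  # zeros become NaN
mx = np.nanmax(imgn, 0)  # column maxima over {5, 9} and {3, 7}
st = np.nanstd(imgn, 0)  # column stds computed without the zeros
```

Plain np.max/np.std over img would have pulled the statistics toward 0.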
Runtime test -
In [94]: img = np.random.randint(-100,100,(2000,50))
In [95]: %timeit np.apply_along_axis(get_y, 0, img)
100 loops, best of 3: 4.36 ms per loop
In [96]: %%timeit
...: imgn = np.where(img==0, np.nan, img)
...: mx = np.nanmax(imgn,0)
...: st = np.nanstd(imgn,0)
...: mask = img > mx - 1.5*st
...: out = np.arange(mask.shape[0]).dot(mask)/mask.sum(0)
1000 loops, best of 3: 1.33 ms per loop
Thus, we are seeing a 3x+ speedup.
Why does numpy.apply_along_axis seem to be slower than Python loop?
np.sum takes an axis parameter, so you could compute the sum simply using
sums3 = np.sum(x, axis=1)
This is much faster than the two methods you posted.
$ python -m timeit -n 1 -r 1 -s "import numpy as np;x=np.ones([100000,3])" "np.apply_along_axis(np.sum, 1, x)"
1 loops, best of 1: 3.21 sec per loop
$ python -m timeit -n 1 -r 1 -s "import numpy as np;x=np.ones([100000,3])" "np.array([np.sum(x[i,:]) for i in range(x.shape[0])])"
1 loops, best of 1: 712 msec per loop
$ python -m timeit -n 1 -r 1 -s "import numpy as np;x=np.ones([100000,3])" "np.sum(x, axis=1)"
1 loops, best of 1: 1.81 msec per loop
(As for why apply_along_axis is slower: I don't know for certain, but probably because it is written in pure Python and is much more generic, leaving far less opportunity for optimization than the dedicated axis-aware version.)
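The three variants from the timings above, collected so the equivalence is easy to check (a smaller array than the benchmark, to keep it quick):

```python
import numpy as np

x = np.ones((10000, 3))

# All three compute the same row sums; only the last one runs as a
# single C-level reduction instead of 10000 Python-level calls.
sums1 = np.apply_along_axis(np.sum, 1, x)
sums2 = np.array([np.sum(x[i, :]) for i in range(x.shape[0])])
sums3 = np.sum(x, axis=1)
```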
Easy parallelization of numpy.apply_along_axis()?
Alright, I worked it out: an idea is to use the standard multiprocessing
module and split the original array in just a few chunks (so as to limit communication overhead with the workers). This can be done relatively easily as follows:
import multiprocessing
import numpy as np

def parallel_apply_along_axis(func1d, axis, arr, *args, **kwargs):
    """
    Like numpy.apply_along_axis(), but takes advantage of multiple
    cores.
    """
    # Effective axis where apply_along_axis() will be applied by each
    # worker (any non-zero axis number would work, so as to allow the use
    # of `np.array_split()`, which is only done on axis 0):
    effective_axis = 1 if axis == 0 else axis
    if effective_axis != axis:
        arr = arr.swapaxes(axis, effective_axis)

    # Chunks for the mapping (only a few chunks):
    chunks = [(func1d, effective_axis, sub_arr, args, kwargs)
              for sub_arr in np.array_split(arr, multiprocessing.cpu_count())]

    pool = multiprocessing.Pool()
    individual_results = pool.map(unpacking_apply_along_axis, chunks)

    # Freeing the workers:
    pool.close()
    pool.join()

    return np.concatenate(individual_results)
where the function unpacking_apply_along_axis() applied in Pool.map() is defined separately, as it must be (so that subprocesses can import it); it is simply a thin wrapper that handles the fact that Pool.map() only takes a single argument:
def unpacking_apply_along_axis((func1d, axis, arr, args, kwargs)):
    """
    Like numpy.apply_along_axis(), but with arguments in a tuple
    instead.

    This function is useful with multiprocessing.Pool().map(): (1)
    map() only handles functions that take a single argument, and (2)
    this function can generally be imported from a module, as required
    by map().
    """
    return np.apply_along_axis(func1d, axis, arr, *args, **kwargs)
(in Python 3, the signature should be written as

def unpacking_apply_along_axis(all_args):
    (func1d, axis, arr, args, kwargs) = all_args

because tuple parameter unpacking was removed from function signatures).
In my particular case, this resulted in a 2x speedup on 2 cores with hyper-threading. A factor closer to 4x would have been nicer, but the speedup is already nice for just a few lines of code, and it should be better on machines with more cores (which are quite common). Maybe there is a way of avoiding data copies and using shared memory (maybe through the multiprocessing module itself)?
Numpy.apply_along_axis works unexpectedly when applying a function with if else condition
Your function f returns an integer when arr[-1] == arr[0], and numpy.apply_along_axis infers the output dtype from the result of the first call, so every later float result gets truncated. You have to make f always return a float:
def f(arr):
    return float(0 if arr[-1] == arr[0] else abs(arr[-1]-arr[0]))
[[0 1 2 3]
[1 2 3 4]
[2 3 4 5]]
[[27.75 27.71 28.05 27.75]
[27.71 28.05 27.75 26.55]
[28.05 27.75 26.55 27.18]]
[0. 1.16 0.87]
P.S. Your function f can be simplified to just return abs(arr[-1]-arr[0]), as that already covers the 0 case (the difference is 0 when arr[-1] == arr[0]). You don't need the if statement.
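To see the dtype truncation concretely, using the float data above (note that the first row's endpoints are equal, so the first call returns the int 0):

```python
import numpy as np

a = np.array([[27.75, 27.71, 28.05, 27.75],
              [27.71, 28.05, 27.75, 26.55],
              [28.05, 27.75, 26.55, 27.18]])

def f_int(arr):
    # For the first row arr[-1] == arr[0], so this returns the *int* 0,
    # and apply_along_axis picks an integer output dtype from it.
    return 0 if arr[-1] == arr[0] else abs(arr[-1] - arr[0])

def f_float(arr):
    return float(0 if arr[-1] == arr[0] else abs(arr[-1] - arr[0]))

out_int = np.apply_along_axis(f_int, 1, a)    # floats get truncated
out_float = np.apply_along_axis(f_float, 1, a)  # [0., 1.16, 0.87]
```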
Numpy apply along axis based on row index
When iterating over an array, directly or with apply_along_axis, the subarray does not have a .index attribute. So we have to pass an explicit index value to your function:
In [248]: def func(i,x):
     ...:     if i//2==0:
     ...:         x = x+10
     ...:     else:
     ...:         x = x+50
     ...:     return x
     ...:
In [249]: arr = np.arange(10).reshape(5,2)
apply_along_axis doesn't have a way to add this index, so instead we have to use an explicit iteration.
In [250]: np.array([func(i,v) for i,v in enumerate(arr)])
Out[250]:
array([[10, 11],
[12, 13],
[54, 55],
[56, 57],
[58, 59]])
Replacing // with %, to alternate by row parity:
In [251]: def func(i,x):
     ...:     if i%2==0:
     ...:         x = x+10
     ...:     else:
     ...:         x = x+50
     ...:     return x
     ...:
In [252]: np.array([func(i,v) for i,v in enumerate(arr)])
Out[252]:
array([[10, 11],
[52, 53],
[14, 15],
[56, 57],
[18, 19]])
But a better way is to skip the iteration entirely.
Make an array of the row additions (10 for even row indices, 50 for odd ones):
In [253]: np.where(np.arange(5)%2,50,10)
Out[253]: array([10, 50, 10, 50, 10])
and apply it via broadcasting:
In [256]: arr+np.where(np.arange(5)%2,50,10)[:,None]
Out[256]:
array([[10, 11],
[52, 53],
[14, 15],
[56, 57],
[18, 19]])
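The same fix collected as a plain script (arr as defined in In [249] above):

```python
import numpy as np

arr = np.arange(10).reshape(5, 2)

# Per-row additions: 10 for even row indices, 50 for odd ones.
adds = np.where(np.arange(5) % 2, 50, 10)

# [:, None] turns adds into a (5, 1) column so it broadcasts
# across the columns of the (5, 2) array.
out = arr + adds[:, None]
```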