Convolve2D Just by Using Numpy

You could generate the subarrays using as_strided:

import numpy as np

a = np.array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

sub_shape = (3,3)
view_shape = tuple(np.subtract(a.shape, sub_shape) + 1) + sub_shape
strides = a.strides + a.strides

sub_matrices = np.lib.stride_tricks.as_strided(a,view_shape,strides)

To get rid of your second "ugly" sum, alter your einsum so that the output array only has k and l. This implies your second summation.

conv_filter = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
# sub_matrices has shape (k, l, i, j): window position first, then position inside the window
m = np.einsum('ij,klij->kl',conv_filter,sub_matrices)

# [[ 6  7  8]
#  [11 12 13]
#  [16 17 18]]
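
Note that the einsum approach does not flip the filter, so strictly speaking it computes a cross-correlation; the two agree here only because the example filter is symmetric. Below is a minimal sketch of a "valid" 2D convolution with the flip made explicit. It assumes NumPy 1.20 or later for sliding_window_view, and the names conv2d_valid and asym_filter are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid(image, kernel, flip=True):
    """'Valid' 2D convolution; flip=False gives cross-correlation instead."""
    if flip:
        kernel = kernel[::-1, ::-1]  # true convolution mirrors the kernel on both axes
    # windows[k, l] is the kernel-sized patch whose top-left corner is image[k, l]
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum('ij,klij->kl', kernel, windows)

a = np.arange(25).reshape(5, 5)
conv_filter = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
# Hypothetical asymmetric filter, used only to show that the flip matters
asym_filter = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]])

sym_result = conv2d_valid(a, conv_filter)
print(sym_result)
# [[ 6  7  8]
#  [11 12 13]
#  [16 17 18]]

conv_result = conv2d_valid(a, asym_filter)               # convolution
corr_result = conv2d_valid(a, asym_filter, flip=False)   # cross-correlation
```

For the asymmetric filter the two results differ in sign, which is why it is worth deciding up front which of the two operations you actually need.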

Strided convolution of 2D in numpy

Ignoring the padding argument and any trailing windows that are too short to be convolved with the second array, here's one way with np.lib.stride_tricks.as_strided -

def strided4D(arr,arr2,s):
    strided = np.lib.stride_tricks.as_strided
    s0,s1 = arr.strides
    m1,n1 = arr.shape
    m2,n2 = arr2.shape
    # window positions come first, then the window itself, matching the strides below
    out_shp = (1+(m1-m2)//s, 1+(n1-n2)//s, m2, n2)
    return strided(arr, shape=out_shp, strides=(s*s0,s*s1,s0,s1))

def stride_conv_strided(arr,arr2,s):
    arr4D = strided4D(arr,arr2,s=s)
    return np.tensordot(arr4D, arr2, axes=((2,3),(0,1)))

Alternatively, we can use the scikit-image built-in view_as_windows to get those windows elegantly, like so -

from skimage.util.shape import view_as_windows

def strided4D_v2(arr,arr2,s):
    return view_as_windows(arr, arr2.shape, step=s)
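
If scikit-image is not available, a similar set of windows can probably be obtained with NumPy alone. The sketch below assumes NumPy 1.20 or later for sliding_window_view; the helpers strided_windows and stride_corr_naive are hypothetical names, and the loop version exists only to check the result.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def strided_windows(arr, window_shape, s):
    """Build all window positions, then keep every s-th one along both axes."""
    return sliding_window_view(arr, window_shape)[::s, ::s]

def stride_corr_naive(arr, arr2, s):
    """Reference implementation with explicit loops, for checking only."""
    m2, n2 = arr2.shape
    rows = 1 + (arr.shape[0] - m2) // s
    cols = 1 + (arr.shape[1] - n2) // s
    out = np.empty((rows, cols), dtype=np.result_type(arr, arr2))
    for i in range(rows):
        for j in range(cols):
            patch = arr[i * s:i * s + m2, j * s:j * s + n2]
            out[i, j] = np.sum(patch * arr2)
    return out

arr = np.arange(42).reshape(6, 7)
arr2 = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]])

windows = strided_windows(arr, arr2.shape, s=2)   # shape (2, 3, 3, 3)
fast = np.tensordot(windows, arr2, axes=((2, 3), (0, 1)))
slow = stride_corr_naive(arr, arr2, s=2)
print(np.array_equal(fast, slow))
```

Slicing the full window view with [::s, ::s] keeps everything as a view until tensordot runs, and it avoids computing strides by hand.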

Batch convolution 2d in numpy without scipy?

You just need to make the shape 5-D and choose strides that match it.

shape = f.shape + (x.shape[0],) + tuple(np.subtract(x.shape[1:], f.shape) + 1)
strides = (x.strides * 2)[1:]
M = np.lib.stride_tricks.as_strided(x, shape=shape, strides=strides)
y = np.einsum('pq,pqbmn->bmn', f, M)

Note that M might get really big if b gets really big, but it works on your toy problem.
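
As a sanity check, here is a self-contained sketch that runs the snippet above on made-up data and compares it against a plain loop. The array sizes are assumptions chosen for illustration: x is taken to be a batch of shape (b, H, W) and f a single 2D filter.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((4, 6, 7))   # hypothetical batch: 4 images of 6x7
f = rng.random((3, 2))      # hypothetical 3x2 filter

# 5-D view with axes (p, q, b, m, n), where M[p, q, b, m, n] == x[b, m + p, n + q]
shape = f.shape + (x.shape[0],) + tuple(np.subtract(x.shape[1:], f.shape) + 1)
strides = (x.strides * 2)[1:]
M = np.lib.stride_tricks.as_strided(x, shape=shape, strides=strides)
y = np.einsum('pq,pqbmn->bmn', f, M)

# Reference result computed one output element at a time
ref = np.empty_like(y)
for b in range(x.shape[0]):
    for m in range(y.shape[1]):
        for n in range(y.shape[2]):
            ref[b, m, n] = np.sum(x[b, m:m + f.shape[0], n:n + f.shape[1]] * f)

print(y.shape, np.allclose(y, ref))
```

As with the single-image case, the filter is not flipped, so this is a batched cross-correlation unless f is mirrored first.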

Fastest 2D convolution or image filter in Python

It really depends on what you want to do... A lot of the time, you don't need a fully generic (read: slower) 2D convolution. For example, if the filter is separable, you can use two 1D convolutions instead. This is why the various scipy.ndimage filters, such as scipy.ndimage.gaussian_filter and scipy.ndimage.uniform_filter, are much faster than the same thing implemented as a generic n-D convolution.
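
To illustrate the separable case, here is a minimal NumPy-only sketch. It assumes the 2D kernel is the outer product of a column vector and a row vector; the function names are hypothetical, and the direct version is written for clarity, not speed.

```python
import numpy as np

def conv2d_full_direct(x, k):
    """Direct 'full' 2D convolution: accumulate shifted, scaled copies of x."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + H, j:j + W] += k[i, j] * x
    return out

def conv2d_full_separable(x, col, row):
    """Same result for k = outer(col, row), using two passes of 1D convolutions."""
    tmp = np.apply_along_axis(np.convolve, 0, x, col)     # convolve every column with col
    return np.apply_along_axis(np.convolve, 1, tmp, row)  # then every row with row

rng = np.random.default_rng(0)
x = rng.random((8, 9))
col = np.array([1.0, 2.0, 1.0])
row = np.array([1.0, 0.0, -1.0])
kernel = np.outer(col, row)   # 3x3 separable kernel

direct = conv2d_full_direct(x, kernel)
separable = conv2d_full_separable(x, col, row)
print(np.allclose(direct, separable))
```

For a kernel of size k x k, the separable route does on the order of 2k multiplications per output pixel instead of k squared, which is where the speedup comes from.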

At any rate, as a point of comparison:

import timeit

t = timeit.timeit(stmt='ndimage.convolve(x, y, output=x)', number=1,
setup="""
import numpy as np
from scipy import ndimage
x = np.random.random((2048, 2048)).astype(np.float32)
y = np.random.random((32, 32)).astype(np.float32)
""")
print(t)

This takes 6.9 sec on my machine...

Compare this with fftconvolve:

import timeit

t = timeit.timeit(stmt="signal.fftconvolve(x, y, mode='same')", number=1,
setup="""
import numpy as np
from scipy import signal
x = np.random.random((2048, 2048)).astype(np.float32)
y = np.random.random((32, 32)).astype(np.float32)
""")
print(t)

This takes about 10.8 secs. However, with different input sizes, using FFTs to do a convolution can be considerably faster (though I can't seem to come up with a good example at the moment...).
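
For reference, the idea behind FFT-based convolution can be sketched with NumPy alone: zero-pad both arrays to the full output size, multiply their transforms, and transform back. This is only an illustration of the technique, not how scipy.signal.fftconvolve is implemented, and the function names are hypothetical.

```python
import numpy as np

def fft_convolve2d_full(x, k):
    """'Full' 2D convolution via the convolution theorem (real inputs assumed)."""
    shape = (x.shape[0] + k.shape[0] - 1, x.shape[1] + k.shape[1] - 1)
    X = np.fft.rfft2(x, s=shape)   # s= zero-pads the input to the full output size
    K = np.fft.rfft2(k, s=shape)
    return np.fft.irfft2(X * K, s=shape)

def direct_convolve2d_full(x, k):
    """Reference: accumulate shifted, scaled copies of x."""
    H, W = x.shape
    out = np.zeros((H + k.shape[0] - 1, W + k.shape[1] - 1))
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out[i:i + H, j:j + W] += k[i, j] * x
    return out

rng = np.random.default_rng(0)
x = rng.random((16, 16))
k = rng.random((5, 5))

via_fft = fft_convolve2d_full(x, k)
via_direct = direct_convolve2d_full(x, k)
print(np.allclose(via_fft, via_direct))
```

The cost of the FFT route depends mainly on the padded image size and only weakly on the kernel size, so it tends to pay off as the kernel gets larger, while small kernels usually favor the direct approach.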
