How to Calculate a Gaussian Kernel Matrix Efficiently in Numpy

How to calculate a Gaussian kernel matrix efficiently in numpy?

Do you want to use the Gaussian kernel for, e.g., image smoothing? If so, SciPy already provides a function for that, gaussian_filter() in scipy.ndimage.
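As a rough illustration of that suggestion, here is a minimal sketch that smooths a made-up test image with scipy.ndimage.gaussian_filter. The image and the sigma value are assumptions chosen for the example, not something from the question.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Hypothetical test image: a single bright pixel in the middle of a 21x21 grid.
image = np.zeros((21, 21))
image[10, 10] = 1.0

# Smoothing an impulse gives back the (truncated, normalized) Gaussian kernel
# that SciPy applies, centred on the bright pixel.
smoothed = gaussian_filter(image, sigma=2.0)
```

If all you need is a blurred image, this avoids building the kernel matrix yourself.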

Updated answer

This should work - while it's still not 100% accurate, it attempts to account for the probability mass within each cell of the grid. I think that using the probability density at the midpoint of each cell is slightly less accurate, especially for small kernels. See https://homepages.inf.ed.ac.uk/rbf/HIPR2/gsmooth.htm for an example.

import numpy as np
import scipy.stats as st

def gkern(kernlen=21, nsig=3):
    """Returns a 2D Gaussian kernel."""

    x = np.linspace(-nsig, nsig, kernlen+1)
    kern1d = np.diff(st.norm.cdf(x))
    kern2d = np.outer(kern1d, kern1d)
    return kern2d/kern2d.sum()

Testing it on the example in Figure 3 from the link:

gkern(5, 2.5)*273

gives

array([[ 1.0278445 ,  4.10018648,  6.49510362,  4.10018648,  1.0278445 ],
       [ 4.10018648, 16.35610171, 25.90969361, 16.35610171,  4.10018648],
       [ 6.49510362, 25.90969361, 41.0435344 , 25.90969361,  6.49510362],
       [ 4.10018648, 16.35610171, 25.90969361, 16.35610171,  4.10018648],
       [ 1.0278445 ,  4.10018648,  6.49510362,  4.10018648,  1.0278445 ]])
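To make the comparison with the midpoint approach concrete, here is a small sketch that builds the kernel both ways and measures how far apart they are. The names gkern_cdf and gkern_midpoint are only for this illustration; gkern_cdf repeats the function above, and gkern_midpoint is my reading of "probability density at the midpoint of each cell".

```python
import numpy as np
import scipy.stats as st

def gkern_cdf(kernlen=21, nsig=3):
    """Kernel from the probability mass inside each cell (same as gkern above)."""
    x = np.linspace(-nsig, nsig, kernlen + 1)
    kern1d = np.diff(st.norm.cdf(x))
    kern2d = np.outer(kern1d, kern1d)
    return kern2d / kern2d.sum()

def gkern_midpoint(kernlen=21, nsig=3):
    """Kernel from the density evaluated at the midpoint of each cell."""
    edges = np.linspace(-nsig, nsig, kernlen + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    kern1d = st.norm.pdf(mid)
    kern2d = np.outer(kern1d, kern1d)
    return kern2d / kern2d.sum()

# Largest element-wise difference between the two 5x5 kernels
diff = np.abs(gkern_cdf(5, 2.5) - gkern_midpoint(5, 2.5)).max()
```

For a 5x5 kernel the two versions differ noticeably at the centre; the gap should shrink as kernlen grows and the cells get narrower.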

The original (accepted) answer below is wrong
The square root is unnecessary, and the definition of the interval is incorrect.

import numpy as np
import scipy.stats as st

def gkern(kernlen=21, nsig=3):
    """Returns a 2D Gaussian kernel array."""

    interval = (2*nsig+1.)/(kernlen)
    x = np.linspace(-nsig-interval/2., nsig+interval/2., kernlen+1)
    kern1d = np.diff(st.norm.cdf(x))
    kernel_raw = np.sqrt(np.outer(kern1d, kern1d))
    kernel = kernel_raw/kernel_raw.sum()
    return kernel

Efficiently generate shifted Gaussian kernel in Python

A reasonably fast approach is to note that the Gaussian is separable, so you can calculate the 1D Gaussians for x and y and then take the outer product:

import numpy as np
import matplotlib.pyplot as plt

x0, y0, sigma = 5.5, 4.2, 1.4

x, y = np.arange(9), np.arange(9)

gx = np.exp(-(x-x0)**2/(2*sigma**2))
gy = np.exp(-(y-y0)**2/(2*sigma**2))
g = np.outer(gx, gy)
g /= np.sum(g)  # normalize, if you want that

plt.imshow(g, interpolation="nearest", origin="lower")
plt.show()
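If you need this more than once, the same idea can be wrapped in a function. This is only a sketch: shifted_gaussian and its parameter names are made up here, and it indexes the result as g[i, j] with the first axis coming from the first 1D Gaussian, exactly as np.outer does above. It also checks the separable result against a direct 2D evaluation.

```python
import numpy as np

def shifted_gaussian(shape, center, sigma):
    """Normalized Gaussian on an integer grid, centred at a (possibly fractional) point."""
    n0, n1 = shape
    c0, c1 = center
    g0 = np.exp(-(np.arange(n0) - c0) ** 2 / (2 * sigma ** 2))  # 1D Gaussian along axis 0
    g1 = np.exp(-(np.arange(n1) - c1) ** 2 / (2 * sigma ** 2))  # 1D Gaussian along axis 1
    g = np.outer(g0, g1)                                        # separable 2D kernel
    return g / g.sum()

g = shifted_gaussian((9, 9), (5.5, 4.2), 1.4)

# Direct (non-separable) evaluation on the full grid, for comparison
ii, jj = np.mgrid[0:9, 0:9]
direct = np.exp(-((ii - 5.5) ** 2 + (jj - 4.2) ** 2) / (2 * 1.4 ** 2))
direct /= direct.sum()
```

The separable version only evaluates exp on two 1D arrays rather than on every grid point, which is where the speed-up comes from.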


What is the fastest way to compute an RBF kernel in python?

You are already doing a lot of optimizations in your answer post. I would like to add a few more (mostly tweaks), building upon the winner from that post, which seems to be the numexpr-based one.

Tweak #1

First off, np.sum(X ** 2, axis = -1) could be optimized with np.einsum. This part isn't the biggest overhead, but optimization of any sort won't hurt. So, that summation could be expressed as -

X_norm = np.einsum('ij,ij->i',X,X)

Tweak #2

Secondly, we could leverage the BLAS functions that SciPy exposes and, if allowed, use a single-precision dtype for a noticeable performance improvement over double precision. Hence, np.dot(X, X.T) could be computed with SciPy's sgemm like so -

sgemm(alpha=1.0, a=X, b=X, trans_b=True)

A few more tweaks: rearranging the negative sign with gamma lets us feed more to sgemm. Also, we would push gamma into the alpha term.
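The rearrangement relies on the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, so -gamma can be folded into the norm terms and 2*gamma into the matrix product. Here is a plain NumPy sketch of that algebra (no numexpr or sgemm, so it says nothing about speed), checked against SciPy's cdist as a reference. The array sizes and the seed are arbitrary choices for the illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
gamma, var = 0.01, 5.0

# Reference: var * exp(-gamma * ||x_i - x_j||^2) from explicit squared distances
K_ref = var * np.exp(-gamma * cdist(X, X, "sqeuclidean"))

# Rearranged form: -gamma is folded into the squared norms,
# and 2 * gamma scales the Gram matrix (the role of sgemm's alpha).
neg_norm = -gamma * np.einsum("ij,ij->i", X, X)
K = var * np.exp(neg_norm[:, None] + neg_norm[None, :] + 2.0 * gamma * (X @ X.T))
```

Both forms should agree to floating-point precision; the rearranged one just leaves less work for the element-wise expression.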

Tweaked implementations

Thus, with these two optimizations, we would have two more variants (if I could put it that way) of the numexpr method, listed below -

import numpy as np
import numexpr as ne
from scipy.linalg.blas import sgemm

def app1(X, gamma, var):
    # Negated squared row norms, so the signs flip inside the exponent
    X_norm = -np.einsum('ij,ij->i', X, X)
    return ne.evaluate('v * exp(g * (A + B + 2 * C))', {
        'A': X_norm[:, None],
        'B': X_norm[None, :],
        'C': np.dot(X, X.T),
        'g': gamma,
        'v': var
    })

def app2(X, gamma, var):
    # Fold -gamma into the norms and 2*gamma into sgemm's alpha
    X_norm = -gamma * np.einsum('ij,ij->i', X, X)
    return ne.evaluate('v * exp(A + B + C)', {
        'A': X_norm[:, None],
        'B': X_norm[None, :],
        'C': sgemm(alpha=2.0 * gamma, a=X, b=X, trans_b=True),  # single precision
        'v': var
    })

Runtime test

The numexpr-based one from your answer post -

def app0(X, gamma, var):
    X_norm = np.sum(X ** 2, axis = -1)
    return ne.evaluate('v * exp(-g * (A + B - 2 * C))', {
            'A' : X_norm[:,None],
            'B' : X_norm[None,:],
            'C' : np.dot(X, X.T),
            'g' : gamma,
            'v' : var
    })

Timings and verification -

In [165]: # Setup
     ...: X = np.random.randn(10000, 512)
     ...: gamma = 0.01
     ...: var = 5.0

In [166]: %timeit app0(X, gamma, var)
     ...: %timeit app1(X, gamma, var)
     ...: %timeit app2(X, gamma, var)
1 loop, best of 3: 1.25 s per loop
1 loop, best of 3: 1.24 s per loop
1 loop, best of 3: 973 ms per loop

In [167]: np.allclose(app0(X, gamma, var), app1(X, gamma, var))
Out[167]: True

In [168]: np.allclose(app0(X, gamma, var), app2(X, gamma, var))
Out[168]: True

Generate a Gaussian kernel given mean and standard deviation

You could use astropy, especially the Gaussian2D model from the astropy.modeling.models module:

import numpy as np
from astropy.modeling.models import Gaussian2D

g2d = Gaussian2D(x_mean=8, y_mean=10, x_stddev=3, y_stddev=3)  # specify properties

g2d(*np.mgrid[0:100, 0:100])  # specify the grid for the array
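If astropy is not available, a similar array can be sketched with NumPy alone. This is not astropy's implementation, just the axis-aligned Gaussian formula written out; the function name gaussian2d is made up, and the default amplitude of 1 is an assumption meant to mirror what I believe is Gaussian2D's default.

```python
import numpy as np

def gaussian2d(x, y, x_mean, y_mean, x_stddev, y_stddev, amplitude=1.0):
    """Axis-aligned 2D Gaussian evaluated at coordinates x, y (no rotation)."""
    return amplitude * np.exp(
        -((x - x_mean) ** 2 / (2 * x_stddev ** 2)
          + (y - y_mean) ** 2 / (2 * y_stddev ** 2))
    )

# np.mgrid returns the index along axis 0 first, then the index along axis 1,
# so x_mean here refers to the first array axis.
xg, yg = np.mgrid[0:100, 0:100]
img = gaussian2d(xg, yg, x_mean=8, y_mean=10, x_stddev=3, y_stddev=3)
```

Note that this is not normalized to sum to 1; divide by img.sum() if you need a convolution kernel.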


Linearly separating a Gaussian Filter and calculating with Numpy

For anyone interested, the problem came from the fact that the function gaussianKernel returned the 2D kernel normalised for use as a 2D kernel. This meant that when I split it up into its row and column components by taking the top row and left column, these components were not normalised.

To solve this, I just added a parameter to the gaussianKernel function to select 2 dimensions or 1 dimension (both normalised correctly):

import math
import numpy as np

def gaussianKernel(size, sigma, twoDimensional=True):
    c = (size - 1) / 2  # centre index of the kernel
    if twoDimensional:
        kernel = np.fromfunction(lambda x, y: (1/(2*math.pi*sigma**2)) * math.e ** (-((x-c)**2 + (y-c)**2) / (2*sigma**2)), (size, size))
    else:
        kernel = np.fromfunction(lambda x: math.e ** (-(x-c)**2 / (2*sigma**2)), (size,))
    return kernel / np.sum(kernel)  # normalise so the weights sum to 1

So now I can get just the 1D kernel with gaussianKernel(size, sigma, False), and have it be normalised correctly. This means I can finally get the right blurring effect without scaled pixel values.
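To illustrate why the top row of the 2D kernel is the wrong thing to use, here is a short sketch with two helper functions (hypothetical names, written with plain NumPy rather than np.fromfunction). It checks that the outer product of the normalised 1D kernel reproduces the normalised 2D kernel, while a single row of the 2D kernel sums to less than 1.

```python
import numpy as np

def gaussian_kernel_1d(size, sigma):
    """Normalised 1D Gaussian kernel centred on the middle sample."""
    x = np.arange(size) - (size - 1) / 2
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_kernel_2d(size, sigma):
    """Normalised 2D Gaussian kernel centred on the middle sample."""
    x = np.arange(size) - (size - 1) / 2
    k = np.exp(-(x[:, None] ** 2 + x[None, :] ** 2) / (2 * sigma ** 2))
    return k / k.sum()

k1 = gaussian_kernel_1d(5, 1.0)
k2 = gaussian_kernel_2d(5, 1.0)

separable = np.outer(k1, k1)   # matches k2
top_row_sum = k2[0].sum()      # less than 1, so this row darkens the image
```

Filtering rows and then columns with k1 should therefore give the same result as one pass with k2, without any change in overall brightness.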

How to get the Gaussian filter?

If you are looking for a Pythonic way of creating a 2D Gaussian filter, you can build it as the outer product of two 1D Gaussian filters.

Creating a single 1x5 Gaussian filter

import numpy as np
from scipy.stats import multivariate_normal

x = np.linspace(0, 5, 5, endpoint=False)
y = multivariate_normal.pdf(x, mean=2, cov=0.5)

Then change it into a 2D array

y = y.reshape(1,5)

Take the product of y.T with y (an outer product) to create a symmetric 2D Gaussian filter

GF = np.dot(y.T,y)

How to efficiently compute the heat map of two Gaussian distribution in Python?

Your approach is fine, except that you shouldn't loop over norm.pdf. Instead, pass all the values at which you want the kernel(s) evaluated in a single call, and then reshape the output to the desired shape of the image.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# create 2 kernels
m1 = (-1,-1)
s1 = np.eye(2)
k1 = multivariate_normal(mean=m1, cov=s1)

m2 = (1,1)
s2 = np.eye(2)
k2 = multivariate_normal(mean=m2, cov=s2)

# create a grid of (x,y) coordinates at which to evaluate the kernels
xlim = (-3, 3)
ylim = (-3, 3)
xres = 100
yres = 100

x = np.linspace(xlim[0], xlim[1], xres)
y = np.linspace(ylim[0], ylim[1], yres)
xx, yy = np.meshgrid(x,y)

# evaluate kernels at grid points
xxyy = np.c_[xx.ravel(), yy.ravel()]
zz = k1.pdf(xxyy) + k2.pdf(xxyy)

# reshape and plot image
img = zz.reshape(xx.shape)  # meshgrid output has shape (yres, xres)
plt.imshow(img); plt.show()


This approach shouldn't take too long:

In [26]: %timeit zz = k1.pdf(xxyy) + k2.pdf(xxyy)
1000 loops, best of 3: 1.16 ms per loop
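As a variation, the flatten-and-reshape step can be skipped, since multivariate_normal.pdf treats the last axis of its input as the components. The sketch below stacks the grid into a (yres, xres, 2) array and uses a deliberately non-square grid (120 by 80, an arbitrary choice) to show which axis is which.

```python
import numpy as np
from scipy.stats import multivariate_normal

k1 = multivariate_normal(mean=(-1, -1), cov=np.eye(2))
k2 = multivariate_normal(mean=(1, 1), cov=np.eye(2))

xres, yres = 120, 80
x = np.linspace(-3, 3, xres)
y = np.linspace(-3, 3, yres)
xx, yy = np.meshgrid(x, y)      # both have shape (yres, xres)

pos = np.dstack((xx, yy))       # shape (yres, xres, 2): one (x, y) pair per pixel
img = k1.pdf(pos) + k2.pdf(pos) # already image-shaped, no reshape needed

# Same values via the flatten-then-reshape route, for comparison
flat = np.c_[xx.ravel(), yy.ravel()]
img_flat = (k1.pdf(flat) + k2.pdf(flat)).reshape(xx.shape)
```

Either route evaluates the density in one vectorized call; the stacked form just keeps the image shape throughout.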

How to perform Gaussian pooling on a 2d array using numpy

First transform your M x N matrix into a (M//K) x K x (N//K) x K array (this requires M and N to be divisible by K),
then pointwise multiply with the kernel along the second and fourth dimensions,
then sum over the second and fourth dimensions.

np.sum(
    matrix.reshape((
        matrix.shape[-2] // kernel.shape[-2], kernel.shape[-2],
        matrix.shape[-1] // kernel.shape[-1], kernel.shape[-1],
    ))
    * kernel[np.newaxis, :, np.newaxis, :],
    axis=(-3, -1),
)

You can also replace the pointwise-multiply-then-sum with an np.tensordot call.

np.tensordot(
    matrix.reshape((
        matrix.shape[-2] // kernel.shape[-2], kernel.shape[-2],
        matrix.shape[-1] // kernel.shape[-1], kernel.shape[-1],
    )),
    kernel,
    axes=(
        (-3, -1),
        (-2, -1),
    )
)
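Here is a small worked example of the reshape-based pooling, wrapped in a helper and checked against an explicit loop over blocks. The function name gaussian_pool, the 6x6 input and the 3x3 binomial kernel (a common Gaussian approximation) are all assumptions made for the illustration.

```python
import numpy as np

def gaussian_pool(matrix, kernel):
    """Non-overlapping weighted pooling; matrix dims must be multiples of kernel dims."""
    kh, kw = kernel.shape
    m, n = matrix.shape
    blocks = matrix.reshape(m // kh, kh, n // kw, kw)
    # Contract the two within-block axes (1 and 3) against the kernel's axes
    return np.tensordot(blocks, kernel, axes=((1, 3), (0, 1)))

matrix = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]], dtype=float) / 16.0

pooled = gaussian_pool(matrix, kernel)   # shape (2, 2)

# Reference: weight and sum each 3x3 block explicitly
expected = np.array([
    [(matrix[3 * i:3 * i + 3, 3 * j:3 * j + 3] * kernel).sum() for j in range(2)]
    for i in range(2)
])
```

The loop is only there as a sanity check; the reshape version does the same arithmetic without any Python-level iteration.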
