
# How to Normalize a Numpy Array to Within a Certain Range

## Scale Numpy array to certain range

After asking on CodeReview, I was informed there is a built-in `np.interp` that accomplishes this:

```python
np.interp(a, (a.min(), a.max()), (-1, +1))
```
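For instance, a small hypothetical array maps linearly onto [-1, 1], with the minimum landing on -1 and the maximum on +1:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# Map the data range (a.min(), a.max()) onto the target range (-1, +1)
scaled = np.interp(a, (a.min(), a.max()), (-1, +1))
print(scaled)  # [-1.  -0.5  0.   0.5  1. ]
```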

I've left my old answer below for the sake of posterity.

I made my own function based on the `D3.js` code in this answer:

```python
import numpy as np

def d3_scale(dat, out_range=(-1, 1)):
    domain = [np.min(dat, axis=0), np.max(dat, axis=0)]

    def interp(x):
        return out_range[0] * (1.0 - x) + out_range[1] * x

    def uninterp(x):
        b = 0
        if (domain[1] - domain[0]) != 0:
            b = domain[1] - domain[0]
        else:
            b = 1.0 / domain[1]
        return (x - domain[0]) / b

    return interp(uninterp(dat))

print(d3_scale(np.array([-2, 0, 2], dtype=float)))
print(d3_scale(np.array([-3, -2, -1], dtype=float)))
```

## Min-max normalisation of a NumPy array

Referring to this Cross Validated question, How to normalize data to 0-1 range?, you can perform min-max normalisation on the last column of `foo`:

```python
v = foo[:, 1]   # foo[:, -1] for the last column
foo[:, 1] = (v - v.min()) / (v.max() - v.min())
```

```
foo
array([[ 0.        ,  0.        ],
       [ 0.13216   ,  0.06609523],
       [ 0.25379   ,  1.        ],
       [ 0.30874   ,  0.09727968]])
```

Another option for performing normalisation (as suggested by the OP) is `sklearn.preprocessing.normalize`, which yields slightly different results:

```python
from sklearn.preprocessing import normalize
foo[:, [-1]] = normalize(foo[:, -1, None], norm='max', axis=0)
```

```
foo
array([[ 0.        ,  0.2378106 ],
       [ 0.13216   ,  0.28818769],
       [ 0.25379   ,  1.        ],
       [ 0.30874   ,  0.31195614]])
```
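The difference between the two is worth spelling out: min-max scaling maps the column minimum to 0 and the maximum to 1, whereas `norm='max'` only divides by the column maximum, so the minimum stays above 0 for positive data. A minimal sketch on hypothetical values:

```python
import numpy as np

v = np.array([2.0, 5.0, 10.0])

# Min-max: maps min -> 0 and max -> 1
minmax = (v - v.min()) / (v.max() - v.min())
print(minmax)   # [0.    0.375 1.   ]

# norm='max' (as sklearn does for a column): divides by the max absolute value
by_max = v / np.abs(v).max()
print(by_max)   # [0.2 0.5 1. ]
```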

## how numpy.ndarray can be normalized?

### Vectorized is much faster than iterative

If you want to scale the pixel values of all your images using `numpy` arrays only, you may want to keep the vectorized nature of the operation (by avoiding loops).

Here is a way to scale your images:

```python
# Getting min and max per image
maxis = images.max(axis=(1,2,3))
minis = images.min(axis=(1,2,3))

# Scaling without any loop
scaled_images = ((images.T - minis) / (maxis - minis) * 255).T
# timeit > 178 µs ± 1.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```

The transposes `.T` are necessary here so that the subtraction broadcasts correctly.

We can check if this is correct:

```python
print((scaled_images.min(axis=(1,2,3)) == 0).all())
# > True
print((scaled_images.max(axis=(1,2,3)) == 255).all())
# > True
```
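An equivalent formulation avoids the transposes entirely by passing `keepdims=True` to `min`/`max`, which keeps the per-image extrema in shape `(n, 1, 1, 1)` so they broadcast directly. A sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3)) * 100  # 4 hypothetical RGB images

# keepdims=True keeps shape (4, 1, 1, 1), so broadcasting works
# against (4, 8, 8, 3) without the transpose trick
maxis = images.max(axis=(1, 2, 3), keepdims=True)
minis = images.min(axis=(1, 2, 3), keepdims=True)
scaled = (images - minis) / (maxis - minis) * 255

print((scaled.min(axis=(1, 2, 3)) == 0).all())    # True
print((scaled.max(axis=(1, 2, 3)) == 255).all())  # True
```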
### Scaling into the [0, 1] range

If you want pixel values between `0` and `1`, simply remove the `* 255` multiplication:

```python
scaled_images = ((images.T - minis) / (maxis - minis)).T
```
### Only with NumPy arrays

You must also make sure you are handling a `numpy` array in the first place, not a `list`:

```python
import numpy as np
images = np.array(images)
```
### OpenCV: on-the-go scaling

Since you are using OpenCV to read your images one by one, you can normalize them on the go:

```python
import os
import cv2

inputPath = 'E:/Notebooks/data'
max_scale = 1   # or 255 if needed

# Load and normalize the images in one pass
images = [cv2.normalize(
              cv2.imread(inputPath + '/{0}'.format(filepath),
                         flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)),
              None, 0, max_scale, cv2.NORM_MINMAX)
          for filepath in os.listdir(inputPath)]
```
### Make sure you have images in the folder
```python
import os
import cv2

inputPath = 'E:/Notebooks/data'
images = []
max_scale = 1   # or 255 if needed

# Load in the images
for filepath in os.listdir(inputPath):
    image = cv2.imread(inputPath + '/{0}'.format(filepath),
                       flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH))
    # Scale and append to the list only if it is an image
    if image is not None:
        images.append(cv2.normalize(image, None, 0, max_scale, cv2.NORM_MINMAX))
```
### Bug in OpenCV versions prior to 3.4

As reported here, there is a bug in OpenCV's `normalize` method that can produce values below the `alpha` parameter. It was fixed in version 3.4.

Here is a way to scale images on the go with older versions of OpenCV:

```python
import os
import cv2

def custom_scale(img, max_scale=1):
    mini = img.min()
    return (img - mini) / (img.max() - mini) * max_scale

inputPath = 'E:/Notebooks/data'
max_scale = 1   # or 255 if needed
images = [custom_scale(
              cv2.imread(inputPath + '/{0}'.format(filepath),
                         flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)),
              max_scale)
          for filepath in os.listdir(inputPath)]
```
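A quick sanity check of `custom_scale` on a synthetic array (no OpenCV needed, since the function is pure NumPy):

```python
import numpy as np

def custom_scale(img, max_scale=1):
    mini = img.min()
    return (img - mini) / (img.max() - mini) * max_scale

img = np.array([[10.0, 20.0], [30.0, 50.0]])  # hypothetical pixel values
out = custom_scale(img, max_scale=255)
print(out.min(), out.max())  # 0.0 255.0
```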

## How to scale a numpy array from 0 to 1 with overshoot?

First, transform the DataFrame column to a numpy array:

```python
import numpy as np
T = np.array(df['Temp'])
```

Then scale it to a [0, 1] interval:

```python
def scale(A):
    return (A - np.min(A)) / (np.max(A) - np.min(A))

T_scaled = scale(T)
```

Then map it to any interval you want, e.g. [55, 100]:

```python
T2 = 55 + 45 * T_scaled
```
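Putting the two steps together on hypothetical temperature values: the minimum maps to 55 and the maximum to 100, with everything else spaced linearly in between:

```python
import numpy as np

T = np.array([12.0, 18.0, 24.0, 30.0])  # hypothetical temperatures

# Scale to [0, 1], then stretch/shift to [55, 100]
T_scaled = (T - T.min()) / (T.max() - T.min())
T2 = 55 + 45 * T_scaled
print(T2)  # [ 55.  70.  85. 100.]
```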

I'm sure this can be done within pandas too (but I'm not familiar with it). Perhaps you might study pandas' `df.apply()`.
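In pandas the same scaling can in fact be written directly on the Series, without `apply()`, since arithmetic on a Series is already vectorized. A sketch, assuming a hypothetical `Temp` column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Temp': [12.0, 18.0, 24.0, 30.0]})  # hypothetical data

t = df['Temp']
df['Temp_scaled'] = 55 + 45 * (t - t.min()) / (t.max() - t.min())
print(df['Temp_scaled'].tolist())
```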

## How To Normalize Array Between 1 and 10?

Your range is actually 9 wide: from 1 to 10. If you multiply the normalized array by 9, you get values from 0 to 9, which you then shift up by 1:

```python
start = 1
end = 10
width = end - start
res = (arr - arr.min()) / (arr.max() - arr.min()) * width + start
```

Note that the denominator here has a NumPy built-in: `arr.ptp()` (peak-to-peak):

```python
res = (arr - arr.min()) / arr.ptp() * width + start
```
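A worked example on hypothetical data, confirming that the result spans exactly [1, 10]:

```python
import numpy as np

arr = np.array([4.0, 8.0, 10.0, 14.0])  # hypothetical input
start, end = 1, 10
width = end - start

# arr.ptp() is arr.max() - arr.min(), here 10.0
res = (arr - arr.min()) / arr.ptp() * width + start
print(res.min(), res.max())  # 1.0 10.0
```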