Vectorized Numpy Linspace for Multiple Start and Stop Values

Here's an approach using broadcasting -

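import numpy as np

def create_ranges(start, stop, N, endpoint=True):
    # With endpoint=True the last sample equals stop, so there are N-1 steps
    if endpoint:
        divisor = N-1
    else:
        divisor = N
    # Per-row step size, broadcast against the sample indices 0..N-1
    steps = (1.0/divisor) * (stop - start)
    return steps[:,None]*np.arange(N) + start[:,None]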

Sample run -

In [22]: # Setup start, stop for each row and no. of elems in each row
    ...: start = np.array([1,4,2])
    ...: stop  = np.array([6,7,6])
    ...: N = 5
    ...: 

In [23]: create_ranges(start, stop, 5)
Out[23]: 
array([[ 1.  ,  2.25,  3.5 ,  4.75,  6.  ],
       [ 4.  ,  4.75,  5.5 ,  6.25,  7.  ],
       [ 2.  ,  3.  ,  4.  ,  5.  ,  6.  ]])

In [24]: create_ranges(start, stop, 5, endpoint=False)
Out[24]: 
array([[ 1. ,  2. ,  3. ,  4. ,  5. ],
       [ 4. ,  4.6,  5.2,  5.8,  6.4],
       [ 2. ,  2.8,  3.6,  4.4,  5.2]])
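As a cross-check, the broadcasting result can be compared against one np.linspace call per row. The sketch below also assumes NumPy 1.16 or newer, where np.linspace itself accepts array-valued start and stop; on such versions the helper may not be needed at all -

```python
import numpy as np

def create_ranges(start, stop, N, endpoint=True):
    # Same broadcasting helper as above, repeated so this snippet runs on its own
    divisor = N - 1 if endpoint else N
    steps = (1.0 / divisor) * (stop - start)
    return steps[:, None] * np.arange(N) + start[:, None]

start = np.array([1, 4, 2])
stop = np.array([6, 7, 6])
N = 5

# Reference: one np.linspace call per (start, stop) pair
looped = np.array([np.linspace(a, b, N) for a, b in zip(start, stop)])

# Assumes NumPy >= 1.16: array start/stop, samples placed along the last axis
builtin = np.linspace(start, stop, N, axis=-1)

vectorized = create_ranges(start, stop, N)
print(np.allclose(vectorized, looped), np.allclose(vectorized, builtin))
```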

Leveraging multiple cores

For large inputs, the numexpr module can evaluate the same expression on multiple cores and without large temporary arrays, which reduces memory use and usually improves performance -

import numpy as np
import numexpr as ne

def create_ranges_numexpr(start, stop, N, endpoint=True):
    if endpoint==1:
        divisor = N-1
    else:
        divisor = N
    s0 = start[:,None]
    s1 = stop[:,None]
    r = np.arange(N)
    return ne.evaluate('((1.0/divisor) * (s1 - s0))*r + s0')

Vectorized NumPy linspace across multi-dimensional arrays

Here's a vectorized approach, based on the same broadcasting idea as create_ranges, that covers generic n-dim cases -

def create_ranges_nd(start, stop, N, endpoint=True):
    if endpoint==1:
        divisor = N-1
    else:
        divisor = N
    steps = (1.0/divisor) * (stop - start)
    return start[...,None] + steps[...,None]*np.arange(N)

Sample run -

In [536]: mins = np.array([[3,5],[2,4]])

In [537]: maxs = np.array([[13,16],[11,12]])

In [538]: create_ranges_nd(mins, maxs, 6)
Out[538]: 
array([[[  3. ,   5. ,   7. ,   9. ,  11. ,  13. ],
        [  5. ,   7.2,   9.4,  11.6,  13.8,  16. ]],

       [[  2. ,   3.8,   5.6,   7.4,   9.2,  11. ],
        [  4. ,   5.6,   7.2,   8.8,  10.4,  12. ]]])
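To make the output layout explicit, here is a small sketch that repeats the helper and inspects the result: the samples land on a new trailing axis, and np.moveaxis can relocate them if another layout is preferred (the name samples_first is only illustrative) -

```python
import numpy as np

def create_ranges_nd(start, stop, N, endpoint=True):
    # Same helper as above, repeated so this snippet runs on its own
    divisor = N - 1 if endpoint else N
    steps = (1.0 / divisor) * (stop - start)
    return start[..., None] + steps[..., None] * np.arange(N)

mins = np.array([[3, 5], [2, 4]])
maxs = np.array([[13, 16], [11, 12]])

out = create_ranges_nd(mins, maxs, 6)
print(out.shape)  # input shape (2, 2) plus a trailing axis of length 6

# Each sequence starts at its own minimum and, with endpoint=True, ends at its maximum
print(np.allclose(out[..., 0], mins), np.allclose(out[..., -1], maxs))

# If the samples should run along the first axis instead, move the last axis to the front
samples_first = np.moveaxis(out, -1, 0)
print(samples_first.shape)
```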

How to apply linspace between each element in numpy vector?

You are looking for linear interpolation over a 1-d array, which can be done with numpy.interp.

import numpy as np

a = np.array([1, 4, 2])   # example input, consistent with the output shown below
s = 4       # number of intervals between two consecutive values
l = (a.size - 1) * s + 1          # total length after interpolation
np.interp(np.arange(l), np.arange(l, step=s), a)        # interpolate

# array([1.  , 1.75, 2.5 , 3.25, 4.  , 3.5 , 3.  , 2.5 , 2.  ])
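For reuse, the same idea can be wrapped in a function and checked against a piecewise np.linspace construction. This is only a sketch; the name interp_between and the sample input are assumptions, not part of the original answer -

```python
import numpy as np

def interp_between(a, s):
    """Insert s - 1 evenly spaced values between consecutive elements of a."""
    a = np.asarray(a, dtype=float)
    total = (a.size - 1) * s + 1
    # Original values sit at positions 0, s, 2s, ...; np.interp fills the gaps linearly
    return np.interp(np.arange(total), np.arange(total, step=s), a)

a = np.array([1, 4, 2])
dense = interp_between(a, 4)

# Reference: one half-open linspace per consecutive pair, plus the final element
pieces = [np.linspace(lo, hi, 4, endpoint=False) for lo, hi in zip(a[:-1], a[1:])]
reference = np.concatenate(pieces + [a[-1:]])

print(dense)
print(np.allclose(dense, reference))
```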

How can I vectorize linspace in numpy

Build your own:

def vlinspace(a, b, N, endpoint=True):
    a, b = np.asanyarray(a), np.asanyarray(b)
    return a[..., None] + (b-a)[..., None]/(N-endpoint) * np.arange(N)
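A short usage sketch, with made-up inputs, showing that plain lists work (thanks to np.asanyarray) and that endpoint=False divides by N instead of N - 1 -

```python
import numpy as np

def vlinspace(a, b, N, endpoint=True):
    # Same helper as above, repeated so this snippet runs on its own
    a, b = np.asanyarray(a), np.asanyarray(b)
    return a[..., None] + (b - a)[..., None] / (N - endpoint) * np.arange(N)

# List inputs are converted to arrays; one row per (a, b) pair
rows = vlinspace([1, 4, 2], [6, 7, 6], 5)

# endpoint=False: the stop value itself is excluded from every row
half_open = vlinspace([0, 10], [1, 20], 4, endpoint=False)

print(rows)
print(half_open)
```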

Numpy modify each array in multidimensional array with arange

Since by necessity all of the aranges need to be equally long, we can create an arange along the first entry and then replicate it for the others.

For example:

x = np.array([[78, 82],
              [90, 94],
              [102, 106]])

>>> x[:, :1] + np.arange(0, 1 + x[0, 1] - x[0, 0])
# array([[ 78,  79,  80,  81,  82],
#        [ 90,  91,  92,  93,  94],
#        [102, 103, 104, 105, 106]])
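Whether the stop value belongs in each row is a difference of one in the arange length. A hedged sketch that makes this choice explicit and also verifies the equal-length assumption (expand_rows and its inclusive flag are hypothetical names) -

```python
import numpy as np

def expand_rows(x, inclusive=False):
    """Expand each [start, stop] row into consecutive integers via broadcasting."""
    x = np.asarray(x)
    widths = x[:, 1] - x[:, 0]
    # Broadcasting a single arange only works if every row spans the same width
    if not (widths == widths[0]).all():
        raise ValueError("all rows must span the same width")
    length = widths[0] + (1 if inclusive else 0)
    return x[:, :1] + np.arange(length)

x = np.array([[78, 82],
              [90, 94],
              [102, 106]])

print(expand_rows(x))                  # stop excluded, like np.arange(start, stop)
print(expand_rows(x, inclusive=True))  # stop included
```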

Get Start and Stop Values For Incrementing Groups in NumPy Vector

If your input is called x:

r = np.full(len(x),2)
d = np.diff(x)==1
r[1:]-=d
r[:-1]-=d 
np.repeat(x,r).reshape(-1,2)

Output:

array([[  1,   2],
       [  6,   6],
       [ 12,  14],
       [ 16,  16],
...

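This works by repeating each item twice by default and subtracting one repeat for each direct neighbor (left or right) that differs from it by exactly 1. An item at the left or right end of a stretch is therefore repeated once, an item inside a stretch is repeated zero times, and an isolated item keeps both repeats and so serves as its own start and stop.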
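A self-contained run of the same steps, with the intermediate repeat counts printed. The input x here is a hypothetical example chosen to reproduce the first rows of the output above -

```python
import numpy as np

# Hypothetical input: sorted integers containing runs of consecutive values
x = np.array([1, 2, 6, 12, 13, 14, 16])

r = np.full(len(x), 2)      # start with two repeats per item
d = np.diff(x) == 1         # True where an item and its right neighbor are consecutive
r[1:] -= d                  # one repeat fewer for items with a consecutive left neighbor
r[:-1] -= d                 # one repeat fewer for items with a consecutive right neighbor

print(r)                    # repeats per item: ends of a run 1, interior 0, isolated 2
pairs = np.repeat(x, r).reshape(-1, 2)
print(pairs)
```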

Numpy-vectorized function to repeat blocks of consecutive elements

Here's one vectorized approach using cumsum -

# Group ID for every repeated sequence: group i appears repeats[i] times
r1 = np.repeat(np.arange(len(sizes)), repeats)

# Get total size of output array, as needed to initialize output indexing array
N = (sizes*repeats).sum() # or np.dot(sizes, repeats)

# Initialize the indexing array with ones, so that a final cumulative sum
# yields incremental indices within each sequence.
# Two steps here:
# 1. Each group is emitted as several sequences, so at the start of every
# sequence offset the index back by the length of the preceding sequence.
id_ar = np.ones(N, dtype=int)
id_ar[0] = 0
insert_index = sizes[r1[:-1]].cumsum()
insert_val = (1-sizes)[r1[:-1]]

# 2. At each group boundary, the indexing must move on to the next group's
# first element, so simply assign 1 there.
insert_val[r1[1:] != r1[:-1]] = 1

# Assign index-offsetting values
id_ar[insert_index] = insert_val

# Finally index into input array for the group repeated o/p
out = a[id_ar.cumsum()]
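The steps above can be collected into one function and checked against a simple loop. This is a sketch under assumed inputs: the name repeat_blocks and the sample values of a, sizes and repeats are illustrative, and every group is assumed to have size and repeat count of at least 1 -

```python
import numpy as np

def repeat_blocks(a, sizes, repeats):
    """Repeat consecutive blocks of a: block i has sizes[i] items, emitted repeats[i] times."""
    r1 = np.repeat(np.arange(len(sizes)), repeats)   # group ID per emitted sequence
    N = (sizes * repeats).sum()                      # total output length
    id_ar = np.ones(N, dtype=int)
    id_ar[0] = 0
    insert_index = sizes[r1[:-1]].cumsum()           # start position of each later sequence
    insert_val = (1 - sizes)[r1[:-1]]                # jump back to the block's first element
    insert_val[r1[1:] != r1[:-1]] = 1                # at a group boundary, just step forward
    id_ar[insert_index] = insert_val
    return a[id_ar.cumsum()]

a = np.array([3, 7, 9, 2, 4])
sizes = np.array([2, 3])      # blocks: [3, 7] and [9, 2, 4]
repeats = np.array([2, 3])    # first block twice, second block three times

out = repeat_blocks(a, sizes, repeats)

# Loop-based reference for comparison
starts = np.r_[0, sizes.cumsum()[:-1]]
expected = np.concatenate([np.tile(a[s:s + n], k) for s, n, k in zip(starts, sizes, repeats)])
print(out)
print(np.array_equal(out, expected))
```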

Why do the values in the for loop not correspond to those in Numpys arange and linspace?

TL;DR: It's a display issue. numpy.ndarray rounds values when printing, while a plain loop prints each float in full. You can customize the printing:

import numpy as np
np.set_printoptions(precision=20)
seq = np.linspace(0.01, 0.09,9)
print(seq)

[0.01                 0.02                 0.03
 0.04                 0.05                 0.060000000000000005
 0.06999999999999999  0.08                 0.09                ]

See How to pretty-print a numpy.array without scientific notation and with given precision?

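The array still holds ordinary floats, so "Is floating point math broken?" applies. Printing the elements individually shows the same values whether they come from the array or from a list: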

import numpy as np
seq = np.linspace(0.01, 0.09,9)
seq2 = []
for i in seq:
    seq2.append(i)

print(*seq)
print(*seq2)

Output:

0.01 0.02 0.03 0.04 0.05 0.060000000000000005 0.06999999999999999 0.08 0.09
0.01 0.02 0.03 0.04 0.05 0.060000000000000005 0.06999999999999999 0.08 0.09
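Because the stored values are identical, comparisons should use a tolerance rather than exact equality. A minimal sketch, assuming NumPy 1.15 or newer for the np.printoptions context manager -

```python
import numpy as np

seq = np.linspace(0.01, 0.09, 9)
seq2 = [v for v in seq]            # same floats, collected in a plain list

# The list and the array hold exactly the same values; only the display differs
same_values = all(a == b for a, b in zip(seq, seq2))

# Compare against the "nice" decimals with a tolerance instead of ==
target = np.arange(1, 10) / 100.0
close = np.isclose(seq, target)

# Temporarily change how the array is printed without touching global settings
with np.printoptions(precision=3):
    print(seq)

print(same_values, close.all())
```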

Creating vectorized numpy.meshgrid off 2D arrays to create 3D meshes

Here's one using the vectorized linspace helper create_ranges -

# https://stackoverflow.com/a/40624614/ @Divakar
def create_ranges(start, stop, N, endpoint=True):
    if endpoint==1:
        divisor = N-1
    else:
        divisor = N
    steps = (1.0/divisor) * (stop - start)
    return steps[:,None]*np.arange(N) + start[:,None]

def linspace_nd(x,y,gridrez):
    a1 = create_ranges(x.min(1), x.max(1), N=gridrez, endpoint=True)
    a2 = create_ranges(y.min(1), y.max(1), N=gridrez, endpoint=True)
    out_shp = a1.shape + (a2.shape[1],)
    Xout = np.broadcast_to(a1[:,None,:], out_shp)
    Yout = np.broadcast_to(a2[:,:,None], out_shp)
    return Xout, Yout

The final outputs of linspace_nd are 3D mesh views into the vectorized linspace outputs. No data is copied, so they are memory-efficient and fast to create.

Alternatively, if you need outputs with their own memory spaces and not the views, you can use np.repeat for the replications -

Xout = np.repeat(a1[:,None,:],a2.shape[1],axis=1)
Yout = np.repeat(a2[:,:,None],a1.shape[1],axis=2)
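The difference between the two variants can be inspected directly. The sketch below uses small stand-in arrays for a1 and a2 (hypothetical values) and shows that the np.broadcast_to result is a read-only view with a zero stride along the replicated axis, while np.repeat returns an independent, writable copy -

```python
import numpy as np

# Stand-ins for the create_ranges outputs: 3 rows, 4 samples each
a1 = np.arange(12.0).reshape(3, 4)
a2 = a1 + 100
out_shp = a1.shape + (a2.shape[1],)

X_view = np.broadcast_to(a1[:, None, :], out_shp)          # no data copied
X_copy = np.repeat(a1[:, None, :], a2.shape[1], axis=1)    # owns its memory

print(X_view.shape, X_copy.shape)
print(X_view.flags.writeable, np.shares_memory(X_view, a1), X_view.strides[1])
print(X_copy.flags.writeable, np.shares_memory(X_copy, a1))

# Same values either way; only the memory layout differs
print(np.array_equal(X_view, X_copy))
```

If the mesh has to be modified in place, the np.repeat version (or X_view.copy()) is the one to use, since writing to the broadcast view raises an error.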

Timings to create such an array with views -

In [406]: np.random.seed(0)
     ...: x = np.random.rand(1000,5)
     ...: y = np.random.rand(1000,5)

In [408]: %timeit linspace_nd(x,y,gridrez=10)
1000 loops, best of 3: 221 µs per loop
