Numpy: Fix array with rows of different lengths by filling the empty elements with zeros
This could be one approach -
import numpy as np

def numpy_fillna(data):
    # Get lengths of each row of data
    lens = np.array([len(i) for i in data])
    # Mask of valid places in each row
    mask = np.arange(lens.max()) < lens[:, None]
    # Setup output array and put elements from data into masked positions
    out = np.zeros(mask.shape, dtype=data.dtype)
    out[mask] = np.concatenate(data)
    return out
Sample input, output -
In [222]: # Input object dtype array (recent NumPy needs dtype=object for ragged rows)
     ...: data = np.array([[1, 2, 3, 4],
     ...:                  [2, 3, 1],
     ...:                  [5, 5, 5, 5, 8, 9, 5],
     ...:                  [1, 1]], dtype=object)

In [223]: numpy_fillna(data)
Out[223]:
array([[1, 2, 3, 4, 0, 0, 0],
       [2, 3, 1, 0, 0, 0, 0],
       [5, 5, 5, 5, 8, 9, 5],
       [1, 1, 0, 0, 0, 0, 0]], dtype=object)
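The function above reads data.dtype, so it expects a NumPy array as input. As a minimal sketch of the same masking idea for a plain list of lists, the variant below infers the dtype from the flattened values instead. The name pad_ragged and the fill_value parameter are hypothetical, and the sketch assumes every row is non-empty.

```python
import numpy as np

def pad_ragged(rows, fill_value=0):
    """Pad a list of 1-D sequences into a rectangular array (hypothetical helper)."""
    # Length of each row
    lens = np.array([len(r) for r in rows])
    # True where a row has a real element, False where padding goes
    mask = np.arange(lens.max()) < lens[:, None]
    # Flatten all rows; the result's dtype is reused for the output
    flat = np.concatenate([np.asarray(r) for r in rows])
    out = np.full(mask.shape, fill_value, dtype=flat.dtype)
    # Boolean-mask assignment fills the valid positions in row-major order
    out[mask] = flat
    return out

rows = [[1, 2, 3, 4], [2, 3, 1], [5, 5, 5, 5, 8, 9, 5], [1, 1]]
padded = pad_ragged(rows)
print(padded)
```

Because the dtype comes from the data, the result here is an ordinary integer array rather than an object array.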
Fill array with rows of different lengths Python
I think this should do it:
def fill(a):
    length = max([len(i) for i in a])
    return [[0] * (length - len(i)) + i for i in a]

fill(mylist)
# [[0, 0, 1], [0, 1, 2], [1, 2, 3]]
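Note that this answer pads on the left. A small sketch showing left and right padding side by side follows; the contents of mylist are an assumption (the original input is not shown here), chosen so that left padding reproduces the output above, and the names fill_left and fill_right are hypothetical.

```python
# Assumed input: not shown in the original answer
mylist = [[1], [1, 2], [1, 2, 3]]

def fill_left(a):
    """Prepend zeros so every row reaches the longest row's length."""
    length = max(len(row) for row in a)
    return [[0] * (length - len(row)) + row for row in a]

def fill_right(a):
    """Append zeros so every row reaches the longest row's length."""
    length = max(len(row) for row in a)
    return [row + [0] * (length - len(row)) for row in a]

left = fill_left(mylist)
right = fill_right(mylist)
print(left)   # [[0, 0, 1], [0, 1, 2], [1, 2, 3]]
print(right)  # [[1, 0, 0], [1, 2, 0], [1, 2, 3]]
```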
python: padding with zero in the end of every array in Numpy array of arrays
For a numpy.pad solution, we first need to make sure the input is exactly as you have it, a list of 1-D arrays. Then it is just:
a = [
    np.asarray([1, 2, 3, 4]),
    np.asarray([3, 56]),
    np.asarray([8, 4, 8, 4, 9, 33, 55])
]
max_len = max([len(x) for x in a])
output = [np.pad(x, (0, max_len - len(x)), 'constant') for x in a]
print(output)
>>> [
    array([1, 2, 3, 4, 0, 0, 0]),
    array([ 3, 56,  0,  0,  0,  0,  0]),
    array([ 8,  4,  8,  4,  9, 33, 55])
]
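The result above is still a list of separate arrays. If a single 2-D array is wanted, one possible follow-up (a sketch, not part of the original answer) is to stack the padded rows with np.stack, which works because they now all have the same length:

```python
import numpy as np

a = [
    np.asarray([1, 2, 3, 4]),
    np.asarray([3, 56]),
    np.asarray([8, 4, 8, 4, 9, 33, 55]),
]

max_len = max(len(x) for x in a)
# Pad only at the end of each row: (0, n) means 0 before, n after
padded = [np.pad(x, (0, max_len - len(x)), mode="constant", constant_values=0) for x in a]
# Equal-length rows can be stacked into one 2-D array
matrix = np.stack(padded)
print(matrix.shape)  # (3, 7)
```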
python how to pad numpy array with zeros
Very simple, you create an array containing zeros using the reference shape:
result = np.zeros(b.shape)
# actually you can also use result = np.zeros_like(b)
# but that also copies the dtype not only the shape
and then insert the array where you need it:
result[:a.shape[0],:a.shape[1]] = a
and voila you have padded it:
print(result)
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])
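The arrays a and b come from the question and are not defined here. As a self-contained sketch, assume a is a 3x5 array of ones and b is a 4x6 integer reference array (both shapes are assumptions inferred from the printed output). The example also illustrates the dtype remark above: np.zeros(b.shape) gives floats, while np.zeros_like(b) keeps the dtype of b.

```python
import numpy as np

a = np.ones((3, 5))             # assumed: the array to pad
b = np.ones((4, 6), dtype=int)  # assumed: reference with the target shape

# Only the shape of b is used, so the result is float64
result_float = np.zeros(b.shape)
result_float[:a.shape[0], :a.shape[1]] = a

# zeros_like copies shape AND dtype, so the result is an integer array
result_int = np.zeros_like(b)
result_int[:a.shape[0], :a.shape[1]] = a  # float ones are cast to int on assignment

print(result_float.dtype, result_int.dtype)
```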
You can also make it a bit more general if you define where your upper left element should be inserted:
result = np.zeros_like(b)
x_offset = 1 # 0 would be what you wanted
y_offset = 1 # 0 in your case
result[x_offset:a.shape[0]+x_offset,y_offset:a.shape[1]+y_offset] = a
result
array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.]])
But be careful that the offsets are not bigger than allowed. With x_offset = 2, for example, this will fail, because the shifted slice no longer fits inside result.
If you have an arbitrary number of dimensions, you can define a list of slices to insert the original array. I found it interesting to play around a bit and created a padding function that can pad (with offset) an arbitrarily shaped array, as long as the array and the reference have the same number of dimensions and the offsets are not too big.
def pad(array, reference, offsets):
    """
    array: Array to be padded
    reference: Reference array with the desired shape
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference.shape)
    # Create a tuple of slices from offset to offset + shape in each dimension
    # (indexing with a list of slices is not supported by recent NumPy)
    insertHere = tuple(slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim))
    # Insert the array in the result at the specified offsets
    result[insertHere] = array
    return result
And some test cases:
import numpy as np
# 1 Dimension
a = np.ones(2)
b = np.ones(5)
offset = [3]
pad(a, b, offset)
# 3 Dimensions
a = np.ones((3,3,3))
b = np.ones((5,4,3))
offset = [1,0,0]
pad(a, b, offset)
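For comparison, the same offset-based padding can be sketched with np.pad by turning each offset into a (before, after) pad width. The helper name pad_with_offsets is hypothetical, and unlike the function above it takes the target shape directly rather than a reference array. It also raises an explicit error when an offset is too large.

```python
import numpy as np

def pad_with_offsets(array, reference_shape, offsets):
    """Zero-pad `array` to `reference_shape`, placing it at `offsets` (hypothetical helper)."""
    pad_width = []
    for size, target, before in zip(array.shape, reference_shape, offsets):
        # Zeros left over after the inserted block in this dimension
        after = target - before - size
        if before < 0 or after < 0:
            raise ValueError("offset too large for the reference shape")
        pad_width.append((before, after))
    return np.pad(array, pad_width, mode="constant", constant_values=0)

# 1 dimension: two ones placed at index 3 of a length-5 result
out1 = pad_with_offsets(np.ones(2), (5,), [3])
print(out1)  # [0. 0. 0. 1. 1.]

# 3 dimensions: a 3x3x3 block shifted by one along the first axis
out3 = pad_with_offsets(np.ones((3, 3, 3)), (5, 4, 3), [1, 0, 0])
print(out3.shape)  # (5, 4, 3)
```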
Filling empty list with zero vector using numpy
Edit: I didn't realize that all of the non-empty features were the same length. If that is the case, then you can just use the length of the first non-empty one. I added a function that does that.
import numpy as np

f0 = [0, 1, 2]
f1 = []
f2 = [4, 5, 6]
features = [f0, f1, f2]

def get_nonempty_len(features):
    """
    returns the length of the first non-empty element
    of features.
    """
    for f in features:
        if len(f) > 0:
            return len(f)
    return 0

def generate_matrix(features):
    rows = len(features)
    cols = get_nonempty_len(features)
    m = np.zeros((rows, cols))
    for i, f in enumerate(features):
        # Empty features leave their row as zeros
        m[i, :len(f)] = f
    return m

print(generate_matrix(features))
Output looks like:
[[ 0.  1.  2.]
 [ 0.  0.  0.]
 [ 4.  5.  6.]]
Zero pad numpy array
For your use case you can use the resize() method:
A = np.array([1,2,3,4,5])
A.resize(8)
This resizes A in place. If there are other references to A, NumPy raises a ValueError, because the referenced value would be updated too. To allow this, add the refcheck=False option.
The documentation states that missing values will be 0:
Enlarging an array: as above, but missing entries are filled with zeros
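A short sketch of the in-place resize next to two non-in-place alternatives. It passes refcheck=False so that it also runs in environments that hold extra references to the array (for example interactive shells); that is only safe when nothing else relies on the old buffer. Note that the function np.resize behaves differently from the method: it fills with repeated copies of the data instead of zeros.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])
# In-place enlargement; refcheck=False skips the reference check
A.resize(8, refcheck=False)
print(A)  # [1 2 3 4 5 0 0 0]

B = np.array([1, 2, 3, 4, 5])
# np.pad returns a new, zero-padded array and leaves B untouched
padded = np.pad(B, (0, 3), mode="constant")
# np.resize (the function) repeats the data rather than adding zeros
repeated = np.resize(B, 8)
print(padded)    # [1 2 3 4 5 0 0 0]
print(repeated)  # [1 2 3 4 5 1 2 3]
```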
How to make a multidimension numpy array with a varying row size?
While Numpy knows about arrays of arbitrary objects, it's optimized for homogeneous arrays of numbers with fixed dimensions. If you really need arrays of arrays, better use a nested list. But depending on the intended use of your data, different data structures might be even better, e.g. a masked array if you have some invalid data points.
If you really want flexible Numpy arrays, use something like this:
numpy.array([[0,1,2,3], [2,3,4]], dtype=object)
However this will create a one-dimensional array that stores references to lists, which means that you will lose most of the benefits of Numpy (vector processing, locality, slicing, etc.).
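As a sketch of the masked-array alternative mentioned above, the rows can be padded into a regular 2-D array and the padded positions marked as invalid, so that vectorized reductions ignore them. The variable names are illustrative only.

```python
import numpy as np

rows = [[0, 1, 2, 3], [2, 3, 4]]

lens = np.array([len(r) for r in rows])
# True where a row has real data
valid = np.arange(lens.max()) < lens[:, None]

# Regular 2-D integer array, zero-padded on the right
padded = np.zeros(valid.shape, dtype=int)
padded[valid] = np.concatenate(rows)

# In a masked array, mask=True marks entries to ignore
masked = np.ma.masked_array(padded, mask=~valid)

row_sums = masked.sum(axis=1)
row_means = masked.mean(axis=1)
print(row_sums)   # [6 9]
print(row_means)  # [1.5 3.0]
```

The mean of the second row is 3.0 rather than 2.25 because the padded zero is excluded from the calculation.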
NumPy array initialization (fill with identical values)
NumPy 1.8 introduced np.full(), which is a more direct method than empty() followed by fill() for creating an array filled with a certain value:
>>> # Current NumPy infers the dtype from the fill value (early releases returned floats here)
>>> np.full((3, 5), 7)
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])
>>> np.full((3, 5), 7, dtype=float)
array([[7., 7., 7., 7., 7.],
       [7., 7., 7., 7., 7.],
       [7., 7., 7., 7., 7.]])
This is arguably the way of creating an array filled with certain values, because it explicitly describes what is being achieved (and it can in principle be very efficient since it performs a very specific task).
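A minimal sketch comparing the two approaches, with the dtype given explicitly so the result does not depend on the NumPy version:

```python
import numpy as np

# Older two-step idiom: allocate uninitialized memory, then fill it
a = np.empty((3, 5), dtype=int)
a.fill(7)

# One-step equivalent
b = np.full((3, 5), 7, dtype=int)

# full_like reuses the shape and dtype of an existing array
c = np.full_like(b, 7)

print(np.array_equal(a, b), np.array_equal(b, c))  # True True
```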