Replacing Numpy elements if condition is met
>>> import numpy as np
>>> a = np.random.randint(0, 5, size=(5, 4))
>>> a
array([[4, 2, 1, 1],
[3, 0, 1, 2],
[2, 0, 1, 1],
[4, 0, 2, 3],
[0, 0, 0, 2]])
>>> b = a < 3
>>> b
array([[False, True, True, True],
[False, True, True, True],
[ True, True, True, True],
[False, True, True, False],
[ True, True, True, True]], dtype=bool)
>>>
>>> c = b.astype(int)
>>> c
array([[0, 1, 1, 1],
[0, 1, 1, 1],
[1, 1, 1, 1],
[0, 1, 1, 0],
[1, 1, 1, 1]])
You can shorten this with:
>>> c = (a < 3).astype(int)
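If the goal is to overwrite the matching elements rather than build a 0/1 array, the same boolean mask can be used for assignment or passed to np.where. A minimal sketch, using a fixed copy of the array shown above; the replacement value -1 and the name `replaced` are arbitrary choices for illustration:

```python
import numpy as np

# fixed copy of the array shown above, so the result is reproducible
a = np.array([[4, 2, 1, 1],
              [3, 0, 1, 2],
              [2, 0, 1, 1],
              [4, 0, 2, 3],
              [0, 0, 0, 2]])

# np.where builds a new array: 1 where the condition holds, 0 elsewhere
c = np.where(a < 3, 1, 0)

# boolean-mask assignment overwrites the matching elements of a copy
replaced = a.copy()
replaced[replaced < 3] = -1   # -1 is an arbitrary replacement value
```

np.where leaves `a` untouched, while mask assignment modifies the array it is applied to, which is why the sketch works on a copy.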
Replacing Numpy elements with the closest if condition is met
pandas can forward-fill ("pad") missing values, which is what you are describing, but the data has to be stored as floats because NumPy integer arrays cannot hold np.nan values.
import pandas as pd
import numpy as np
upper = 10
lower = 1
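v = np.array([1, -77, 3, 4, 5, 13213, 6, 7, 8, 1024])
# use a float Series so that it can hold NaN
s = pd.Series(v, dtype=float)
# mark values outside [lower, upper] as missing (the bounds themselves are kept)
s[(s < lower) | (s > upper)] = np.nan
# forward-fill each missing value from the last valid one
s = s.ffill()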
# at this point the series is padded but the values are floats instead of
# ints, you can cast back to an int array if you wish
v2 = s.values.astype(int)
v2
# outputs:
array([1, 1, 3, 4, 5, 5, 6, 7, 8, 8])
Update: a NumPy-only solution
# first we identify elements that are out of bounds and need to be filled from the data
mask = (v<lower) | (v>upper)
oob = np.where(mask)[0]
# for each oob value, we calculate the index that provides the fill-value using a forward fill or backward fill
def fillidx(i, mask_oob):
    try:
        if i == 0 or np.all(mask_oob[:i]):
            # all elements from start are oob
            raise IndexError()
        # step back to the nearest in-bounds element before i
        n = -1 * (1 + np.argmin(mask_oob[:i][::-1]))
    except (IndexError):
        # nothing valid before i: step forward to the next in-bounds element
        n = 1 + np.argmin(mask_oob[i+1:])
    return i + n
fill = [fillidx(i, mask) for i in oob]
v[mask] = v[fill]
print(v)
With the first test array, v = np.array([1,-77,3,4,5,13213,6,7,8,1024]), the following is output:
[1 1 3 4 5 5 6 7 8 8]
With the second test array, v = np.array([-7,1,2,3,-77]), the following is output:
[1 1 2 3 3]
With an array where consecutive values are out of bounds and the first few elements are also out of bounds, i.e. v = np.array([-200,20,1,-77,3,4,5,13213,-200,6,7,8,1024]), we get the following output:
[1 1 1 1 3 4 5 5 5 6 7 8 8]
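The same fill can also be done without a Python-level loop. The following is a sketch of one common alternative, not the method used above: it records, for every position, the index of the most recent in-bounds element with np.maximum.accumulate, and falls back to the first in-bounds element for a leading out-of-bounds run. The function name `fill_out_of_bounds` is hypothetical.

```python
import numpy as np

def fill_out_of_bounds(v, lower, upper):
    """Replace values outside [lower, upper] with the nearest previous
    in-bounds value, or the first in-bounds value for a leading run."""
    v = np.asarray(v)
    valid = (v >= lower) & (v <= upper)
    if not valid.any():
        raise ValueError("no in-bounds value to fill from")
    idx = np.arange(v.size)
    # index of the last valid element at or before each position (-1 if none yet)
    prev = np.maximum.accumulate(np.where(valid, idx, -1))
    # leading out-of-bounds elements take the first valid element instead
    prev[prev < 0] = np.argmax(valid)
    return v[prev]

filled = fill_out_of_bounds(
    [-200, 20, 1, -77, 3, 4, 5, 13213, -200, 6, 7, 8, 1024], lower=1, upper=10)
```

Unlike the loop version, this returns a new array instead of modifying v in place.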
Replacing elements in a numpy array when there are multiple conditions
A straightforward solution would be to apply the assignments in sequence.
In [18]: a = np.random.choice([-1,1],size=(10,))
In [19]: b = np.random.choice([-1,1],size=(10,))
In [20]: a
Out[20]: array([-1, 1, -1, -1, 1, -1, -1, 1, 1, -1])
In [21]: b
Out[21]: array([-1, 1, 1, 1, -1, 1, -1, 1, 1, 1])
Start off with an array with the 'default' value:
In [22]: c = np.zeros_like(a)
Apply the second condition:
In [23]: c[a<0] = 1
The third requires a little care since it combines two tests. The parentheses matter here, because & binds more tightly than the comparison operators:
In [25]: c[(a>0)&(b<0)] = 2
And the last:
In [26]: c[b>0] = 3
In [27]: c
Out[27]: array([1, 3, 3, 3, 2, 3, 1, 3, 3, 3])
Looks like all of the initial 0s are overwritten.
With many elements in the arrays, and just a few tests, I wouldn't worry about speed. Focus on clarity and expressiveness, not compactness.
There is a three-argument version of where that can choose between values or arrays. But I rarely use it, and don't see many questions about it either.
In [28]: c = np.where(a>0, 0, 1)
In [29]: c
Out[29]: array([1, 0, 1, 1, 0, 1, 1, 0, 0, 1])
In [30]: c = np.where((a>0)&(b<0), 2, c)
In [31]: c
Out[31]: array([1, 0, 1, 1, 2, 1, 1, 0, 0, 1])
In [32]: c = np.where(b>0, 3, c)
In [33]: c
Out[33]: array([1, 3, 3, 3, 2, 3, 1, 3, 3, 3])
These where calls could be chained on one line.
c = np.where(b>0, 3, np.where((a>0)&(b<0), 2, np.where(a>0, 0, 1)))
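The chained call can also be written with np.select, which takes a list of conditions and a matching list of values and uses the first condition that holds for each element. A sketch using the a and b arrays shown above; because later assignments override earlier ones in the sequential version, the conditions are listed here in reverse order of assignment:

```python
import numpy as np

# fixed copies of the arrays shown above
a = np.array([-1, 1, -1, -1, 1, -1, -1, 1, 1, -1])
b = np.array([-1, 1, 1, 1, -1, 1, -1, 1, 1, 1])

# the first matching condition wins, so the highest-priority test goes first
conditions = [b > 0, (a > 0) & (b < 0), a < 0]
choices = [3, 2, 1]
c = np.select(conditions, choices, default=0)
```

The default value plays the role of the initial np.zeros_like array.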
Substitute row in Numpy if a condition is met
Try values[values[:, 0] == z] = [x, y], i.e., rows whose first entry is z are set to [x, y].
import numpy as np
values = np.array([[10, 10],
[11, 10],
[12, 10],
[13, 10],
[14, 10]])
z = 13
x = 1
y = 0
values[values[:, 0] == z] = [x, y]
print(values)
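The row mask can also be combined with a column index when only part of each matching row should change. A small sketch based on the example above; the names `row_mask`, `whole_row` and `one_column` are made up for illustration:

```python
import numpy as np

values = np.array([[10, 10],
                   [11, 10],
                   [12, 10],
                   [13, 10],
                   [14, 10]])
z, x, y = 13, 1, 0

row_mask = values[:, 0] == z      # one boolean per row

whole_row = values.copy()
whole_row[row_mask] = [x, y]      # replace every column of the matching rows

one_column = values.copy()
one_column[row_mask, 1] = y       # replace only the second column
```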
Replacing elements in ndarray based on condition
Using logical indexing:
f_vars[f_vars < 2] += 1
will give you:
[[1 4]
[1 2]
[3 0]
[3 4]
[2 0]]
as expected. You can continue in the same manner to apply more conditions, and np.logical_and can combine several of them. Take care with the order in which you apply the conditions; if you find it confusing, an if-elif-else statement would be the easiest. Iterating with np.nditer is done like this:
for x in np.nditer(f_vars, op_flags=['readwrite']):
    if x == -1:
        continue
    elif x < 2:
        x[...] += 1
So, you have to set op_flags = ['readwrite'] and write to each element through the x[...] syntax.
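For comparison, the loop's if-elif logic can be folded into a single mask. This is a sketch on made-up sample data, since the original f_vars is not shown here:

```python
import numpy as np

# hypothetical sample data; the original f_vars is not shown
f_vars = np.array([[0, 4],
                   [-1, 1],
                   [3, -1]])

# element-by-element version, as above
looped = f_vars.copy()
for x in np.nditer(looped, op_flags=['readwrite']):
    if x == -1:
        continue
    elif x < 2:
        x[...] += 1

# the same rule as one combined mask: skip -1, increment everything else below 2
vectorized = f_vars.copy()
mask = (vectorized != -1) & (vectorized < 2)
vectorized[mask] += 1
```

Both versions leave the -1 entries alone; the masked version avoids the Python-level loop.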
How to replace values in numpy array at the same time
It's better to use np.select if you have multiple conditions:
a = np.array([7, 1, 2, 0, 2, 3, 4, 0, 5])
a = np.select([a == 7, a == 2], [2, 3], a)
OUTPUT:
[2 1 3 0 3 3 4 0 5]
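The reason np.select helps here is that all conditions are evaluated against the original array, so one replacement cannot feed into the next. A short sketch contrasting it with sequential mask assignment on the same data (the names `sequential` and `simultaneous` are illustrative):

```python
import numpy as np

a = np.array([7, 1, 2, 0, 2, 3, 4, 0, 5])

# sequential assignment: the 7 becomes 2 and is then caught by the second step
sequential = a.copy()
sequential[sequential == 7] = 2
sequential[sequential == 2] = 3

# np.select tests both conditions on the untouched array
simultaneous = np.select([a == 7, a == 2], [2, 3], default=a)
```

In the sequential version the first element ends up as 3 rather than 2.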
Change values of a numpy array based on certain condition
It can be done using mask:
A[A < 0] += 5
The way it works is that the expression A < 0 returns a boolean array. Each cell corresponds to the predicate applied on the matching cell. In the current example:
A < 0 # [ True False True False True]
And then, the action is applied only on the cells that match the predicate. So in this example, it works only on the True cells.
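A complete run might look like the sketch below. The contents of A are an assumption, chosen only so that the mask matches the one shown above:

```python
import numpy as np

# assumed sample values that produce the mask shown above
A = np.array([-1, 2, -3, 4, -5])

mask = A < 0      # boolean array, one entry per element of A
A[mask] += 5      # only the elements where mask is True are changed
```

The positive entries are left as they were, and each negative entry is shifted up by 5.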
Numpy replacing elements based on logic and value in an identically shaped array
newMatrix = np.logical_and(matrix2 == 0, matrix1 > 5 )
This works element-wise: each boolean in matrix2 == 0 is combined with the corresponding boolean in matrix1 > 5 using a logical 'and'. Note that an expression such as matrix1 > 5 produces a matrix of boolean values.
If you want 0,1 instead of False,True, you can add +0 to the result:
newMatrix = np.logical_and(matrix2 == 0, matrix1 > 5 ) + 0
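As a concrete illustration, here is a sketch with two small made-up matrices; it also shows the & operator and astype(int) as equivalent spellings:

```python
import numpy as np

# made-up sample data for illustration
matrix1 = np.array([[6, 2],
                    [9, 7]])
matrix2 = np.array([[0, 0],
                    [1, 0]])

newMatrix = np.logical_and(matrix2 == 0, matrix1 > 5)

# same result with the & operator; each comparison needs its own parentheses
with_operator = (matrix2 == 0) & (matrix1 > 5)

# 0/1 instead of False/True
as_int = newMatrix.astype(int)
```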