Python: Assign Labels to Values in an Array

Python: Assign Labels to values in an array

Note: The OP's intervals are [-3,-1], [0,3] & (3,..), so I am assuming integer values only. The conditions can be altered as needed, but the overall design stays the same.

Using list comprehensions for if-elif-else:

my_list = [-3,-2,-1,0,1,2,3,4,5,6]

# chained comparisons with a conditional expression act as if-elif-else
my_list_mapping = ['F' if -3 <= i <= -1 else 'M' if 0 <= i <= 3 else 'S' for i in my_list]
print(my_list_mapping)
# ['F', 'F', 'F', 'M', 'M', 'M', 'M', 'S', 'S', 'S']
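
If the number of intervals grows, a nested conditional expression gets hard to read. One possible alternative, sketched here under the same integer-only assumption, is to look the label up with the standard library's bisect module. The names upper_bounds, labels and label_for are hypothetical and not from the original answer, and values below -3 are assumed not to occur:

```python
from bisect import bisect_left

# Upper bounds of the 'F' and 'M' intervals; anything above the last bound is 'S'.
# Assumption: inputs are integers and never fall below -3, as in the list above.
upper_bounds = [-1, 3]
labels = ['F', 'M', 'S']

def label_for(value):
    """Return the label of the first interval whose upper bound is >= value."""
    return labels[bisect_left(upper_bounds, value)]

my_list = [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6]
my_list_mapping = [label_for(i) for i in my_list]
print(my_list_mapping)
```

Adding another interval then only means adding one bound and one label.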

How to label arrays with labels from another array

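Use np.stack instead of np.concatenate: np.stack joins the arrays along a new axis, whereas np.concatenate joins them along an existing one.

import numpy as np

train_array = np.stack((array_a, array_b, array_c, array_d, array_e, array_f, array_g, array_h), axis=0)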
print(train_array.shape)
# (8, 300, 300)
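
To see the difference between the two functions on something small, here is a minimal sketch. The three 2x2 arrays are hypothetical stand-ins, not the arrays from the question:

```python
import numpy as np

# Hypothetical stand-ins for the question's arrays: three 2x2 arrays.
array_a = np.zeros((2, 2))
array_b = np.ones((2, 2))
array_c = np.full((2, 2), 2.0)

stacked = np.stack((array_a, array_b, array_c), axis=0)
joined = np.concatenate((array_a, array_b, array_c), axis=0)

print(stacked.shape)  # (3, 2, 2): a new leading axis, one entry per input array
print(joined.shape)   # (6, 2): rows appended along the existing first axis
```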

How can I label the numpy array based on values?

There are many ways of doing this. Here are a few options:

In [1]: import numpy

In [2]: x = numpy.array([5,6,7,8,10,11,12,14])

In [3]: x
Out[3]: array([ 5, 6, 7, 8, 10, 11, 12, 14])

In [4]: x > 10
Out[4]: array([False, False, False, False, False, True, True, True], dtype=bool)

In [5]: ['Y' if y > 10 else 'N' for y in x]
Out[5]: ['N', 'N', 'N', 'N', 'N', 'Y', 'Y', 'Y']

In [6]: [{True: 'Y', False: 'N'}[y] for y in x > 10]
Out[6]: ['N', 'N', 'N', 'N', 'N', 'Y', 'Y', 'Y']

You could also use map or numpy.where, of course :)
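
For larger arrays a vectorized version may be preferable to a Python-level loop. A minimal sketch with numpy.where, using the same data as above (the variable name labels is just for illustration):

```python
import numpy as np

x = np.array([5, 6, 7, 8, 10, 11, 12, 14])

# Pick 'Y' where the condition holds and 'N' elsewhere, element by element.
labels = np.where(x > 10, 'Y', 'N')
print(labels.tolist())
```

The result is a NumPy array of strings; tolist() turns it back into a plain Python list if that is what you need.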

How to extract each item in an array and add as column name

Firstly, the list definition is incorrect: all items in a list need to be comma-separated, as follows:

l = [1, 8, 9, 10, 24, 25, 34, 40, 51, 72]

You can create the labels as follows:

l = [1, 8, 9, 10, 24, 25, 34, 40, 51, 72]
# Make labels
labels = ['nar' + str(i) for i in l]
print(labels)

Note: You need to cast i to str; since it is an integer, it cannot be concatenated directly to a str.
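
An f-string does the conversion to str implicitly, which some find easier to read. The sketch below also shows one way the labels could become column names; the empty DataFrame is an assumption based on the question title, not something the original answer shows:

```python
import pandas as pd

l = [1, 8, 9, 10, 24, 25, 34, 40, 51, 72]

# The f-string formats each integer as text, so no explicit str() is needed.
labels = [f'nar{i}' for i in l]
print(labels)

# Assumption: the labels are meant to be column names of a (here empty) DataFrame.
df = pd.DataFrame(columns=labels)
print(list(df.columns))
```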

Trying to assign labels based on the values on another column but getting always the same value

Change the for loop to an enumerated for loop and use iloc to set each label by position:

import pandas as pd

d = {"Percentage_delay": [0.64, 0.80, 0.55, 0.48, 0.65, 0.46, 0.87, 0.66, 0.77, 0.44]}

number_delay_airport = pd.DataFrame(d)
# to use iloc you first have to create the column
number_delay_airport['Labels'] = ''
# integer position of the new column, needed because iloc is purely positional
label_col = number_delay_airport.columns.get_loc('Labels')
for j, i in enumerate(number_delay_airport['Percentage_delay']):
    print(i, j)
    # index the DataFrame itself: a chained ['Labels'].iloc[j] = ... may write to a copy
    if i >= 0 and i < 0.25:
        number_delay_airport.iloc[j, label_col] = 'low'
    if i >= 0.25 and i < 0.75:
        number_delay_airport.iloc[j, label_col] = 'medium'
    if i >= 0.75 and i <= 1:
        number_delay_airport.iloc[j, label_col] = 'high'

print(number_delay_airport)

Or even better, using the apply function you could do something like this:

import pandas as pd

d = {"Percentage_delay" : [0.64, 0.80, 0.55, 0.48, 0.65, 0.46, 0.87, 0.66, 0.77, 0.44]}

number_delay_airport = pd.DataFrame(d)

def assign_label(i):
    if i >= 0 and i < 0.25:
        return 'low'
    if i >= 0.25 and i < 0.75:
        return 'medium'
    if i >= 0.75 and i <= 1:
        return 'high'

number_delay_airport['Labels'] = number_delay_airport['Percentage_delay'].apply(assign_label)

print(number_delay_airport)
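
Binning a numeric column into labelled ranges is also what pandas.cut is for, so it may be worth considering as a third option. A minimal sketch with the same data and thresholds follows. One difference to be aware of: with right=False the last bin is [0.75, 1), so a value of exactly 1.0 would come out as NaN rather than 'high':

```python
import pandas as pd

d = {"Percentage_delay": [0.64, 0.80, 0.55, 0.48, 0.65, 0.46, 0.87, 0.66, 0.77, 0.44]}
number_delay_airport = pd.DataFrame(d)

number_delay_airport['Labels'] = pd.cut(
    number_delay_airport['Percentage_delay'],
    bins=[0, 0.25, 0.75, 1],
    labels=['low', 'medium', 'high'],
    right=False,  # left-closed bins: [0, 0.25), [0.25, 0.75), [0.75, 1)
)
print(number_delay_airport)
```

The new column has a categorical dtype; call .astype(str) on it if plain strings are needed.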

Assigning label per row through an if statement

You can use numpy.where combined with numpy.column_stack:

import numpy as np

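# mixing strings and floats makes NumPy store every element as a string
arr = np.array([['A', 0.05],
                ['B', 0.09],
                ['C', 0.13]])

# np.float was removed from NumPy; the built-in float does the same job
col = np.where(arr[:, 1].astype(float) > 0.10, '2', '1')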
arr = np.column_stack((arr, col))
print(arr)

Output

[['A' '0.05' '1']
 ['B' '0.09' '1']
 ['C' '0.13' '2']]

UPDATE

If you have more than two labels, you could do something like this:

import numpy as np

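arr = np.array([['A', 0.05],
                ['B', 0.09],
                ['C', 0.13]])

def calc(x):
    if x < 0.08:
        return '1'
    elif 0.08 <= x < 0.10:
        return '2'
    else:
        # covers x >= 0.10, so a value of exactly 0.10 no longer falls through and returns None
        return '3'


# np.float was removed from NumPy; the built-in float does the same job
col = np.array([calc(e) for e in arr[:, 1].astype(float)])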
arr = np.column_stack((arr, col))
print(arr)

Output

[['A' '0.05' '1']
 ['B' '0.09' '2']
 ['C' '0.13' '3']]
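
With more than a handful of labels, the per-element function can be swapped for a vectorized lookup. A minimal sketch with numpy.digitize on the same three values follows. The names edges and labels are hypothetical, and assigning a value of exactly 0.10 to label '3' is an assumption:

```python
import numpy as np

values = np.array([0.05, 0.09, 0.13])
edges = [0.08, 0.10]                # boundaries between the three labels
labels = np.array(['1', '2', '3'])

# np.digitize returns 0 for x < 0.08, 1 for 0.08 <= x < 0.10 and 2 for x >= 0.10;
# those indices are then used to pick the matching label.
col = labels[np.digitize(values, edges)]
print(col.tolist())
```

The resulting col can be passed to np.column_stack in the same way as before.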

