Creating a new column based on if-elif-else condition
To formalize some of the approaches laid out above:
Create a function that operates on the rows of your dataframe like so:
def f(row):
    if row['A'] == row['B']:
        val = 0
    elif row['A'] > row['B']:
        val = 1
    else:
        val = -1
    return val
Then apply it to your dataframe, passing in the axis=1 option:
In [1]: df['C'] = df.apply(f, axis=1)
In [2]: df
Out[2]:
   A  B  C
a  2  2  0
b  3  1  1
c  1  3 -1
Of course, this is not vectorized, so performance may suffer when scaled to a large number of records. Still, I think it is much more readable, especially if you are coming from a SAS background.
Edit
Here is the vectorized version:
df['C'] = np.where(
    df['A'] == df['B'], 0, np.where(
    df['A'] > df['B'], 1, -1))
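Nested np.where calls get harder to read as branches are added. As a possible alternative (not part of the original answer), numpy.select takes a list of conditions and a list of choices and reads closer to an if/elif/else chain. The sketch below rebuilds the sample frame from the output shown earlier, so the data is an assumption.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the output shown above (assumed data)
df = pd.DataFrame({'A': [2, 3, 1], 'B': [2, 1, 3]}, index=['a', 'b', 'c'])

# Conditions are evaluated in order; the first one that is True wins,
# which mirrors the if / elif ordering of f(row)
conditions = [df['A'] == df['B'], df['A'] > df['B']]
choices = [0, 1]

# default plays the role of the final else branch
df['C'] = np.select(conditions, choices, default=-1)
print(df)
```

Each new branch then only needs one more entry in conditions and choices, rather than another level of nesting.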
Creating a new column based on if-elif-else condition with np.where in python
You can use pandas.cut to map your values.
NB. For clarity, I divided all values by 1 million.
# list of bins used to categorize the values
bins = [0, 50, 500, 2000, 4000, 6000, float('inf')]
# matching factors to map ]0-50] -> 1 ; ]50-500] -> 0.9, etc.
factors = [1, 0.9, 0.8, 0.5, 0.25, 0]
# get the factors and convert to float
f = pd.cut(df['Quantity Total Price'], bins=bins, labels=factors).astype(float)
# use the factors in numerical operation
df['Result'] = df['Rate']*f+df['Rate1']*(1-f)
output:
   Quantity Total Price  Rate  Rate1  Result
0                    20    15   14.5  15.000
1                   100    15   14.5  14.950
2                   700    15   14.5  14.900
3                  3000    15   14.5  14.750
4                  5000    15   14.5  14.625
5                  7000    15   14.5  14.500
used input:
df = pd.DataFrame({'Quantity Total Price': [20, 100, 700, 3000, 5000, 7000],
                   'Rate': [15]*6, 'Rate1': [14.5]*6})
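To see how pandas.cut treats values that sit on a bin edge, it can help to look at the bin positions directly. The following is a small sketch that reuses the bins from the answer; the test values are made up for illustration. With the default right=True, each bin includes its right edge and excludes its left edge.

```python
import pandas as pd

bins = [0, 50, 500, 2000, 4000, 6000, float('inf')]

# Illustrative values chosen to sit on or next to a bin edge (assumed data)
values = pd.Series([20, 50, 51, 7000])

# labels=False returns the integer position of the bin for each value.
# Bins are right-closed by default, so 50 still belongs to the first bin.
codes = pd.cut(values, bins=bins, labels=False)
print(codes.tolist())

# A value outside every bin becomes NaN; 0 is excluded because the
# first bin is open on its left edge
outside = pd.cut(pd.Series([0]), bins=bins, labels=False)
print(outside.isna().all())
```

If zero or negative amounts can occur in the real data, passing include_lowest=True or lowering the first edge would be one way to avoid the NaN.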
Error while creating a new column based on if-elif-else condition
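In your code, conditions is not a list of boolean conditions: each element is wrapped in df_sup[...], which applies boolean indexing, so you end up with a list of DataFrames. To get a list of conditions, remove the leading df_sup[ and the trailing ] from each element: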
conditions = [
    (df_sup['Date'] >= '2016-11-14') & (df_sup['Date'] < '2016-11-21'),
    (df_sup['Date'] >= '2016-11-21') & (df_sup['Date'] < '2016-11-28'),
    (df_sup['Date'] >= '2016-11-28') & (df_sup['Date'] < '2016-12-05'),
    (df_sup['Date'] >= '2016-12-05') & (df_sup['Date'] < '2016-12-12'),
    (df_sup['Date'] >= '2016-12-12') & (df_sup['Date'] < '2016-12-19')
]
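A list of boolean masks like this is the shape that numpy.select expects. The sketch below assumes that is how the conditions are meant to be used; the sample dates, the Week column name, and the week numbers are hypothetical and only the Date column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the 'Date' column name comes from the question
df_sup = pd.DataFrame({
    'Date': pd.to_datetime(['2016-11-15', '2016-11-23', '2016-12-13', '2017-01-01'])
})

# Each element is a boolean Series, not a filtered DataFrame
conditions = [
    (df_sup['Date'] >= '2016-11-14') & (df_sup['Date'] < '2016-11-21'),
    (df_sup['Date'] >= '2016-11-21') & (df_sup['Date'] < '2016-11-28'),
    (df_sup['Date'] >= '2016-12-12') & (df_sup['Date'] < '2016-12-19'),
]
choices = [1, 2, 5]  # hypothetical week numbers, one per condition

# Rows that match no condition receive the default value
df_sup['Week'] = np.select(conditions, choices, default=0)
print(df_sup)
```

The choices and the default are kept as the same type (integers here) so that the resulting column has a single dtype.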
Create new column in pandas dataframe based on if/elif/and functions
The issue may be due to the way you are using it. I don't know whether this will help, but I have rewritten the code in a way that works for me.
import pandas as pd

rate_2006, rate_2007 = 100, 200
c = {
    'region': ["a", "a", "a", "a", "a", "b", "b", "b", "b", "a", "b"],
    'year': [2006, 2007, 2007, 2006, 2006, 2006, 2007, 2007, 2007, 2006, 2007],
    'sales': [500, 100, 2990, 15, 5000, 2000, 150, 300, 250, 1005, 600]
}
df1 = pd.DataFrame(c)
print(df1)

def new_col(value):
    if df1.loc[value, "region"] == "a" and df1.loc[value, "year"] == 2006:
        df1.loc[value, "Dollars"] = df1.loc[value, "sales"] * rate_2006
    elif df1.loc[value, "region"] == "a" and df1.loc[value, "year"] == 2007:
        df1.loc[value, "Dollars"] = df1.loc[value, "sales"] * rate_2007
    elif df1.loc[value, "region"] == "b" and df1.loc[value, "year"] == 2006:
        df1.loc[value, "Dollars"] = df1.loc[value, "sales"] * rate_2006
    else:
        df1.loc[value, "Dollars"] = df1.loc[value, "sales"] * rate_2007

for value in range(len(df1)):
    new_col(value)
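The row-by-row loop above can likely be replaced by a vectorized expression. In the branches as written, both regions use rate_2006 for 2006 and rate_2007 otherwise, so the rate depends on the year alone as long as the only regions are "a" and "b". The sketch below relies on that assumption and uses a shortened, made-up version of the data.

```python
import numpy as np
import pandas as pd

rate_2006, rate_2007 = 100, 200

# Shortened sample in the same shape as df1 above (assumed data)
df1 = pd.DataFrame({
    'region': ["a", "a", "b", "b"],
    'year': [2006, 2007, 2006, 2007],
    'sales': [500, 100, 2000, 150],
})

# Pick the rate per row from the year, then multiply whole columns at once.
# Assumes the regions are only "a" and "b", as in the data above.
rate = np.where(df1['year'] == 2006, rate_2006, rate_2007)
df1['Dollars'] = df1['sales'] * rate
print(df1)
```

If the rates ever differ by region, the same idea extends to numpy.select with one condition per region and year combination.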
Creating New Column based on condition on Other Column in Pandas DataFrame
import pandas as pd
# initialize list of lists
data = [[1,'High School',7.884], [2,'Bachelors',6.952], [3,'High School',8.185], [4,'High School',6.556],[5,'Bachelors',6.347],[6,'Master',6.794]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['ID', 'Education', 'Score'])
df['Labels'] = ['Bad' if x<7.000 else 'Good' if 7.000<=x<8.000 else 'Very Good' for x in df['Score']]
df
   ID    Education  Score     Labels
0   1  High School  7.884       Good
1   2    Bachelors  6.952        Bad
2   3  High School  8.185  Very Good
3   4  High School  6.556        Bad
4   5    Bachelors  6.347        Bad
5   6       Master  6.794        Bad
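The same labels can probably be produced without a Python-level loop by using pandas.cut. This is an alternative sketch, not the approach of the answer above. Passing right=False makes each bin include its left edge, which matches the x<7 and 7<=x<8 thresholds in the list comprehension.

```python
import pandas as pd

# Scores copied from the example above
df = pd.DataFrame({'Score': [7.884, 6.952, 8.185, 6.556, 6.347, 6.794]})

# right=False makes the bins left-closed: [-inf, 7), [7, 8), [8, inf)
df['Labels'] = pd.cut(
    df['Score'],
    bins=[float('-inf'), 7, 8, float('inf')],
    labels=['Bad', 'Good', 'Very Good'],
    right=False,
)
print(df)
```

Note that the resulting column is categorical rather than plain strings; calling .astype(str) on it converts it back if that matters downstream.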