I Want to Multiply Two Columns in a Pandas Dataframe and Add the Result into a New Column

I want to multiply two columns in a pandas DataFrame and add the result into a new column

If we're willing to sacrifice the succinctness of Hayden's solution, we could also do something like this:

In [22]: orders_df['C'] = orders_df.Action.apply(
             lambda x: (1 if x == 'Sell' else -1))

In [23]: orders_df # New column C represents the sign of the transaction
Out[23]:
   Prices  Amount Action  C
0       3      57   Sell  1
1      89      42   Sell  1
2      45      70    Buy -1
3       6      43   Sell  1
4      60      47   Sell  1
5      19      16    Buy -1
6      56      89   Sell  1
7       3      28    Buy -1
8      56      69   Sell  1
9      90      49    Buy -1

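Now we have eliminated the need for the if statement. Using Series.apply(), we also do away with the explicit for loop, although apply with a lambda still calls a Python function once per row. As Hayden noted, vectorized operations are generally faster, and the multiplication below is fully vectorized.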

In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C

In [25]: orders_df # The resulting dataframe
Out[25]:
   Prices  Amount Action  C  Value
0       3      57   Sell  1    171
1      89      42   Sell  1   3738
2      45      70    Buy -1  -3150
3       6      43   Sell  1    258
4      60      47   Sell  1   2820
5      19      16    Buy -1   -304
6      56      89   Sell  1   4984
7       3      28    Buy -1    -84
8      56      69   Sell  1   3864
9      90      49    Buy -1  -4410

This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.
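For comparison, here is a minimal sketch that avoids apply altogether by looking up the sign with Series.map. This is not Hayden's solution, just one possible vectorized variant; the data is copied from the output above, and the mapping dict is an assumption that only 'Sell' and 'Buy' occur in the Action column.

```python
import pandas as pd

# Same data as the orders_df shown above.
orders_df = pd.DataFrame({
    'Prices': [3, 89, 45, 6, 60, 19, 56, 3, 56, 90],
    'Amount': [57, 42, 70, 43, 47, 16, 89, 28, 69, 49],
    'Action': ['Sell', 'Sell', 'Buy', 'Sell', 'Sell',
               'Buy', 'Sell', 'Buy', 'Sell', 'Buy'],
})

# Look up the sign for each action; a label missing from the dict would become NaN.
sign = orders_df['Action'].map({'Sell': 1, 'Buy': -1})

# Element-wise product of the three aligned Series.
orders_df['Value'] = orders_df['Prices'] * orders_df['Amount'] * sign
```

This skips the intermediate C column; keep it if the sign is useful elsewhere.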

multiply and sum two columns in two dataframes in Python

You can just use pandas abstractions for it.

result = df['col1'] * df['col3']

If you then want the sum of those values, you can just do:

result.sum()
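Since the question is about columns living in two DataFrames, here is a small sketch of that case. The frame names df_a and df_b and their values are made up for illustration; only the column names col1 and col3 come from the snippet above. Note that pandas aligns the two Series on their index labels before multiplying.

```python
import pandas as pd

# Hypothetical frames sharing the same default index (0, 1, 2).
df_a = pd.DataFrame({'col1': [1, 2, 3]})
df_b = pd.DataFrame({'col3': [10, 20, 30]})

# Element-wise product, aligned on the index labels.
result = df_a['col1'] * df_b['col3']

# Sum of the products: 1*10 + 2*20 + 3*30.
total = result.sum()

# The same total computed as a dot product of the two columns.
total_dot = df_a['col1'].dot(df_b['col3'])
```

If the two frames do not share index labels, the unmatched rows of the product become NaN, so it may be worth checking the indexes first.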

Python code to multiply two columns and then create new column with values

df[newcolumn] = df['current']*df['voltage']

will work.

You can provide newcolumn as a variable.

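def getPower(df, newColumn, numOfCol):
    # For each pair i, multiply 'current#i' by 'voltage#i' and store the
    # product in a new '<newColumn>#i' column (modifies df in place).
    for i in range(numOfCol):
        current = 'current#%d' % (i+1)
        voltage = 'voltage#%d' % (i+1)
        power = '%s#%d' % (newColumn, i+1)
        df[power] = df[current]*df[voltage]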

getPower(df, 'Power', numOfCols) would create the column.

EDIT: This will work if you named your current columns like 'current#1', 'current#2', ...
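To make the naming convention concrete, here is a self-contained sketch of the same loop run on a tiny frame. The function name add_power_columns, the prefix argument, and the sample values are all hypothetical; only the 'current#i' / 'voltage#i' column pattern comes from the answer above.

```python
import pandas as pd

def add_power_columns(df, num_pairs, prefix='power'):
    """Add '<prefix>#i' = 'current#i' * 'voltage#i' for i in 1..num_pairs (in place)."""
    for i in range(1, num_pairs + 1):
        df['%s#%d' % (prefix, i)] = df['current#%d' % i] * df['voltage#%d' % i]
    return df

# Made-up measurements for two current/voltage pairs.
df = pd.DataFrame({
    'current#1': [1.0, 2.0], 'voltage#1': [5.0, 5.0],
    'current#2': [0.5, 1.5], 'voltage#2': [12.0, 12.0],
})

add_power_columns(df, 2)
```

A KeyError here usually means that a column name does not follow the expected pattern.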

Column multiplication to replace original columns within pandas DataFrame

You can align indices by converting both ID columns to indexes and then processing all columns:

df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")

print(df)
     2018  2019  2020
ID
a1    0.6   1.2   1.8
b1    2.0   2.4   2.8
c1    NaN   NaN   NaN
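The snippet above relies on df1 and df2 already existing. The following inputs are not given in the answer; they are inferred from the printed results (for example, b1 shows 2.0, 2.4, 2.8 with a factor of 0.4), so treat them as an assumption that makes the example reproducible.

```python
import pandas as pd

# Inferred from the printed outputs, not taken from the original answer.
df1 = pd.DataFrame({
    'ID': ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11],
})
df2 = pd.DataFrame({'ID': ['a1', 'b1'], 'percentage': [0.6, 0.4]})

# With ID as the index on both sides, each row of df1 is multiplied by the
# percentage that carries the same ID; c1 has no match in df2 and becomes NaN.
df = df1.set_index('ID').multiply(df2.set_index('ID')['percentage'], axis='index')
```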


With a df2 that covers a different set of IDs, the rows of df1 without a match become NaN:

df2 = pd.DataFrame({
    'ID': ['a1', 'c1'],
    'percentage': [0.6, 0.4]})

df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")

print(df)
     2018  2019  2020
ID
a1    0.6   1.2   1.8
b1    NaN   NaN   NaN
c1    3.6   4.0   4.4

If you only need to multiply some columns:

cols = ['2018','2019']
df1 = df1.set_index('ID')
df1[cols] = df1[cols].multiply(df2.set_index('ID')["percentage"], axis="index")

print(df1)

     2018  2019  2020
ID
a1    0.6   1.2     3
b1    NaN   NaN     7
c1    3.6   4.0    11


Hi, why are you setting the index in the last part of your answer?

Because not setting the index before multiplying produces incorrect output:

df2 = pd.DataFrame({
    'ID': ['a1', 'c1'],
    'percentage': [0.6, 0.4]})

cols = ['2018', '2019', '2020']

df1[cols] = df1[cols].mul(df2["percentage"], axis=0)
print(df1)
   ID  2018  2019  2020
0  a1   0.6   1.2   1.8
1  b1   2.0   2.4   2.8  <- incorrect result (aligned on index 1, not b1)
2  c1   NaN   NaN   NaN  <- incorrect result (aligned on index 2, not c1)
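If you would rather keep ID as an ordinary column, one alternative is to build the per-row factor with Series.map, so that it shares df1's index before multiplying. This is a sketch and not part of the original answer; df1 is again inferred from the printed outputs, and factor is a name introduced here.

```python
import pandas as pd

# df1 is inferred from the printed outputs; df2 is the one shown above.
df1 = pd.DataFrame({
    'ID': ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11],
})
df2 = pd.DataFrame({'ID': ['a1', 'c1'], 'percentage': [0.6, 0.4]})

cols = ['2018', '2019', '2020']

# Look up each row's percentage by its ID; IDs absent from df2 give NaN.
# The result keeps df1's index (0, 1, 2), so the multiplication aligns correctly.
factor = df1['ID'].map(df2.set_index('ID')['percentage'])

df1[cols] = df1[cols].mul(factor, axis=0)
```

Here b1 becomes NaN and c1 is scaled by 0.4, which matches the result of the set_index approach.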

Multiply two columns in a pandas DataFrame and create a new column containing the solution

Try:

df['Result'] = df['Carbon Footprint'] * df['Market Value']

Pandas Dataframe: Multiplying Two Columns

I believe that your ActualSalary column is a mix of strings and integers. That is the only way I've been able to recreate your error:

df = pd.DataFrame(
    {'ActualSalary': ['44600', '58,000.00', '70,000.00', 17550, 34693, 15674],
     'FTE': [1, 1, 1, 1, 1, 0.4]})

>>> df['ActualSalary'].str.replace(',', '').astype(float) * df['FTE']
0    44600.0
1    58000.0
2    70000.0
3        NaN
4        NaN
5        NaN
dtype: float64

The issue arises when you try to remove the commas, because the .str accessor returns NaN for the values that are not strings:

>>> df['ActualSalary'].str.replace(',', '')
0       44600
1    58000.00
2    70000.00
3         NaN
4         NaN
5         NaN
Name: ActualSalary, dtype: object

First convert every value to a string, and only then convert the column to floats.

fte_salary = (
    df['ActualSalary'].astype(str).str.replace(',', '')  # Remove commas in string, e.g. '55,000.00' -> '55000.00'
    .astype(float)  # Convert string column to floats.
    .mul(df['FTE'])  # Multiply the cleaned salary column by the Full-Time-Equivalent (FTE) column.
)
>>> df.assign(FTESalary=fte_salary) # Assign new column to dataframe.
  ActualSalary  FTE  FTESalary
0        44600  1.0    44600.0
1    58,000.00  1.0    58000.0
2    70,000.00  1.0    70000.0
3        17550  1.0    17550.0
4        34693  1.0    34693.0
5        15674  0.4     6269.6
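A slightly more defensive variant, sketched here as an alternative rather than as the answer's method, swaps astype(float) for pd.to_numeric with errors='coerce'. Any entry that still cannot be parsed after removing commas then becomes NaN instead of raising. The data is the same as above; cleaned and salary are names introduced for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    'ActualSalary': ['44600', '58,000.00', '70,000.00', 17550, 34693, 15674],
    'FTE': [1, 1, 1, 1, 1, 0.4],
})

# Cast everything to str first so the .str accessor works on every row,
# then strip the thousands separators as a literal (non-regex) replacement.
cleaned = df['ActualSalary'].astype(str).str.replace(',', '', regex=False)

# Parse to numbers; unparseable entries become NaN rather than raising.
salary = pd.to_numeric(cleaned, errors='coerce')

df['FTESalary'] = salary * df['FTE']
```

Checking salary.isna().sum() afterwards is a quick way to see whether any values failed to parse.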

