I want to multiply two columns in a pandas DataFrame and add the result into a new column
If we're willing to sacrifice the succinctness of Hayden's solution, one could also do something like this:
In [22]: orders_df['C'] = orders_df.Action.apply(
lambda x: (1 if x == 'Sell' else -1))
In [23]: orders_df # New column C represents the sign of the transaction
Out[23]:
Prices Amount Action C
0 3 57 Sell 1
1 89 42 Sell 1
2 45 70 Buy -1
3 6 43 Sell 1
4 60 47 Sell 1
5 19 16 Buy -1
6 56 89 Sell 1
7 3 28 Buy -1
8 56 69 Sell 1
9 90 49 Buy -1
Now we have eliminated the need for the if statement. Using Series.apply(), we also do away with the explicit for loop. As Hayden noted, vectorized operations are generally faster.
In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C
In [25]: orders_df # The resulting dataframe
Out[25]:
Prices Amount Action C Value
0 3 57 Sell 1 171
1 89 42 Sell 1 3738
2 45 70 Buy -1 -3150
3 6 43 Sell 1 258
4 60 47 Sell 1 2820
5 19 16 Buy -1 -304
6 56 89 Sell 1 4984
7 3 28 Buy -1 -84
8 56 69 Sell 1 3864
9 90 49 Buy -1 -4410
This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.
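For comparison, the sign column can also be built without a Python-level lambda, since Series.apply still calls the lambda once per element. A minimal sketch, assuming the same column names as in the example and re-using only its first three rows; numpy.where is a swapped-in technique here, not part of the original answer:

```python
import numpy as np
import pandas as pd

# First three rows of the example, re-typed for illustration.
orders_df = pd.DataFrame({
    'Prices': [3, 89, 45],
    'Amount': [57, 42, 70],
    'Action': ['Sell', 'Sell', 'Buy'],
})

# +1 for 'Sell', -1 for anything else, computed on the whole column at once.
sign = np.where(orders_df['Action'] == 'Sell', 1, -1)
orders_df['Value'] = orders_df['Prices'] * orders_df['Amount'] * sign
print(orders_df)
```

This skips the intermediate C column; keep it if the sign is useful on its own.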
multiply and sum two columns in two dataframes in Python
You can just use pandas abstractions for it.
result = df['col1'] * df['col3']
If you then want the sum of those result values, you can just do:
sum(result)
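When the two columns live in different DataFrames, as the question title suggests, the product is aligned on the index. A small sketch, assuming two hypothetical frames df1 and df2 that share the default integer index (the names and values are made up for illustration):

```python
import pandas as pd

# Hypothetical data; df1, df2, and their values are for illustration only.
df1 = pd.DataFrame({'col1': [1, 2, 3]})
df2 = pd.DataFrame({'col3': [10, 20, 30]})

# Element-wise product, matched row by row on the index.
result = df1['col1'] * df2['col3']

# Series.sum() totals the products and skips NaN values by default.
total = result.sum()
print(total)
```

If the two frames do not share an index, rows without a partner produce NaN in the product, so it is worth checking the indexes first.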
Python code to multiply two columns and then create new column with values
df[newcolumn] = df['current']*df['voltage']
will work.
You can provide newcolumn as a variable.
def getPower(df, newColumn, numOfCol):
    for i in range(numOfCol):
        current = 'current#%d' % (i+1)
        voltage = 'voltage#%d' % (i+1)
        power = 'power#%d' % (i+1)
        df[power] = df[current]*df[voltage]
getPower(df, 'Power', numOfCols) would create the column.
EDIT: This will work if you named your current columns like 'current#1', 'current#2', ...
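A quick way to check the helper is to run it on a small frame. The sketch below assumes two hypothetical column pairs named to match the format strings in the function ('current#1', 'voltage#1', and so on); the sample values are made up:

```python
import pandas as pd

def getPower(df, newColumn, numOfCol):
    # newColumn is kept to match the call signature in the answer; the loop
    # derives the output names from the 'power#%d' pattern instead.
    for i in range(numOfCol):
        current = 'current#%d' % (i + 1)
        voltage = 'voltage#%d' % (i + 1)
        power = 'power#%d' % (i + 1)
        df[power] = df[current] * df[voltage]

# Hypothetical measurements for two current/voltage pairs.
df = pd.DataFrame({
    'current#1': [1.0, 2.0], 'voltage#1': [5.0, 5.0],
    'current#2': [0.5, 1.5], 'voltage#2': [12.0, 12.0],
})
getPower(df, 'Power', 2)
print(df[['power#1', 'power#2']])
```

The function modifies df in place and returns nothing, so the new columns appear on the frame that was passed in.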
Column multiplication to replace original columns within pandas DataFrame
You can align indices by converting both ID columns to indexes and then processing all columns:
df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")
print(df)
2018 2019 2020
ID
a1 0.6 1.2 1.8
b1 2.0 2.4 2.8
c1 NaN NaN NaN
df2 = pd.DataFrame({
'ID': ['a1', 'c1'],
'percentage': [0.6, 0.4]})
df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")
print(df)
2018 2019 2020
ID
a1 0.6 1.2 1.8
b1 NaN NaN NaN
c1 3.6 4.0 4.4
If you only need to multiply some columns:
cols = ['2018','2019']
df1 = df1.set_index('ID')
df1[cols] = df1[cols].multiply(df2.set_index('ID')["percentage"], axis="index")
print(df1)
2018 2019 2020
ID
a1 0.6 1.2 3
b1 NaN NaN 7
c1 3.6 4.0 11
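The snippets in this answer do not show how df1 was built. The following self-contained sketch assumes a df1 back-calculated from the printed results (rows 1, 2, 3 / 5, 6, 7 / 9, 10, 11); treat those values as a reconstruction, not the asker's actual data:

```python
import pandas as pd

# Assumed input, inferred from the outputs printed in the answer.
df1 = pd.DataFrame({
    'ID': ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11],
})
df2 = pd.DataFrame({'ID': ['a1', 'c1'], 'percentage': [0.6, 0.4]})

# With ID as the index on both sides, rows are matched by ID label.
pct = df2.set_index('ID')['percentage']
out = df1.set_index('ID').multiply(pct, axis='index')
print(out)
```

IDs missing from df2 (here b1) come out as NaN because there is no percentage to align with.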
Hi, why are you setting the index in the last part of your answer?
Because not setting the index before multiplying produces incorrect output:
df2 = pd.DataFrame({
'ID': ['a1', 'c1'],
'percentage': [0.6, 0.4]})
cols = ['2018', '2019', '2020']
df1[cols] = df1[cols].mul(df2["percentage"], axis=0)
print (df1)
ID 2018 2019 2020
0 a1 0.6 1.2 1.8
1 b1 2.0 2.4 2.8 <- incorrect result (aligned on index 1 not b1)
2 c1 NaN NaN NaN <- incorrect result (aligned on index 2 not c1)
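If ID should stay an ordinary column, another option is to look up each row's percentage by ID before multiplying. This is a sketch of a swapped-in technique (Series.map), not the answerer's method, and the df1 values are an assumption inferred from the outputs printed earlier:

```python
import pandas as pd

# Assumed input, inferred from the printed outputs.
df1 = pd.DataFrame({
    'ID': ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11],
})
df2 = pd.DataFrame({'ID': ['a1', 'c1'], 'percentage': [0.6, 0.4]})

# Map each ID in df1 to its percentage; IDs absent from df2 become NaN.
factor = df1['ID'].map(df2.set_index('ID')['percentage'])

# factor shares df1's integer index, so positional alignment is now correct.
cols = ['2018', '2019', '2020']
df1[cols] = df1[cols].mul(factor, axis=0)
print(df1)
```

Here the lookup is done by ID first, so the later multiplication on the default integer index no longer pairs b1 with the wrong percentage.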
Multiply two columns in a pandas DataFrame and create a new column containing the solution
Try:
df['Result'] = df['Carbon Footprint'] * df['Market Value']
Pandas Dataframe: Multiplying Two Columns
I believe that your ActualSalary column is a mix of strings and integers. That is the only way I've been able to recreate your error:
df = pd.DataFrame(
{'ActualSalary': ['44600', '58,000.00', '70,000.00', 17550, 34693, 15674],
'FTE': [1, 1, 1, 1, 1, 0.4]})
>>> df['ActualSalary'].str.replace(',', '').astype(float) * df['FTE']
0 44600.0
1 58000.0
2 70000.0
3 NaN
4 NaN
5 NaN
dtype: float64
The issue arises when you try to remove the commas:
>>> df['ActualSalary'].str.replace(',', '')
0 44600
1 58000.00
2 70000.00
3 NaN
4 NaN
5 NaN
Name: ActualSalary, dtype: object
First convert them to strings, before converting back to floats.
fte_salary = (
    df['ActualSalary'].astype(str).str.replace(',', '')  # Remove commas in string, e.g. '55,000.00' -> '55000.00'
    .astype(float)  # Convert string column to floats.
    .mul(df['FTE'])  # Multiply the salary column by the Full-Time-Equivalent (FTE) column.
)
>>> df.assign(FTESalary=fte_salary) # Assign new column to dataframe.
ActualSalary FTE FTESalary
0 44600 1.0 44600.0
1 58,000.00 1.0 58000.0
2 70,000.00 1.0 70000.0
3 17550 1.0 17550.0
4 34693 1.0 34693.0
5 15674 0.4 6269.6
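The same cleanup can also be written with pd.to_numeric, which is a swapped-in alternative to .astype(float) rather than part of the original answer. The sketch assumes the sample frame from this answer; with errors='coerce', any value that still cannot be parsed becomes NaN instead of raising:

```python
import pandas as pd

df = pd.DataFrame(
    {'ActualSalary': ['44600', '58,000.00', '70,000.00', 17550, 34693, 15674],
     'FTE': [1, 1, 1, 1, 1, 0.4]})

# Cast everything to str first so .str.replace also works on the integer entries.
cleaned = df['ActualSalary'].astype(str).str.replace(',', '', regex=False)

# Parse to numbers; unparseable strings turn into NaN instead of raising.
salary = pd.to_numeric(cleaned, errors='coerce')

df['FTESalary'] = salary * df['FTE']
print(df)
```

Coercing is convenient for messy exports, but it can hide bad rows, so counting the NaN values afterwards is a sensible check.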