Subtract one dataframe from another excluding the first column Pandas
Simply subtract the entire DataFrames, then reassign the desired values to the Wavelength column.
result = df_tot - df_nap
result['Wavelength'] = df_tot['Wavelength']
For example,
import numpy as np
import pandas as pd
df_tot = pd.DataFrame(np.random.randint(10, size=(3,4)), columns=list('ABCD'))
df_nap = pd.DataFrame(np.random.randint(10, size=(3,4)), columns=list('ABCD'))
# df_tot['A'] = df_nap['A'] # using column A as the "Wavelength" column
result = df_tot - df_nap
result['A'] = df_tot['A']
Alternatively, or if the Wavelength column is not numeric, you can
subtract everything except Wavelength and then reassign that column:
result = df_tot.drop('Wavelength', axis=1) - df_nap.drop('Wavelength', axis=1)
result['Wavelength'] = df_tot['Wavelength']
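As a minimal sketch of the drop-then-subtract variant, here is a self-contained run with a non-numeric Wavelength column. The sample values and the Flux and Error column names are made up for illustration. It uses DataFrame.insert instead of plain assignment only so that Wavelength ends up as the first column again.

```python
import pandas as pd

# Hypothetical sample data; only the 'Wavelength' name comes from the answer above.
df_tot = pd.DataFrame({'Wavelength': ['400nm', '500nm', '600nm'],
                       'Flux': [10.0, 12.0, 9.0],
                       'Error': [1.0, 2.0, 3.0]})
df_nap = pd.DataFrame({'Wavelength': ['400nm', '500nm', '600nm'],
                       'Flux': [4.0, 5.0, 6.0],
                       'Error': [0.5, 0.5, 0.5]})

# Subtract only the numeric columns; string columns cannot be subtracted.
result = df_tot.drop('Wavelength', axis=1) - df_nap.drop('Wavelength', axis=1)

# Put the key column back in position 0.
result.insert(0, 'Wavelength', df_tot['Wavelength'])
print(result)
```

Both frames are assumed to have the same row order and index; if they do not, align them first (for example by setting Wavelength as the index).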
How to take the difference of two dataframes except one column in pandas?
I believe you need to select the column names with Index.difference and then use DataFrame.sub:
cols = df1.columns.difference(['wave'])
# multiple columns can be excluded as well
# cols = df1.columns.difference(['wave', 'MeasredWave'])
# assigning to df1[cols] leaves the other columns of df1 untouched
df1[cols] = df1[cols].sub(df2[cols])
print (df1)
wave num stlines fwhm EWs MeasredWave
0 4050.32 0.0 0.006340 -0.00412 -0.715440 0.000891
1 4208.98 0.0 -0.002455 0.00086 0.227375 -0.000851
2 4374.94 3.0 -0.001930 0.00274 1.853900 -0.000539
3 4379.74 8.0 0.014970 -0.00733 -0.882590 0.001262
4 4398.01 8.0 -0.004660 0.01392 7.048620 0.003309
5 4502.21 -1.0 -0.009540 0.00055 -0.692110 -0.000544
cols = df1.columns.difference(['wave'])
# assigning to df2[cols] leaves the other columns of df2 untouched
df2[cols] = df1[cols].sub(df2[cols])
print (df2)
wave num stlines fwhm EWs MeasredWave
0 4050.32 0.0 0.006340 -0.00412 -0.715440 0.000891
1 4208.98 0.0 -0.002455 0.00086 0.227375 -0.000851
2 4374.94 3.0 -0.001930 0.00274 1.853900 -0.000539
3 4379.74 8.0 0.014970 -0.00733 -0.882590 0.001262
4 4398.01 8.0 -0.004660 0.01392 7.048620 0.003309
5 4502.21 -1.0 -0.009540 0.00055 -0.692110 -0.000544
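The frames df1 and df2 are not shown in the answer, so here is a small self-contained sketch of the same idea. The values are invented and only the wave, num and fwhm column names are borrowed from the output above. It writes into a copy so neither input is modified.

```python
import pandas as pd

# Hypothetical inputs with a shared key column 'wave'.
df1 = pd.DataFrame({'wave': [4050.32, 4208.98], 'num': [3.0, 5.0], 'fwhm': [0.5, 0.25]})
df2 = pd.DataFrame({'wave': [4050.32, 4208.98], 'num': [1.0, 5.0], 'fwhm': [0.25, 0.5]})

# Every column except 'wave' (Index.difference returns the names sorted).
cols = df1.columns.difference(['wave'])

# Work on a copy so df1 and df2 stay as they were.
result = df1.copy()
result[cols] = df1[cols].sub(df2[cols])
print(result)
```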
How to Subtract one column in pandas from another?
It looks like you want to create new rows. You can index the dataframe by Accounts,
which has the added advantage that the remaining columns are exactly the values you want to subtract. Then subtract the rows and assign the result to a new row.
>>> df = pd.DataFrame({'Accounts':['Cash','Build','Build Dep', 'Car', 'Car Dep'],
... 'Debits':[300,500,0,100,0],
... 'Credits':[0,0,250,0,50]})
>>>
>>> df = df.set_index('Accounts')
>>> df.loc['Build Delta'] = df.loc['Build Dep'] - df.loc['Build']
>>> df.loc['Car Delta'] = df.loc['Car'] - df.loc['Car Dep']
>>>
>>> print(df)
Debits Credits
Accounts
Cash 300 0
Build 500 0
Build Dep 0 250
Car 100 0
Car Dep 0 50
Build Delta -500 250
Car Delta 100 -50
If you want a column of deltas for all of the rows, just subtract the columns. This is the beauty of NumPy and pandas: you can apply operations to entire columns with very little code and get better performance than an equivalent loop in plain Python.
>>> df = pd.DataFrame({'Accounts':['Cash','Build','Build Dep', 'Car', 'Car Dep'],
... 'Debits':[300,500,0,100,0],
... 'Credits':[0,0,250,0,50]})
>>>
>>> df = df.set_index('Accounts')
>>>
>>> df['Delta'] = df['Credits'] - df['Debits']
>>> df
Debits Credits Delta
Accounts
Cash 300 0 -300
Build 500 0 -500
Build Dep 0 250 250
Car 100 0 -100
Car Dep 0 50 50
Subtract pandas dataframes while leaving some columns intact
You can use:
df1.set_index(['idx','stat'], inplace=True)
df2.set_index('idx', inplace=True)
print (df1.sub(df2[['val']]))
val
idx stat
1 1 NaN
2 1 2.0
3 2 NaN
4 3 4.0
print (df1.sub(df2[['val']]).reset_index())
idx stat val
0 1 1 NaN
1 2 1 2.0
2 3 2 NaN
3 4 3 4.0
If idx is already the index in both DataFrames:
print (df1)
stat val
idx
1 1 5
2 1 10
3 2 15
4 3 20
print (df2)
stat val
idx
2 1 8
4 5 16
df1.set_index('stat', append=True, inplace=True)
print (df1.sub(df2[['val']]).reset_index())
idx stat val
0 1 1 NaN
1 2 1 2.0
2 3 2 NaN
3 4 3 4.0
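For comparison, here is a sketch of a different route to the same table that avoids the MultiIndex: keep idx as the index on both frames and subtract only the val Series, which pandas aligns on idx. This is an alternative technique, not the one used in the answer above; the data is the df1 and df2 printed there.

```python
import pandas as pd

# Same data as printed above, with idx as the index of both frames.
df1 = pd.DataFrame({'idx': [1, 2, 3, 4],
                    'stat': [1, 1, 2, 3],
                    'val': [5, 10, 15, 20]}).set_index('idx')
df2 = pd.DataFrame({'idx': [2, 4],
                    'stat': [1, 5],
                    'val': [8, 16]}).set_index('idx')

result = df1.copy()
# Series subtraction aligns on the index; idx values missing from df2 become NaN.
result['val'] = df1['val'] - df2['val']
result = result.reset_index()
print(result)
```

The stat column is taken from df1 unchanged, and rows whose idx has no match in df2 keep NaN in val.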
Subtract a dataframe with some matching and non matching columns and indexes
Use pd.DataFrame.sub with fill_value, then fillna for the cells that are still missing (those absent from both DataFrames):
df_add.sub(df_sub, fill_value=0).fillna(0)
Output:
1 2 3 4
A 1.1 1.2 1.3 1.4
B 2.1 -2.8 2.3 -5.6
C 0.0 -6.0 0.0 -9.0
D 3.1 -3.8 3.3 -6.6
E 4.1 4.2 4.3 4.4
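The input frames are not shown, so the following is a small sketch with invented data that has partly overlapping rows and columns. It illustrates what each step does: fill_value=0 treats a cell missing from only one frame as 0, and fillna(0) handles cells that are missing from both.

```python
import pandas as pd

# Hypothetical frames: they share row 'B' and column 2 only.
df_add = pd.DataFrame({1: [1.0, 2.0], 2: [3.0, 4.0]}, index=['A', 'B'])
df_sub = pd.DataFrame({2: [1.0, 5.0], 3: [7.0, 8.0]}, index=['B', 'C'])

# Cells present in one frame only are computed against 0.
partial = df_add.sub(df_sub, fill_value=0)

# Cells present in neither frame (A/3 and C/1 here) are still NaN; set them to 0.
result = partial.fillna(0)
print(result)
```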
How to subtract columns of one dataframe to that of another in python and store the result and both columns in a new dataframe
Yes, you can merge on column "X" to create a single dataframe with all values, then iterate through the columns and append the calculated dataframes to a list.
Something like:
df = pd.merge(left=df1, right=df2, on='X', how='inner')
output_frames = []
letters = [i[-1] for i in df.columns if i != 'X']  # last character of each column name
letters = sorted(set(letters))  # unique letters, in a stable order
for letter in letters:
    # .copy() so that adding columns does not trigger SettingWithCopyWarning
    newdf = df.loc[:, ['X'] + [i for i in df.columns if i.endswith(letter)]].copy()
    newdf['Difference'] = newdf[f'par {letter}'] - newdf[f'bar {letter}']
    newdf['Status'] = newdf.apply(lambda x: 'OK' if x.Difference >= 0 else 'NOT OK', axis=1)
    output_frames.append(newdf)
Then view output:
for frame in output_frames:
    print(frame)
Or better still, merge the outputs:
df = pd.concat(output_frames, axis=1)
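To make the merge approach easy to try, here is a self-contained sketch with invented sample data. The 'par <letter>' and 'bar <letter>' column names follow the snippet above; the values, the dict used to hold the per-letter frames, and the use of np.where in place of apply are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs sharing the key column 'X'.
df1 = pd.DataFrame({'X': [1, 2, 3], 'par a': [10, 20, 30], 'par b': [5, 5, 5]})
df2 = pd.DataFrame({'X': [1, 2, 3], 'bar a': [7, 25, 30], 'bar b': [1, 9, 5]})

merged = pd.merge(left=df1, right=df2, on='X', how='inner')

# Unique trailing letters of the value columns, in sorted order.
letters = sorted({c[-1] for c in merged.columns if c != 'X'})

output_frames = {}
for letter in letters:
    frame = merged[['X', f'par {letter}', f'bar {letter}']].copy()
    frame['Difference'] = frame[f'par {letter}'] - frame[f'bar {letter}']
    # Vectorised equivalent of the row-wise apply used above.
    frame['Status'] = np.where(frame['Difference'] >= 0, 'OK', 'NOT OK')
    output_frames[letter] = frame

for letter, frame in output_frames.items():
    print(letter)
    print(frame)
```

Keeping the frames in a dict keyed by letter makes it easy to pick one out later. Note that concatenating them side by side repeats the X, Difference and Status column names once per letter.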
R: Subtracting two dataframes except some of the columns
If you want to keep the variables Year, Quarter, and Group, you can do:
cbind(BMH[ ,1:3], BMH[,-c(1:3)] - BML[,-c(1:3)])
and, if needed, convert the result back to a data frame with as.data.frame()