Insert a row in a pandas DataFrame based on conditions
You can use this logic:
import pandas as pd

df = pd.DataFrame({"count": ["yes", "yes", "yes", "yes", "yes"],
                   "A": [23, 23, 40, 40, 40]})

# Label each run of consecutive equal values in "A": 1, 1, 2, 2, 2
groups = (df["A"].shift() != df["A"]).cumsum()

# DataFrame.append was removed in pandas 2.0, so collect the pieces
# in a list and concatenate them once at the end.
pieces = []
for _, group in df.groupby(groups):
    pieces.append(group[["count", "A"]])
    pieces.append(pd.DataFrame({"count": ["result"], "A": [None]}))
new_df = pd.concat(pieces, ignore_index=True)

# Depending on the pandas version, the empty "A" cells may print as NaN
# rather than None.
print(new_df)
Output:
    count     A
0     yes    23
1     yes    23
2  result  None
3     yes    40
4     yes    40
5     yes    40
6  result  None
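The grouping key (df['A'].shift() != df['A']).cumsum() is the part that does the work, so it may help to look at its intermediate values. The sketch below uses the same "A" values as the answer; the variable names starts and run_id are only illustrative.

```python
import pandas as pd

a = pd.Series([23, 23, 40, 40, 40], name="A")

# True wherever the value differs from the previous row. The first row is
# always True because shift() leaves NaN in that position.
starts = a.shift() != a

# A running count of the True flags gives one label per run of equal values.
run_id = starts.cumsum()

print(pd.DataFrame({"A": a, "starts": starts, "run_id": run_id}))
```

Grouping by run_id rather than by "A" itself keeps two separate runs of the same value apart, which is why a "result" row can be added after every run.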
How to add rows based on a condition with another dataframe
import pandas as pd
import numpy as np
First, convert the 'date' column of the payments dataframe to datetime dtype with pd.to_datetime():
payments['date']=pd.to_datetime(payments['date'])
Next, aggregate per agreement with groupby(): sum the payments, take the earliest date, and keep the first cust_id:
newdf=payments.groupby('agreement_id').agg({'payment':'sum','date':'min','cust_id':'first'}).reset_index()
Now use boolean masking to keep only the rows that meet your condition (total paid equals the agreement's total fee):
newdf=newdf[agreement['total_fee']==newdf['payment']].assign(payment=np.nan)
Note: assign() sets the payment column of the selected rows to NaN. The comparison works row by row on the index, so agreement and newdf must have identical row labels.
Now use pd.tseries.offsets.DateOffset together with apply() to add each agreement's term to its start date:
newdf['date']=newdf['date']+agreement['term_months'].apply(lambda x:pd.tseries.offsets.DateOffset(months=x))
Note: this line emits a warning (typically a PerformanceWarning, because adding a Series of DateOffset objects is not vectorized). It is a warning, not an error, and the result is still correct.
Finally, combine both dataframes with concat() and replace the NaN payments with fillna():
result=pd.concat((payments,newdf),ignore_index=True).fillna(0)
If you print result now, you get the desired output:
    cust_id agreement_id       date  payment
0         1            A 2020-12-01    200.0
1         1            A 2021-02-02    200.0
2         1            A 2021-02-03    100.0
3         1            A 2021-05-01    200.0
4         1            B 2021-01-02     50.0
5         1            B 2021-01-09     20.0
6         1            B 2021-03-01     80.0
7         1            B 2021-04-23     90.0
8         2            C 2021-01-21    600.0
9         3            D 2021-03-04    150.0
10        3            D 2021-05-03    150.0
11        2            C 2021-07-21      0.0
12        3            D 2021-09-04      0.0
Note: to match the expected output exactly, use astype() to change the dtype of the payment column from float to int:
result['payment']=result['payment'].astype(int)
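The steps above can be put together as one runnable sketch. The payments frame is reconstructed from rows 0 to 10 of the output; the agreement frame is not shown in the answer, so its total_fee and term_months values here are assumptions chosen so that only agreements C and D are fully paid. This variant also swaps two details: it joins on agreement_id with merge() instead of relying on matching row order, and it adds the offsets in a list comprehension instead of apply().

```python
import pandas as pd

# Payments reconstructed from rows 0-10 of the printed output.
payments = pd.DataFrame({
    "cust_id": [1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 3],
    "agreement_id": ["A", "A", "A", "A", "B", "B", "B", "B", "C", "D", "D"],
    "date": ["2020-12-01", "2021-02-02", "2021-02-03", "2021-05-01",
             "2021-01-02", "2021-01-09", "2021-03-01", "2021-04-23",
             "2021-01-21", "2021-03-04", "2021-05-03"],
    "payment": [200, 200, 100, 200, 50, 20, 80, 90, 600, 150, 150],
})
payments["date"] = pd.to_datetime(payments["date"])

# ASSUMED agreement table (not shown in the original answer).
agreement = pd.DataFrame({
    "agreement_id": ["A", "B", "C", "D"],
    "total_fee": [1000, 500, 600, 300],
    "term_months": [12, 12, 6, 6],
})

# Per agreement: total paid, first payment date, customer.
totals = (payments.groupby("agreement_id")
          .agg(payment=("payment", "sum"),
               date=("date", "min"),
               cust_id=("cust_id", "first"))
          .reset_index())

# Join on the key so the comparison does not depend on row order.
merged = totals.merge(agreement, on="agreement_id")
paid = merged[merged["payment"] == merged["total_fee"]].copy()

# Move each start date forward by that agreement's own term.
paid["date"] = [d + pd.DateOffset(months=int(m))
                for d, m in zip(paid["date"], paid["term_months"])]
paid["payment"] = 0

result = pd.concat([payments, paid[payments.columns]], ignore_index=True)
print(result.tail(2))
```

Because the new rows get an integer 0 directly, the payment column stays integer and the final astype(int) step is not needed in this variant.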
Append row in dataframe if certain condition is met in another row of the dataframe
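Try this logic:
import pandas as pd

# select all rows with LOCATION 1150
mask = df.LOCATION == 1150
# halve their AMOUNT in place
df.loc[mask, 'AMOUNT'] /= 2
# append copies of those rows with the new LOCATION value
# (DataFrame.append was removed in pandas 2.0, so use pd.concat)
df = pd.concat([df, df.loc[mask].assign(LOCATION=2051)], ignore_index=True)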
Output:
   BILL_NO CREATED_DATE  ACCT_NO  LOCATION  AMOUNT
0      100     4/6/2021     7551      1150   500.0
1      200     4/6/2021     7551      1101   500.0
2      300     4/6/2021     7551      2025   700.0
3      100     4/6/2021     7551      2051   500.0
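For a version that runs on its own, here is a minimal sketch with the input frame reconstructed from the output above. It assumes the first AMOUNT was 1000 before halving and stores AMOUNT as float so the in-place division does not change the column dtype.

```python
import pandas as pd

# Input reconstructed from the printed output (first AMOUNT assumed to be 1000).
df = pd.DataFrame({
    "BILL_NO": [100, 200, 300],
    "CREATED_DATE": ["4/6/2021", "4/6/2021", "4/6/2021"],
    "ACCT_NO": [7551, 7551, 7551],
    "LOCATION": [1150, 1101, 2025],
    "AMOUNT": [1000.0, 500.0, 700.0],
})

# Rows that should be split between the old and the new location.
mask = df["LOCATION"] == 1150

# Halve the amount on the original rows.
df.loc[mask, "AMOUNT"] /= 2

# Copy the (already halved) rows, relabel them, and add them at the end.
out = pd.concat([df, df.loc[mask].assign(LOCATION=2051)], ignore_index=True)
print(out)
```

The order matters: halving first means the copied rows already carry the reduced amount, so the two halves still add up to the original total.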
pandas: append rows to another dataframe under the similar row based on column condition
The first idea is to filter the df2 rows by the values in df1.col1, append them to df1 with concat, and then sort with DataFrame.sort_values:
df = pd.concat([df1, df2[(df2.col1.isin(df1.col1))]]).sort_values('col1', ignore_index=True)
print(df)
                       col1 col2 col3
0              I ate dinner  min  min
1              I ate dinner  max  max
2              I ate dinner  min  max
3  the play was inetresting  mid  max
4  the play was inetresting  min  max
5  the play was inetresting  max  mid
If you need only the values that are common to both DataFrames, you can filter with numpy.intersect1d:
import numpy as np

# values of col1 that occur in both frames
common = np.intersect1d(df1['col1'], df2['col1'])
df = (pd.concat([df1[df1.col1.isin(common)],
                 df2[df2.col1.isin(common)]])
        .sort_values('col1', ignore_index=True))
print(df)
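The two variants differ only in what happens to df1 rows that have no match in df2. The sketch below shows this on small made-up frames; the original df1 and df2 are not shown in the answer, so all of the data here is hypothetical. It also passes kind="mergesort", a stable sort, so that rows from df1 stay ahead of rows from df2 within each col1 value.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: each frame has one sentence the other lacks.
df1 = pd.DataFrame({
    "col1": ["I ate dinner", "the play was interesting", "only in df1"],
    "col2": ["min", "mid", "max"],
})
df2 = pd.DataFrame({
    "col1": ["I ate dinner", "the play was interesting", "only in df2"],
    "col2": ["max", "min", "mid"],
})

# Variant 1: keep all of df1, add the df2 rows whose col1 occurs in df1.
appended = (pd.concat([df1, df2[df2["col1"].isin(df1["col1"])]])
            .sort_values("col1", kind="mergesort", ignore_index=True))

# Variant 2: keep only sentences present in both frames.
common = np.intersect1d(df1["col1"], df2["col1"])
both = (pd.concat([df1[df1["col1"].isin(common)],
                   df2[df2["col1"].isin(common)]])
        .sort_values("col1", kind="mergesort", ignore_index=True))

print(appended)
print(both)
```

With these inputs, appended still contains the "only in df1" row while both drops it; neither result contains "only in df2".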