How to fill empty cell value in pandas with condition
It looks like you want to fill forward where there is missing data.
You can do this with 'fillna', which is available on pd.DataFrame objects.
In your case, you only want to fill forward within each item, so group by item first and then fill. Forward filling (the 'pad' method, also spelled 'ffill') carries the last valid value forward in row order, which is why we sort by year first. Recent pandas versions deprecate passing method= to fillna, so ffill() is used here:
df['final_sales'] = df.sort_values('Year').groupby('Item')['final_sales'].ffill()
Note that on your example data, A3 is missing for 2016 as well, so there is nothing to carry forward and it remains missing for 2017.
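A minimal sketch of the grouped forward fill on made-up data. The column names Item, Year and final_sales follow the answer above; the values are invented for illustration only, and A3 is left entirely missing to show that nothing gets carried forward in that case.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the column names come from the answer above.
df = pd.DataFrame({
    "Item": ["A1", "A1", "A2", "A2", "A3", "A3"],
    "Year": [2017, 2016, 2016, 2017, 2016, 2017],
    "final_sales": [np.nan, 10.0, 20.0, np.nan, np.nan, np.nan],
})

# Sort by Year so that "forward" means "from earlier to later years",
# then forward-fill within each Item. The assignment aligns on the index,
# so the original row order of df is preserved.
df["final_sales"] = df.sort_values("Year").groupby("Item")["final_sales"].ffill()
print(df)
```

Here A1 and A2 pick up their 2016 value for 2017, while both A3 rows stay NaN because the group has no valid value to carry forward.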
Pandas(Python) : How to fill empty cells with previous row value with conditions
Try:
import numpy as np
import pandas as pd
df = pd.DataFrame({'order id': ['A1', 'A1', 'A2', 'A3'], 'Name': ['Adam', np.nan, np.nan, 'su']})
df['Name'] = df['Name'].groupby(df['order id']).ffill()
df
  order id  Name
0       A1  Adam
1       A1  Adam
2       A2   NaN
3       A3    su
How to fill dataframe's empty/nan cell with conditional column mean
Two kinds of average are shown below, and they give different fill values: the normal average, which excludes NaN, and an average computed as the group sum divided by the number of rows in the group, so rows containing NaN are counted in the denominator.
df['Revenue'] = df['Revenue'].replace({r'\$': '', ',': ''}, regex=True)
df['Revenue'] = df['Revenue'].astype(float)
df_mean = df.groupby(['Industry'], as_index = False)['Revenue'].mean()
df_mean
   Industry                 Revenue
0  Construction        4.358071e+06
1  Financial Services  8.858420e+06
2  IT Services         1.175702e+07
df_mean_nan = df.groupby(['Industry'], as_index=False)['Revenue'].agg(Sum='sum', Size='size')
df_mean_nan['Mean_nan'] = df_mean_nan['Sum'] / df_mean_nan['Size']
df_mean_nan
   Industry                   Sum  Size    Mean_nan
0  Construction        13074212.0   5.0   2614842.4
1  Financial Services  17716840.0   2.0   8858420.0
2  IT Services         11757018.0   1.0  11757018.0
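Fill using the average that counts the NaN rows in the denominator (every missing value in this data belongs to Construction, so a single value is used):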
df.loc[df['Revenue'].isna(),['Revenue']] = df_mean_nan.loc[df_mean_nan['Industry'] == 'Construction',['Mean_nan']].values
df
   ID  Name       Industry            Year     Revenue
0   1  Treslam    Financial Services  2009   5387469.0
1   2  Rednimdox  Construction        2013   2614842.4
2   3  Lamtone    IT Services         2009  11757018.0
3   4  Stripfind  Financial Services  2010  12329371.0
4   5  Openjocon  Construction        2013   4273207.0
5   6  Villadox   Construction        2012   1097353.0
6   7  Sumzoomit  Construction        2010   7703652.0
7   8  Abcddd     Construction        2019   2614842.4
Alternatively, starting again from the unfilled data, fill using the normal average (NaN is excluded):
df.loc[df['Revenue'].isna(),['Revenue']] = df_mean.loc[df_mean['Industry'] == 'Construction',['Revenue']].values
df
   ID  Name       Industry            Year       Revenue
0   1  Treslam    Financial Services  2009  5.387469e+06
1   2  Rednimdox  Construction        2013  4.358071e+06
2   3  Lamtone    IT Services         2009  1.175702e+07
3   4  Stripfind  Financial Services  2010  1.232937e+07
4   5  Openjocon  Construction        2013  4.273207e+06
5   6  Villadox   Construction        2012  1.097353e+06
6   7  Sumzoomit  Construction        2010  7.703652e+06
7   8  Abcddd     Construction        2019  4.358071e+06
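The fills above hard-code 'Construction', which only works because every missing value happens to be in that industry. As a hedged sketch of a more general variant, the group mean can be broadcast back to each row with transform and passed to fillna. The data below is made up (loosely modelled on the table above, with an extra missing value in IT Services) and is not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data; NaN marks a missing Revenue.
df = pd.DataFrame({
    "Industry": ["Financial Services", "Construction", "IT Services",
                 "Financial Services", "Construction", "IT Services"],
    "Revenue": [5387469.0, np.nan, 11757018.0, 12329371.0, 4273207.0, np.nan],
})

# Mean per industry, repeated on every row of that industry.
# mean() skips NaN, so this is the "normal" average.
group_mean = df.groupby("Industry")["Revenue"].transform("mean")

# Fill each missing value with the mean of its own industry.
df["Revenue"] = df["Revenue"].fillna(group_mean)
print(df)
```

With this approach each industry gets its own fill value, and nothing needs to change when missing values appear in other groups.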
fill blanks in a column based on conditions pandas
Here is one way. Let's call your dataframe df. The first step handles the cells ending in A or N.
import numpy as np
# create the mask for cells ending in A or N
mask_AN = (df['cells'].str[-1] == 'A') | (df['cells'].str[-1] == 'N')
# create the column final_value and write
# 1 if the value should come from the column npv and
# 2 if the value should come from the column scpci
df.loc[mask_AN, 'final_value'] = np.where((df.loc[mask_AN, 'scpci'] % 3 == 0)
                                          & (df.loc[mask_AN, 'npv'] % 3 != 0), 2, 1)
np.where works as follows: final_value should come from scpci (so 2 for now) only if the column scpci is divisible by 3 while the column npv is not; otherwise final_value comes from npv (so 1).
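As a small illustration of that rule in isolation, here is a sketch on three npv/scpci pairs taken from the expected output further down. The column name source is a hypothetical stand-in for the intermediate 1/2 marker.

```python
import numpy as np
import pandas as pd

# Three npv/scpci pairs borrowed from the expected output table.
df = pd.DataFrame({"npv": [343, 171, 72], "scpci": [276, 12, 500]})

# True only where scpci is divisible by 3 and npv is not.
use_scpci = (df["scpci"] % 3 == 0) & (df["npv"] % 3 != 0)

# 2 means "take scpci", 1 means "take npv" ("source" is a hypothetical name).
df["source"] = np.where(use_scpci, 2, 1)
print(df)
```

Only the first row selects scpci: 276 is divisible by 3 and 343 is not. In the second row both values are divisible by 3, and in the third scpci is not, so both fall back to npv.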
The next step is to fill final_value for the other rows of the same site with the value from the cell ending in A or N. This can be done with:
df['final_value'] = df.groupby('Site')['final_value'].ffill() # fill forward
Note that the filling works here because a cell ending in 'A' always appears before the ones ending in 'B' or 'C' (except when the cell is unique), and likewise a cell ending in 'N' appears before the ones ending in 'O' and 'P'. This ffill might not work if your data does not always follow this order.
Finally, you need to handle the sites with a single row:
# index labels of the sites that have a single cell
df_g = df.reset_index().groupby('Site')
mask_unique = df_g['index'].first()[df_g['cells'].count() == 1]
# same idea as before for writing 1 or 2 in the final_value column
df.loc[mask_unique, 'final_value'] = np.where((df.loc[mask_unique, 'scpci'] % 3 == 0)
                                              & (df.loc[mask_unique, 'npv'] % 3 != 0), 2, 1)
Now that you have 1 or 2 in the final_value column, you just need to replace it with the value from the associated column:
df['final_value'] = np.where(df['final_value'] == 1, df['npv'], df['scpci'])
The output is as expected:
      Site    cells  Azimut  technology  npv  scpci  final_value
 0  T30264  G30264B     130     UMTS900  343    276          276
 1  T30992  G30992A      10     UMTS900  171     12          171
 2  T30992  G30992B     260     UMTS900  173     13          173
 3  T30992  U30992A      10    UMTS2100  171     12          171
 4  T30992  U30992B     260    UMTS2100  173     13          173
 5  T31520  G31520A       0     UMTS900   72    500           72
 6  T31520  G31520B     120     UMTS900   73    501           73
 7  T31520  G31520C     220     UMTS900   74    502           74
 8  T31548  G31548A      30     UMTS900   93    450           93
 9  T31548  G31548B     130     UMTS900   94    451           94
10  T31548  G31548C     250     UMTS900   95    452           95
11  T31548  U31548N      30    UMTS2100   94    450          450
12  T31548  U31548O     130    UMTS2100   95    451          451
13  T31548  U31548P     250    UMTS2100   96    452          452
Fill blank cells with another column value in Python
replace() the empty strings with nan and then chain a couple of fillna() calls:
df.C = df.C.replace(r'^\s*$', np.nan, regex=True).fillna(df.A).fillna(df.B)
#             A           B           C
# 0         xyz         NaN  12.03.2010
# 1         abc         NaN  01.10.2009
# 2         NaN  14.11.2010  14.11.2010
# 3  02.10.2010         NaN  02.10.2010
Alternatively, start with str.strip() to make the replacement simpler:
df.C = df.C.str.strip().replace('', np.nan).fillna(df.A).fillna(df.B)
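For a runnable version, here is a sketch with an assumed input. The A and B columns and the final C values match the output shown above, but the blank entries in C (an empty string and a whitespace-only string) are guesses, since the original input is not shown.

```python
import numpy as np
import pandas as pd

# Assumed input: rows 2 and 3 of C are blank (empty / whitespace only).
df = pd.DataFrame({
    "A": ["xyz", "abc", np.nan, "02.10.2010"],
    "B": [np.nan, np.nan, "14.11.2010", np.nan],
    "C": ["12.03.2010", "01.10.2009", "", "   "],
})

# Turn blank strings into NaN, then fall back to A, and to B where A is also NaN.
df["C"] = (
    df["C"]
    .replace(r"^\s*$", np.nan, regex=True)
    .fillna(df["A"])
    .fillna(df["B"])
)
print(df)
```

The order of the two fillna() calls sets the priority: A is tried first, and B is only used where A is missing as well.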
Replace blank value in dataframe based on another column condition
Hi, I used the code below and it worked:
b = [52]
df.Item=np.where(df.Department.isin(b),df.Item.fillna(2515),df.Item)
a = [7]
df.Item=np.where(df.Department.isin(a),df.Item.fillna(45),df.Item)
Hope it helps someone who faces the same issue.
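When there are several departments, the same idea can be sketched with a lookup dictionary instead of one np.where per department. The department codes and default values 52 -> 2515 and 7 -> 45 come from the code above; the name defaults, the extra department 3 and the sample rows are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; department 3 has no default on purpose.
df = pd.DataFrame({
    "Department": [52, 52, 7, 7, 3],
    "Item": [100.0, np.nan, np.nan, 45.0, np.nan],
})

# Default Item per Department ("defaults" is a hypothetical name).
defaults = {52: 2515, 7: 45}

# map() builds a per-row default (NaN where the department is not listed),
# and fillna() only uses it where Item is missing.
df["Item"] = df["Item"].fillna(df["Department"].map(defaults))
print(df)
```

Rows whose department is not in the dictionary keep their NaN, and existing Item values are never overwritten.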