Replace Missing Values With Column Mean

Replace missing values with column mean

A relatively simple modification of your code should solve the issue:

# Loop over the columns and overwrite each column's NA entries with its mean
for (i in seq_len(ncol(data))) {
  data[is.na(data[, i]), i] <- mean(data[, i], na.rm = TRUE)
}

Replace missing values with the mean of each variable in Python

The last few columns are strings, not floats.

Try converting to float before taking the mean:

import pandas as pd

# Make sure everything is numeric; strings that cannot be parsed become NaN
num_df = num_df.apply(pd.to_numeric, errors='coerce')
# Fill NaN with each column's mean
num_df = num_df.fillna(num_df.mean())

print(num_df)

   id    sod  pot  hemo  pcv    wc    rc
0   0  111.0  2.5  15.4   44  7800  5.20
1   1  111.0  2.5  11.3   38  6000  4.55
2   2  111.0  2.5   9.6   31  7500  4.55
3   3  111.0  2.5  11.2   32  6700  3.90
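
As a self-contained sketch of the same two steps, the snippet below builds a small frame, coerces the string columns to numbers, and fills the gaps with each column's mean. The values and the '?' missing marker are made up for illustration; only the column names echo the output above.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings, with '?' marking missing entries.
num_df = pd.DataFrame({
    "hemo": ["15.4", "11.3", "9.6", "11.2"],
    "pcv": ["44", "38", "31", "32"],
    "rc": ["5.2", "?", "?", "3.9"],
})

# Anything that cannot be parsed as a number becomes NaN.
num_df = num_df.apply(pd.to_numeric, errors="coerce")

# Fill each column's NaN values with that column's mean.
num_df = num_df.fillna(num_df.mean())

print(num_df)
```

Here the two missing rc entries are filled with 4.55, the mean of 5.2 and 3.9.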

pandas DataFrame: replace nan values with average of columns

You can simply use DataFrame.fillna to fill the NaNs directly:

In [27]: df 
Out[27]: 
          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3       NaN -2.027325  1.533582
4       NaN       NaN  0.461821
5 -0.788073       NaN       NaN
6 -0.916080 -0.612343       NaN
7 -0.887858  1.033826       NaN
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431

In [28]: df.mean()
Out[28]: 
A   -0.151121
B   -0.231291
C   -0.530307
dtype: float64

In [29]: df.fillna(df.mean())
Out[29]: 
          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3 -0.151121 -2.027325  1.533582
4 -0.151121 -0.231291  0.461821
5 -0.788073 -0.231291 -0.530307
6 -0.916080 -0.612343 -0.530307
7 -0.887858  1.033826 -0.530307
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431

The fillna docstring used to say that value should be a scalar or a dict; however, a Series works as well, and current pandas documentation lists scalar, dict, Series, and DataFrame as accepted types. If you want to pass a dict, you could use df.mean().to_dict().
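
To make the Series and dict variants concrete, here is a minimal sketch on a tiny made-up frame (the data is hypothetical, not the frame shown above). Both calls should produce the same result.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with gaps in both columns.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, 4.0, 8.0],
})

# Passing a Series of column means ...
filled_series = df.fillna(df.mean())

# ... gives the same result as passing the equivalent dict.
filled_dict = df.fillna(df.mean().to_dict())

print(filled_series.equals(filled_dict))
```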

Replace missing value with mean of class within column

Using dplyr, you could group_by class and apply NA2mean (a helper, assumed to be defined already, that replaces NA with the mean of its input) to every column.

library(dplyr)
DF %>% group_by(class) %>% mutate_all(NA2mean)

In newer versions of dplyr (1.0.0 and later), you can do this with across():

DF %>% group_by(class) %>% mutate(across(everything(), NA2mean))

Replace value with the average of its column with Pandas

The first thing to recognize is that the columns containing 'x' do not have an integer dtype. pandas reads them as object.

import pandas as pd
df = pd.read_csv('file.csv')

df

   Col1 Col2
0     1   22
1     2   44
2     3    x
3     4   88
4     5  110
5     6  132
6     7    x
7     8  176
8     9  198
9    10    x

df.dtypes

Col1     int64
Col2    object
dtype: object

In order to get the mean of Col2, it needs to be converted to a numeric value.

df['Col2'] = pd.to_numeric(df['Col2'], errors='coerce').astype('Int64')

df.dtypes
Col1    int64
Col2    Int64
dtype: object

The df now looks like so:

df 

   Col1  Col2
0     1    22
1     2    44
2     3  <NA>
3     4    88
4     5   110
5     6   132
6     7  <NA>
7     8   176
8     9   198
9    10  <NA>

Now we can use fillna() with df['Col2'].mean():

df['Col2'] = df['Col2'].fillna(df['Col2'].mean())

df
   Col1  Col2
0     1    22
1     2    44
2     3   110
3     4    88
4     5   110
5     6   132
6     7   110
7     8   176
8     9   198
9    10   110
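
The Int64 cast works here because the mean of the valid values happens to be a whole number (110). When the mean has a fractional part, filling a nullable Int64 column with it may raise an error, so it is probably safer to leave the column as float. A minimal sketch under that assumption, using made-up values and an in-memory frame as a stand-in for file.csv:

```python
import pandas as pd

# Hypothetical stand-in for file.csv; 'x' marks a missing entry.
df = pd.DataFrame({
    "Col1": [1, 2, 3, 4],
    "Col2": ["10", "x", "15", "x"],
})

# Coerce to numeric; 'x' becomes NaN and the column becomes float64.
df["Col2"] = pd.to_numeric(df["Col2"], errors="coerce")

# The mean of the valid values is 12.5, which a float column can hold.
df["Col2"] = df["Col2"].fillna(df["Col2"].mean())

print(df)
```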

Replace NA in DataFrame for multiple columns with mean per country

Note: This example is based on the additional data source you linked in the comments.

To replace the NA values in multiple columns with mean(), you can combine the following three methods:

fillna() (fills along each column, i.e. axis 0, which is the default behavior)
groupby()
transform()

Create the data frame from your example:

df = pd.read_excel('https://happiness-report.s3.amazonaws.com/2021/DataPanelWHR2021C2.xls')

Country name	year	Life Ladder	Log GDP per capita	Social support	Healthy life expectancy at birth	Freedom to make life choices	Generosity	Perceptions of corruption	Positive affect	Negative affect
Canada	2005	7.41805	10.6518	0.961552	71.3	0.957306	0.25623	0.502681	0.838544	0.233278
Canada	2007	7.48175	10.7392	nan	71.66	0.930341	0.249479	0.405608	0.871604	0.25681
Canada	2008	7.4856	10.7384	0.938707	71.84	0.926315	0.261585	0.369588	0.89022	0.202175
Canada	2009	7.48782	10.6972	0.942845	72.02	0.915058	0.246217	0.412622	0.867433	0.247633
Canada	2010	7.65035	10.7165	0.953765	72.2	0.933949	0.230451	0.41266	0.878868	0.233113
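
The three methods listed above can be combined roughly as follows. This is a sketch on a small made-up sample rather than the full spreadsheet: only the column names follow the data set, and the values (including the Norway rows) are invented for illustration.

```python
import numpy as np
import pandas as pd

# Small made-up sample; only the column names follow the data set above.
df = pd.DataFrame({
    "Country name": ["Canada", "Canada", "Canada", "Norway", "Norway"],
    "year": [2005, 2007, 2008, 2006, 2008],
    "Social support": [0.96, np.nan, 0.94, 0.95, np.nan],
    "Generosity": [0.25, 0.25, np.nan, np.nan, 0.10],
})

cols = ["Social support", "Generosity"]

# transform("mean") returns a frame aligned with df that holds, for each row,
# the mean of its country; fillna() then uses it only where a value is missing.
country_means = df.groupby("Country name")[cols].transform("mean")
df[cols] = df[cols].fillna(country_means)

print(df)
```

Using transform rather than an aggregation keeps the result the same shape as the original frame, which is what lets fillna line the means up row by row.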

How do I replace all NA with mean in R?

We can use na.aggregate from zoo. Loop through the columns of the dataset (assuming all columns are numeric), apply na.aggregate to replace each NA with the column mean (the default), and assign the result back to the dataset.

library(zoo)
df[] <- lapply(df, na.aggregate)

By default, the FUN argument of na.aggregate is mean:

## Default S3 method:
na.aggregate(object, by = 1, ..., FUN = mean,
             na.rm = FALSE, maxgap = Inf)

To do this nondestructively:

df2 <- df
df2[] <- lapply(df2, na.aggregate)

or in one line:

df2 <- replace(df, TRUE, lapply(df, na.aggregate))

If there are non-numeric columns, do this only for the numeric columns by creating a logical index first:

ok <- sapply(df, is.numeric)
df[ok] <- lapply(df[ok], na.aggregate)

R: Replace all values in column (NA and values) with mean of values

We can subset the non-NA elements when computing the mean, then assign that mean to every row in the group:

library(dplyr)
df %>%
     group_by(Day, Plate) %>%
     mutate(Value = mean(Value[!is.na(Value)]))

Or use the na.rm argument of mean:

df %>%
     group_by(Day, Plate) %>%
     mutate(Value = mean(Value, na.rm = TRUE))
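
For readers following the pandas examples above, a rough analogue of this grouped overwrite might look like the sketch below. The data is hypothetical; the column names Day, Plate, and Value are borrowed from the R snippet.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with gaps in Value.
df = pd.DataFrame({
    "Day": [1, 1, 1, 2, 2],
    "Plate": ["a", "a", "a", "a", "a"],
    "Value": [2.0, np.nan, 4.0, 10.0, np.nan],
})

# The group mean skips NaN by default, so every row in a group
# (missing or not) is overwritten with that group's mean.
df["Value"] = df.groupby(["Day", "Plate"])["Value"].transform("mean")

print(df)
```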
