Replace Na with 0 in a Data Frame Column

How to replace NaN values with zeros in a column of a pandas DataFrame?

I believe DataFrame.fillna() will do this for you.

See the pandas docs for DataFrame.fillna and for Series.fillna.

Example:

In [7]: df
Out[7]: 
          0         1
0       NaN       NaN
1 -0.494375  0.570994
2       NaN       NaN
3  1.876360 -0.229738
4       NaN       NaN

In [8]: df.fillna(0)
Out[8]: 
          0         1
0  0.000000  0.000000
1 -0.494375  0.570994
2  0.000000  0.000000
3  1.876360 -0.229738
4  0.000000  0.000000

To fill the NaNs in only one column, select just that column. In this case I'm using inplace=True to actually change the contents of df.

In [12]: df[1].fillna(0, inplace=True)
Out[12]: 
0    0.000000
1    0.570994
2    0.000000
3   -0.229738
4    0.000000
Name: 1

In [13]: df
Out[13]: 
          0         1
0       NaN  0.000000
1 -0.494375  0.570994
2       NaN  0.000000
3  1.876360 -0.229738
4       NaN  0.000000

EDIT:

To avoid a SettingWithCopyWarning, use fillna's built-in column-specific functionality by passing a dict that maps column labels to fill values:

df.fillna({1:0}, inplace=True)
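For reference, here is a minimal sketch of the dict-based fill without inplace=True. The frame and its values are made up for illustration; only the column labels 0 and 1 mirror the example above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame shaped like the example above (values are made up).
df = pd.DataFrame({0: [np.nan, -0.5, np.nan], 1: [np.nan, 0.6, np.nan]})

# Fill NaN only in column 1; column 0 is left untouched.
filled = df.fillna({1: 0})

# Equivalent alternative: assign the filled Series back to the column.
df[1] = df[1].fillna(0)
```

Assigning the result back, rather than calling inplace=True on a selected column, sidesteps the question of whether the selection is a view or a copy.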

Replace NA with 0 in a data frame column

Since nobody so far has seen fit to point out why what you're trying doesn't work:

1. NA == NA doesn't return TRUE, it returns NA (since comparing to undefined values should yield an undefined result).
2. You're trying to call apply on an atomic vector. You can't use apply to loop over the elements in a column.
3. Your subscripts are off - you're trying to give two indices into a$x, which is just the column (an atomic vector).

I'd fix up 3. to get to a$x[is.na(a$x)] <- 0

Python Pandas replace multiple columns zero to Nan

I think you need replace with a dict:

import numpy as np

cols = ["Weight","Height","BootSize","SuitSize","Type"]
# map both the string '0' and the number 0 to NaN in the selected columns
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})
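A small self-contained sketch of the same idea, in case it helps to see it run. The data and the reduced column list are assumptions made up for illustration; the original df2 is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: zeros appear both as numbers and as the string '0'.
df2 = pd.DataFrame({
    "Weight": [70, 0, 82],
    "Height": ["180", "0", "175"],
    "Name": ["a", "0", "c"],  # not in cols, so it is left unchanged
})

cols = ["Weight", "Height"]
# Replace numeric 0 and string '0' with NaN, only in the listed columns.
df2[cols] = df2[cols].replace({"0": np.nan, 0: np.nan})
```

The dict has two keys because 0 and '0' are different values; a column read from text may contain either.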

How do I replace NA values with zeros in an R dataframe?

See my comment on @gsk3's answer. A simple example:

> m <- matrix(sample(c(NA, 1:10), 100, replace = TRUE), 10)
> d <- as.data.frame(m)
> d
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   4  3 NA  3  7  6  6 10  6   5
2   9  8  9  5 10 NA  2  1  7   2
3   1  1  6  3  6 NA  1  4  1   6
4  NA  4 NA  7 10  2 NA  4  1   8
5   1  2  4 NA  2  6  2  6  7   4
6  NA  3 NA NA 10  2  1 10  8   4
7   4  4  9 10  9  8  9  4 10  NA
8   5  8  3  2  1  4  5  9  4   7
9   3  9 10  1  9  9 10  5  3   3
10  4  2  2  5 NA  9  7  2  5   5

> d[is.na(d)] <- 0

> d
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   4  3  0  3  7  6  6 10  6   5
2   9  8  9  5 10  0  2  1  7   2
3   1  1  6  3  6  0  1  4  1   6
4   0  4  0  7 10  2  0  4  1   8
5   1  2  4  0  2  6  2  6  7   4
6   0  3  0  0 10  2  1 10  8   4
7   4  4  9 10  9  8  9  4 10   0
8   5  8  3  2  1  4  5  9  4   7
9   3  9 10  1  9  9 10  5  3   3
10  4  2  2  5  0  9  7  2  5   5

There's no need to apply apply. =)

EDIT

You should also take a look at the norm package. It has a lot of nice features for missing data analysis. =)

Replace NA in DataFrame for multiple columns with mean per country

Note: This example is based on the additional data source from your comments.

To replace the NA values in multiple columns with mean(), you can combine the following three methods:

fillna() (fills column by column; axis should be 0, which is the default for fillna())
groupby()
transform()

Create data frame from your example:

df = pd.read_excel('https://happiness-report.s3.amazonaws.com/2021/DataPanelWHR2021C2.xls')

Country name	year	Life Ladder	Log GDP per capita	Social support	Healthy life expectancy at birth	Freedom to make life choices	Generosity	Perceptions of corruption	Positive affect	Negative affect
Canada	2005	7.41805	10.6518	0.961552	71.3	0.957306	0.25623	0.502681	0.838544	0.233278
Canada	2007	7.48175	10.7392	nan	71.66	0.930341	0.249479	0.405608	0.871604	0.25681
Canada	2008	7.4856	10.7384	0.938707	71.84	0.926315	0.261585	0.369588	0.89022	0.202175
Canada	2009	7.48782	10.6972	0.942845	72.02	0.915058	0.246217	0.412622	0.867433	0.247633
Canada	2010	7.65035	10.7165	0.953765	72.2	0.933949	0.230451	0.41266	0.878868	0.233113
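One way the three methods listed above might be combined is sketched below. This is only an illustration under assumptions: it uses a tiny hand-made frame with made-up values instead of the Excel file, and value_cols and country_means are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the data above (values are made up).
df = pd.DataFrame({
    "Country name": ["Canada", "Canada", "Canada", "Chile", "Chile"],
    "year": [2005, 2007, 2008, 2006, 2007],
    "Social support": [0.96, np.nan, 0.94, 0.80, np.nan],
    "Generosity": [0.25, 0.25, np.nan, np.nan, 0.10],
})

value_cols = ["Social support", "Generosity"]

# transform("mean") returns per-country means aligned to the original rows,
# so fillna() can use them as the replacement for each NaN.
country_means = df.groupby("Country name")[value_cols].transform("mean")
df[value_cols] = df[value_cols].fillna(country_means)
```

Because transform() keeps the original index, each missing value is filled with the mean of its own country rather than the overall column mean.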

How to replace all NA values in numerical columns only with median values and update the dataframe

Based on your screenshots, it looks like you're just going back to the RStudio viewer window to look at the data frame again. If so, the issue is this:

When you write test2 %>% mutate_if(...), you're telling R to change something in test2 and return the result (roughly meaning, in this context, to just print the result and show it to you). What you're not telling it to do is to save that result anywhere.

You would want something like test2 <- test2 %>% mutate_if(...) to overwrite the existing test2 data frame in your global environment, or something like test3 <- test2 %>% mutate_if(...) to give it a new name and store the modified thing as a separate object while retaining the old one.

Lastly, I would echo Andrea M's concern that you might not want to do this at all. Imputing missing data with averages is, on a good day, risky.

Replacing NA with 0 in columns that contain a substring in the column name

Update

If the goal is to select columns that have 'keyword' as a substring of the column name, use contains inside across to select those columns:

library(dplyr)
library(tidyr)
df1 <- df1 %>%
         # lambda form; passing extra arguments through across() is deprecated in dplyr >= 1.1.0
         mutate(across(contains('keyword'), ~ replace_na(.x, 0)))

-output

df1
# A tibble: 5 × 4
   col1 col2_keyword col3   col4
  <int> <chr>        <chr> <dbl>
1     1 a            a         1
2     2 b            b         3
3     3 0            c        NA
4     4 c            d         5
5     5 d            <NA>      6

Assuming that the OP meant to replace NA only in columns that contain the specific element 'keyword' among their values, use where with a logical expression to select those columns, loop across them, and use replace_na to replace NA with 0:

df <- df %>%
    # select character columns whose values include 'keyword', then fill their NAs
    mutate(across(where(~ is.character(.x) && 'keyword' %in% .x), ~ replace_na(.x, 0)))