Replace NA with 0 in a Data Frame Column

How to replace NaN values with zeros in a column of a pandas DataFrame?

I believe DataFrame.fillna() will do this for you.

See the docs for DataFrame.fillna and for Series.fillna.

Example:

In [7]: df
Out[7]:
          0         1
0       NaN       NaN
1 -0.494375  0.570994
2       NaN       NaN
3  1.876360 -0.229738
4       NaN       NaN

In [8]: df.fillna(0)
Out[8]:
          0         1
0  0.000000  0.000000
1 -0.494375  0.570994
2  0.000000  0.000000
3  1.876360 -0.229738
4  0.000000  0.000000

To fill the NaNs in only one column, select just that column. In this case I'm using inplace=True to actually change the contents of df. Note that a call with inplace=True returns None, so nothing is echoed, and that recent pandas versions warn about this chained in-place call; with copy-on-write enabled it no longer modifies df.

In [12]: df[1].fillna(0, inplace=True)

In [13]: df
Out[13]:
          0         1
0       NaN  0.000000
1 -0.494375  0.570994
2       NaN  0.000000
3  1.876360 -0.229738
4       NaN  0.000000

EDIT:

To avoid a SettingWithCopyWarning, use the built-in column-specific functionality:

df.fillna({1:0}, inplace=True)
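For completeness, here is a minimal self-contained sketch of the column-specific fill. The frame and its values are made up for illustration and only mirror the shape of the example above (integer column labels 0 and 1).

```python
import numpy as np
import pandas as pd

# Hypothetical frame with integer column labels 0 and 1.
df = pd.DataFrame({0: [np.nan, -0.5, np.nan], 1: [np.nan, 0.6, np.nan]})

# Fill NaN only in column 1; the dict maps column label -> fill value.
filled = df.fillna({1: 0})

# Equivalent alternative: fill the column and assign it back explicitly.
df[1] = df[1].fillna(0)
```

Both forms leave column 0 untouched. Assigning back avoids relying on inplace=True at all, which may be the safer habit across pandas versions.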

Replace NA with 0 in a data frame column

Since nobody so far felt fit to point out why what you're trying doesn't work:

  1. NA == NA doesn't return TRUE, it returns NA (since comparing to undefined values should yield an undefined result).
  2. You're trying to call apply on an atomic vector. You can't use apply to loop over the elements in a column.
  3. Your subscripts are off - you're trying to give two indices into a$x, which is just the column (an atomic vector).

I'd fix up 3. to get to a$x[is.na(a$x)] <- 0
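If you are working in pandas rather than R, a similar pitfall applies to comparing against missing values. The sketch below is only an analogue of the R fix above, using a made-up Series; it is not part of the original question.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# NaN never compares equal to itself, so an equality mask finds nothing.
eq_mask = s == np.nan

# pd.NA propagates through comparisons, much like NA in R.
na_result = pd.NA == pd.NA

# isna() is the reliable test; use it to build the mask, then assign.
s[s.isna()] = 0
```

Here eq_mask is all False and na_result is pd.NA, so in both languages the missing-value test has to go through is.na / isna() rather than ==.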

Python pandas: replace zeros with NaN in multiple columns

I think you need replace with a dict:

import numpy as np

cols = ["Weight","Height","BootSize","SuitSize","Type"]
# map both the string '0' and the number 0 to NaN in the selected columns
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})
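A small runnable sketch of the same idea follows. The data is invented, and I've assumed only two of the columns need cleaning so that the untouched column shows the replacement is restricted to cols.

```python
import numpy as np
import pandas as pd

# Hypothetical data: zeros appear both as the number 0 and as the string '0'.
df2 = pd.DataFrame({
    "Weight": [70, 0, 82],
    "Height": ["180", "0", "175"],
    "Type": ["a", "b", "c"],
})

cols = ["Weight", "Height"]
# Map both representations of zero to NaN, only in the chosen columns.
df2[cols] = df2[cols].replace({"0": np.nan, 0: np.nan})
```

Listing both '0' and 0 as keys covers columns that were read in as text as well as numeric ones.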

How do I replace NA values with zeros in an R dataframe?

See my comment on @gsk3's answer. A simple example:

> m <- matrix(sample(c(NA, 1:10), 100, replace = TRUE), 10)
> d <- as.data.frame(m)
> d
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   4  3 NA  3  7  6  6 10  6   5
2   9  8  9  5 10 NA  2  1  7   2
3   1  1  6  3  6 NA  1  4  1   6
4  NA  4 NA  7 10  2 NA  4  1   8
5   1  2  4 NA  2  6  2  6  7   4
6  NA  3 NA NA 10  2  1 10  8   4
7   4  4  9 10  9  8  9  4 10  NA
8   5  8  3  2  1  4  5  9  4   7
9   3  9 10  1  9  9 10  5  3   3
10  4  2  2  5 NA  9  7  2  5   5

> d[is.na(d)] <- 0

> d
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   4  3  0  3  7  6  6 10  6   5
2   9  8  9  5 10  0  2  1  7   2
3   1  1  6  3  6  0  1  4  1   6
4   0  4  0  7 10  2  0  4  1   8
5   1  2  4  0  2  6  2  6  7   4
6   0  3  0  0 10  2  1 10  8   4
7   4  4  9 10  9  8  9  4 10   0
8   5  8  3  2  1  4  5  9  4   7
9   3  9 10  1  9  9 10  5  3   3
10  4  2  2  5  0  9  7  2  5   5

There's no need to apply apply. =)

EDIT

You should also take a look at the norm package. It has a lot of nice features for missing data analysis. =)

Replace NA in DataFrame for multiple columns with mean per country

Note: the example is based on the additional data source you gave in the comments.

To replace the NA values in multiple columns with the mean, you can combine the following three methods:

  • fillna() (to fill column by column, axis should be 0, which is the default for fillna())
  • groupby()
  • transform()


Create data frame from your example:

df = pd.read_excel('https://happiness-report.s3.amazonaws.com/2021/DataPanelWHR2021C2.xls')

Country name | year | Life Ladder | Log GDP per capita | Social support | Healthy life expectancy at birth | Freedom to make life choices | Generosity | Perceptions of corruption | Positive affect | Negative affect
Canada | 2005 | 7.41805 | 10.6518 | 0.961552 | 71.3 | 0.957306 | 0.25623 | 0.502681 | 0.838544 | 0.233278
Canada | 2007 | 7.48175 | 10.7392 | nan | 71.66 | 0.930341 | 0.249479 | 0.405608 | 0.871604 | 0.25681
Canada | 2008 | 7.4856 | 10.7384 | 0.938707 | 71.84 | 0.926315 | 0.261585 | 0.369588 | 0.89022 | 0.202175
Canada | 2009 | 7.48782 | 10.6972 | 0.942845 | 72.02 | 0.915058 | 0.246217 | 0.412622 | 0.867433 | 0.247633
Canada | 2010 | 7.65035 | 10.7165 | 0.953765 | 72.2 | 0.933949 | 0.230451 | 0.41266 | 0.878868 | 0.233113
How to replace all NA values in numerical columns only with median values and update the dataframe

Based on your screenshots, it looks like you're just going back to the RStudio viewer window to look at the data frame again. If so, the issue is this:

When you write test2 %>% mutate_if(...), you're telling R to change something in test2 and return the result (roughly meaning, in this context, to just print the result and show it to you). What you're not telling it to do is to save that result anywhere.

You would want something like test2 <- test2 %>% mutate_if(...) to overwrite the existing test2 data frame in your global environment, or something like test3 <- test2 %>% mutate_if(...) to give it a new name and store the modified thing as a separate object while retaining the old one.

Lastly, I would echo Andrea M's concern that you might not want to do this at all. Imputing missing data with averages is, on a good day, risky.
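The same "result computed but never stored" trap exists in pandas, in case that is useful for comparison. This is a rough analogue with a made-up frame (the name test2 just echoes the question), not a translation of your mutate_if call.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one numeric and one text column.
test2 = pd.DataFrame({"a": [1.0, np.nan, 3.0], "label": ["x", None, "z"]})

num_cols = test2.select_dtypes(include="number").columns

# This computes a filled copy but discards it: test2 is unchanged.
test2[num_cols].fillna(test2[num_cols].median())
still_missing = int(test2["a"].isna().sum())

# Assigning the result back is what actually updates the frame.
test2[num_cols] = test2[num_cols].fillna(test2[num_cols].median())
```

Only the numeric column is imputed; the missing value in the text column stays missing.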

Replacing NA with 0 in columns that contain a substring in the column name

Update

If it is to select columns having 'keyword' as substring in the column names, use contains to select across those columns

library(dplyr)
library(tidyr)
df1 <- df1 %>%
  mutate(across(contains('keyword'), replace_na, 0))

-output

df1
# A tibble: 5 × 4
   col1 col2_keyword col3   col4
  <int> <chr>        <chr> <dbl>
1     1 a            a         1
2     2 b            b         3
3     3 0            c        NA
4     4 c            d         5
5     5 d            <NA>      6

If the OP instead meant to replace NA only in columns that contain the specific value 'keyword', use where with a logical expression to select those columns, loop across them, and use replace_na to replace NA with 0

df <- df %>%
  mutate(across(where(~ is.character(.x) && 'keyword' %in% .x), replace_na, 0))

-output

df
# A tibble: 5 × 4
   col1 col2    col3   col4
  <int> <chr>   <chr> <dbl>
1     1 a       a         1
2     2 b       b         3
3     3 keyword c        NA
4     4 0       d         5
5     5 c       <NA>      6

data

df <- tibble(col1 = 1:5, col2 = c("a", "b", "keyword", NA, 'c'), 
col3 = c('a', 'b', 'c', 'd', NA), col4 = c(1, 3, NA, 5, 6))
df1 <- tibble(col1 = 1:5, col2_keyword = c("a", "b", NA, 'c', 'd'),
col3 =c('a', 'b', 'c', 'd', NA), col4 = c(1, 3, NA, 5, 6))

Replace NA values in data frame with the column mean

library(tidyverse)
df1 <- tibble(x = seq(3), y = c(1, NA, 2))
df1 %>% mutate(y = y %>% replace_na(mean(df1$y, na.rm = TRUE)))
#> # A tibble: 3 × 2
#>       x     y
#>   <int> <dbl>
#> 1     1   1
#> 2     2   1.5
#> 3     3   2

Created on 2022-03-10 by the reprex package (v2.0.0)


