R: Replacing NA values by mean of hour with dplyr

Try:

library(dplyr)

shop.data %>%
  group_by(hour) %>%
  mutate(profit = ifelse(is.na(profit), mean(profit, na.rm = TRUE), profit))

# day hour profit
#1 1 8 100
#2 1 16 200
#3 2 8 50
#4 2 16 60
#5 3 8 75
#6 3 16 130

Or you could use replace():

shop.data %>%
  group_by(hour) %>%
  mutate(profit = replace(profit, is.na(profit), mean(profit, na.rm = TRUE)))
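To check the grouped-mean approach on something small, here is a minimal sketch. The data frame shop_demo and its values are made up for illustration and are not the data from the question.

```r
library(dplyr)

# Hypothetical data: one NA in each hour group
shop_demo <- data.frame(
  day    = c(1, 1, 2, 2, 3, 3),
  hour   = c(8, 16, 8, 16, 8, 16),
  profit = c(100, 200, 50, NA, NA, 130)
)

filled <- shop_demo %>%
  group_by(hour) %>%
  # Within each hour, NAs get the mean of the non-missing profits
  mutate(profit = ifelse(is.na(profit), mean(profit, na.rm = TRUE), profit)) %>%
  ungroup()

filled$profit
```

The NA at hour 16 becomes 165 (the mean of 200 and 130) and the NA at hour 8 becomes 75 (the mean of 100 and 50).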

How to replace NA data of specific dates with the average data of different years of same dates of a dataframe in R?

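This might work. It groups the rows by month, day, hour, and minute, so each group holds the same date and time across the different years, and then replaces each NA with the mean of its group.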

library(dplyr)
A <- A %>%
  group_by(month, day, hour, minute) %>%
  mutate(rain = ifelse(is.na(rain), mean(rain, na.rm = TRUE), rain))
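One thing to watch with such small groups: if every year is missing for a given date and time, mean(..., na.rm = TRUE) has nothing to average and returns NaN. A minimal sketch on made-up data (A_demo and its values are hypothetical):

```r
library(dplyr)

# Hypothetical readings for two timestamps across three years
A_demo <- data.frame(
  year   = c(2015, 2016, 2017, 2015, 2016, 2017),
  month  = c(1, 1, 1, 2, 2, 2),
  day    = c(1, 1, 1, 1, 1, 1),
  hour   = c(0, 0, 0, 0, 0, 0),
  minute = c(0, 0, 0, 0, 0, 0),
  rain   = c(2, NA, 4, NA, NA, NA)
)

A_filled <- A_demo %>%
  group_by(month, day, hour, minute) %>%
  mutate(rain = ifelse(is.na(rain), mean(rain, na.rm = TRUE), rain)) %>%
  ungroup()

# The first group is filled with its mean; the all-NA group ends up NaN
A_filled$rain
```

If that case can occur in your data, you may want to fall back to a coarser grouping (for example month only) for the rows that are still NaN.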

Replace missing values with corresponding day mean

A solution using dplyr. We can use mutate with ifelse to replace the missing values with the mean. The key is to use group_by on Day so that the mean is calculated within each day only.

library(dplyr)

dt2 <- dt %>%
  group_by(Day) %>%
  mutate(Sales = ifelse(is.na(Sales), mean(Sales, na.rm = TRUE), Sales)) %>%
  ungroup()
dt2
# # A tibble: 9 x 2
# Day Sales
# <fctr> <dbl>
# 1 12-01-17 28.0
# 2 13-01-17 13.0
# 3 14-01-17 2.0
# 4 12-01-17 33.0
# 5 13-01-17 17.0
# 6 14-01-17 11.0
# 7 12-01-17 23.0
# 8 13-01-17 21.0
# 9 14-01-17 6.5

DATA

dt <- read.table(text = "     Day Sales
12-01-17 NA
13-01-17 13
14-01-17 2
12-01-17 33
13-01-17 NA
14-01-17 11
12-01-17 23
13-01-17 21
14-01-17 NA",
header = TRUE)

Replace NA with mean of variable grouped by time and treatment

I think I would just use indexing in base R for this:

within(df, {
  A1[is.na(A1) & time == 0] <- mean(A1[trt == "2" & time == 0])
  B1[is.na(B1) & time == 0] <- mean(B1[trt == "2" & time == 0])
})
#> # A tibble: 24 x 4
#> time trt A1 B1
#> <dbl> <fct> <dbl> <dbl>
#> 1 0 2 6.30 5.73
#> 2 0 2 5.43 5.73
#> 3 0 2 5.60 5.45
#> 4 0 1 5.78 5.63
#> 5 0 1 5.78 5.63
#> 6 0 1 5.78 5.63
#> 7 14 2 6.17 6.60
#> 8 14 2 6.43 7.03
#> 9 14 2 6.82 7.12
#> 10 14 1 2.30 3.03
#> # ... with 14 more rows

Created on 2020-05-15 by the reprex package (v0.3.0)
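The same indexing idea on a tiny example, in case the original data is not at hand. This is only a sketch: df_demo and its values are made up, and it has a single measurement column.

```r
# Hypothetical data: trt 1 has no measurements at time 0
df_demo <- data.frame(
  time = c(0, 0, 0, 0, 14, 14),
  trt  = factor(c("2", "2", "1", "1", "2", "1")),
  A1   = c(6, 5, NA, NA, 7, NA)
)

res <- within(df_demo, {
  # Fill NAs at time 0 with the time-0 mean of treatment "2"
  A1[is.na(A1) & time == 0] <- mean(A1[trt == "2" & time == 0])
})

res$A1
```

Only the NAs at time 0 are filled (with 5.5 here); the NA at time 14 is left alone. Note that mean() is called without na.rm, so this relies on treatment "2" having no missing values at time 0.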

Replace missing value with average of that month

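# Mean of Occupancy. per month/year and ID; sub() strips the day from Date
a <- with(dat, ave(Occupancy., sub(".*?\\/", "", Date), ID,
                   FUN = function(x) mean(x, na.rm = TRUE)))
# Keep observed values and use the group mean where Occupancy. is NA
transform(dat, b = ifelse(is.na(Occupancy.), a, Occupancy.))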
Date ID Occupancy. b
1 1/2/2018 1 95 95.00000
2 2/2/2018 1 94 94.00000
3 3/2/2018 1 94 94.00000
4 4/2/2018 1 96 96.00000
5 5/2/2018 1 94 94.00000
6 6/2/2018 1 NA 94.71429
7 7/2/2018 1 96 96.00000
8 8/2/2018 1 94 94.00000
9 1/2/2018 2 75 75.00000
10 2/2/2018 2 NA 78.33333
11 3/2/2018 2 79 79.00000
12 4/2/2018 2 82 82.00000
13 5/2/2018 2 NA 78.33333
14 6/2/2018 2 76 76.00000
15 7/2/2018 2 78 78.00000
16 8/2/2018 2 80 80.00000
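To see the month grouping at work, here is a small sketch with two months, assuming dates in day/month/year format. The data frame dat_demo and its values are hypothetical.

```r
# Hypothetical data in day/month/year format
dat_demo <- data.frame(
  Date       = c("1/2/2018", "2/2/2018", "1/3/2018", "2/3/2018"),
  ID         = c(1, 1, 1, 1),
  Occupancy. = c(90, NA, 70, NA)
)

# Strip everything up to the first "/" to get a month/year key
month_key <- sub(".*?\\/", "", dat_demo$Date)

# Mean per month and ID, ignoring NAs, repeated for every row of the group
grp_mean <- ave(dat_demo$Occupancy., month_key, dat_demo$ID,
                FUN = function(x) mean(x, na.rm = TRUE))

# Keep observed values, fill NAs with the group mean
dat_demo$b <- ifelse(is.na(dat_demo$Occupancy.), grp_mean, dat_demo$Occupancy.)
dat_demo
```

Each NA is filled from its own month: the February gap gets 90 and the March gap gets 70.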

How to replace NA with zero in a column if the adjacent columns have values, using R?

You could also do as follows:

library(dplyr)

mutate(df, X = if_else(is.na(hours) | is.na(interactions), 0, hours))

# hours interactions sales X
# 1 NA NA 1 0
# 2 3 3 1 3
# 3 NA 9 1 0
# 4 8 9 NA 8
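For a runnable version, the input can be rebuilt from the printed result above by dropping column X. The column types are an assumption (all numeric here); if_else is strict about types, so an integer hours column would need 0L instead of 0.

```r
library(dplyr)

# Input reconstructed from the printed result (column X removed)
df <- data.frame(
  hours        = c(NA, 3, NA, 8),
  interactions = c(NA, 3, 9, 9),
  sales        = c(1, 1, 1, NA)
)

# X is 0 when hours or interactions is missing, otherwise it copies hours
res <- mutate(df, X = if_else(is.na(hours) | is.na(interactions), 0, hours))
res$X
```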

Complete missing hour in dataframe with NA using dplyr in R

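We can use complete from tidyr. It adds a row for every hour from 0 to 23 for each datex, and the other columns are NA in the rows that were missing: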

library(dplyr)
library(tidyr)
mydata %>%
  complete(datex, hourx = 0:23)
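A minimal sketch of what that does, on made-up data. The data frame mydata_demo and its value column are hypothetical; only datex and hourx come from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: one date with readings at only three hours
mydata_demo <- data.frame(
  datex = as.Date(c("2020-01-01", "2020-01-01", "2020-01-01")),
  hourx = c(0L, 5L, 23L),
  value = c(1.5, 2.0, 0.7)
)

completed <- mydata_demo %>%
  complete(datex, hourx = 0:23)

nrow(completed)              # one row per hour of the day
sum(is.na(completed$value))  # hours that had no reading
```

The result has 24 rows for the single date, 21 of them with NA in value.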

R Replacing NA values with the next value of another column value within groups

Here is a possible dplyr solution. It combines ifelse and lead. Because ifelse drops the POSIXct class and returns plain numbers, the result has to be converted back with as.POSIXct.

library(dplyr)
tmpdf %>%
  group_by(spaceNum) %>%
  mutate(time.OUT = as.POSIXct(ifelse(is.na(time.OUT), lead(time.IN), time.OUT),
                               origin = "1970-01-01"))
# Source: local data frame [7 x 3]
# Groups: spaceNum
#
# spaceNum time.IN time.OUT
# 1 1 2015-09-04 16:30:00 2015-09-04 18:00:00
# 2 1 2015-09-04 19:50:00 2015-09-04 21:00:00
# 3 1 2015-09-04 21:00:00 <NA>
# 4 2 2015-09-05 12:00:00 2015-09-05 13:00:00
# 5 2 2015-09-05 13:00:00 2015-09-05 13:21:00
# 6 2 2015-09-05 16:00:00 2015-09-05 16:48:00
# 7 2 2015-09-05 17:00:00 <NA>
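As an alternative sketch, dplyr's if_else keeps the POSIXct class, so the conversion back is not needed. This is a swapped-in variant, not the code from the answer, and tmp_demo is made-up data using UTC for simplicity.

```r
library(dplyr)

# Hypothetical parking data with a missing exit time
tmp_demo <- data.frame(
  spaceNum = c(1, 1),
  time.IN  = as.POSIXct(c("2015-09-04 16:30:00", "2015-09-04 19:50:00"), tz = "UTC"),
  time.OUT = as.POSIXct(c(NA, "2015-09-04 21:00:00"), tz = "UTC")
)

# Base ifelse() returns plain numbers (seconds since 1970-01-01)
base_class <- class(ifelse(is.na(tmp_demo$time.OUT), tmp_demo$time.IN, tmp_demo$time.OUT))

# if_else() preserves the date-time class of its inputs
res <- tmp_demo %>%
  group_by(spaceNum) %>%
  mutate(time.OUT = if_else(is.na(time.OUT), lead(time.IN), time.OUT)) %>%
  ungroup()

base_class
class(res$time.OUT)
```

As in the answer above, the last row of each group has no next time.IN, so a missing time.OUT there stays NA.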

Impute missing values with the average of the remainder

You can use ave for such operations.

dat$Weight <-
  ave(dat$Weight, dat$Hour, FUN = function(x) {
    mm <- mean(x, na.rm = TRUE)
    ifelse(is.na(x), mm, x)
  })
  • You apply a function to each group of hours.
  • For each group, you compute the mean without the missing values.
  • You assign the mean where the value is missing; otherwise you keep the original value.
  • You replace the Weight vector with the newly created vector.
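The steps above can be traced on a few rows. This is just a sketch with made-up numbers; dat_demo is hypothetical.

```r
# Hypothetical data: one missing weight in each hour group
dat_demo <- data.frame(
  Hour   = c(1, 1, 1, 2, 2),
  Weight = c(10, NA, 20, 4, NA)
)

dat_demo$Weight <- ave(dat_demo$Weight, dat_demo$Hour, FUN = function(x) {
  mm <- mean(x, na.rm = TRUE)   # group mean without the NAs
  ifelse(is.na(x), mm, x)       # fill only the missing entries
})

dat_demo$Weight
```

Hour 1 has the mean 15 and hour 2 the mean 4, so the two NAs become 15 and 4 while the observed weights are unchanged.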

