Forward and Backward Fill Data Frame in R

Forward and backward fill data frame in R

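We can do this with na.locf() from the zoo package: the inner call fills forward, and the outer call with fromLast = TRUE fills backward.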

library(zoo)
na.locf(na.locf(df1), fromLast = TRUE)
# Col1 Col2
#1 20 10
#2 25 10
#3 15 10
#4 15 10
#5 15 15

Filling missing values (NAs) backward and forward using another column as support

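You could run two accumulate2() passes within each group: a forward pass that grows the previous value by gr, then a backward pass over the reversed vectors that fills any leading gaps: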

library(tidyverse)
input %>%
  group_by(group) %>%
  mutate(v1 = unlist(accumulate2(value, tail(gr, -1), ~if(is.na(..2)) ..1*(1+..3) else ..2)),
         v1 = rev(unlist(accumulate2(rev(v1), head(rev(gr), -1), ~if(is.na(..2)) ..1/(1+..3) else ..2))))
# A tibble: 15 x 4
# Groups: group [3]
#    group value    gr    v1
#    <chr> <dbl> <dbl> <dbl>
#  1 A        10  0.1  10
#  2 A        15  0.05 15
#  3 A        17  0.03 17
#  4 A        NA  0.02 17.3
#  5 A        NA  0.05 18.2
#  6 B        NA  0.04  7.35
#  7 B        NA  0.02  7.5
#  8 B        12  0.6  12
#  9 B        16  0.03 16
# 10 B        13  0.4  13
# 11 C        12  0.01 12
# 12 C        NA  0.09 13.1
# 13 C        15  0.05 15
# 14 C        NA -0.03 14.6
# 15 C        19  0.04 19
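
For readers who find the two accumulate2() passes hard to follow, here is a minimal sketch of the same two-pass idea in plain Python. The function name fill_with_growth is hypothetical, and None stands in for NA. It assumes, as the output above suggests, that a gap is filled forward as previous * (1 + gr) and that a leading gap is filled backward as next / (1 + gr of the next row).

```python
def fill_with_growth(values, rates):
    """Fill None gaps using growth rates (hypothetical helper, not from the answer)."""
    out = list(values)
    # Forward pass: a missing value becomes the previous value grown by its own rate.
    for i in range(1, len(out)):
        if out[i] is None and out[i - 1] is not None:
            out[i] = out[i - 1] * (1 + rates[i])
    # Backward pass: a still-missing value is the next value discounted by the next row's rate.
    for i in range(len(out) - 2, -1, -1):
        if out[i] is None and out[i + 1] is not None:
            out[i] = out[i + 1] / (1 + rates[i + 1])
    return out


# Groups A and B from the table above
group_a = fill_with_growth([10, 15, 17, None, None], [0.1, 0.05, 0.03, 0.02, 0.05])
group_b = fill_with_growth([None, None, 12, 16, 13], [0.04, 0.02, 0.6, 0.03, 0.4])
print([round(v, 2) for v in group_a])
print([round(v, 2) for v in group_b])
```

Rounded to three significant digits, the filled entries should agree with the v1 column for groups A and B.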

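Forward fill rows in an R data table

Use fill() from tidyr to replace missing values with the previous value. Since fill() works down a column, pivot to long format first so that the fill runs along each row: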

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(3:7) %>%
  group_by(Name) %>%
  fill(value) %>%
  ungroup() %>%
  pivot_wider()

# # A tibble: 5 x 7
# Name Value `1` `2` `3` `4` `5`
# <fct> <int> <int> <int> <int> <int> <int>
# 1 A 58 1 1 1 1 1
# 2 B 47 NA 1 1 1 1
# 3 C 89 NA NA 1 1 1
# 4 D 68 NA NA NA 1 1
# 5 E 75 NA NA NA NA 1

Note: The output above is the same as

df %>% fill(3:7, .direction = "up")

but the logic is different. The former fills each row forward (left to right), while the latter fills each column backward (bottom to top). The two happen to agree on this data and will differ in other cases.
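
To make the distinction concrete, here is a small sketch in plain Python (nested lists stand in for the data frame, None for NA, and all function names are made up for illustration). It checks that the two strategies agree on the diagonal pattern used here and disagree on a made-up 2x2 case.

```python
def ffill(seq):
    """Replace each None with the most recent non-None value (leading Nones stay)."""
    out, last = [], None
    for x in seq:
        if x is not None:
            last = x
        out.append(last)
    return out


def bfill(seq):
    """Backward fill is a forward fill on the reversed sequence."""
    return ffill(seq[::-1])[::-1]


def fill_rows_forward(grid):
    return [ffill(row) for row in grid]


def fill_cols_backward(grid):
    cols = [bfill(list(col)) for col in zip(*grid)]
    return [list(row) for row in zip(*cols)]


# Columns `1`..`5` of df: a single 1 on the diagonal, NA elsewhere
diagonal = [[1 if r == c else None for c in range(5)] for r in range(5)]
same = fill_rows_forward(diagonal) == fill_cols_backward(diagonal)

# A made-up 2x2 case where the two approaches disagree
other = [[1, None], [None, 2]]
rows_result = fill_rows_forward(other)
cols_result = fill_cols_backward(other)
print(same, rows_result, cols_result)
```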


Data

df <- structure(list(Name = structure(1:5, .Label = c("A", "B", "C", 
"D", "E"), class = "factor"), Value = c(58L, 47L, 89L, 68L, 75L
), `1` = c(1L, NA, NA, NA, NA), `2` = c(NA, 1L, NA, NA, NA),
`3` = c(NA, NA, 1L, NA, NA), `4` = c(NA, NA, NA, 1L, NA),
`5` = c(NA, NA, NA, NA, 1L)), class = "data.frame", row.names = c(NA, -5L))

Expand and then fill a dataframe

Check out the fill() function from tidyr (part of the tidyverse).

Using your example, but introducing the NAs you mention, df5 should be what you're looking for here.

library( tidyverse )
year <- c(2014, 2019, 2021)
price <- c(100, 110, 120)
df1 <- data.frame(cbind(id=1, year, price))

year <- c(2016, 2019, 2021)
price <- c(200, 210, 220)
df2 <- data.frame(cbind(id=2, year, price))

year <- c(2014, 2015, 2019, 2020)
price <- c(300, 310, 320, 330)
df3 <- data.frame(cbind(id=3, year, price))

list1 <- list(df1, df2, df3)

id <- c(rep(1,8), rep(2,8), rep(3,8))
year <- c(rep(seq(2014,2021), 3))
price <- c(100, NA, NA, NA, NA, 110, NA, 120,
NA, NA, 200, NA, NA, 210, 210, 220,
300, 310, 310, 310, 310, 320, 330, 330)
df4 <- data.frame(id, year, price)
df5 <- df4 %>% group_by( id ) %>% fill( price, .direction = "downup" )
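
As a rough illustration of what group_by() plus .direction = "downup" does, here is a plain-Python sketch using the id 1 and id 2 rows of df4 (None stands in for NA; the helper names are hypothetical). Each group is filled down first and then up, so a group's leading NAs take that group's first observed price rather than a value from the previous id.

```python
from itertools import groupby


def ffill(seq):
    """Replace each None with the most recent non-None value (leading Nones stay)."""
    out, last = [], None
    for x in seq:
        if x is not None:
            last = x
        out.append(last)
    return out


def fill_downup(seq):
    """Fill down first, then fill up whatever is still missing at the start."""
    down = ffill(seq)
    return ffill(down[::-1])[::-1]


ids = [1] * 8 + [2] * 8
prices = [100, None, None, None, None, 110, None, 120,
          None, None, 200, None, None, 210, 210, 220]

# Grouped fill: rows are already sorted by id, so groupby sees each id once.
filled = []
for _, rows in groupby(zip(ids, prices), key=lambda pair: pair[0]):
    filled.extend(fill_downup([price for _, price in rows]))

# Ungrouped fill for comparison: id 1's last price leaks into id 2's leading gap.
ungrouped = fill_downup(prices)
print(filled)
print(ungrouped[8])
```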

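Forward fill all missing values for all variables

We can convert the column names to symbols with rlang::syms() and splice them into fill() with !!!. In tidyr, fill(everything(), .direction = 'down') is a shorter equivalent. Leading NAs stay NA because there is no earlier value to carry forward.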

d1 %>% 
tidyr::fill(!!! rlang::syms(names(.)), .direction = 'down')
# A tibble: 3 x 2
# var1 var2
# <dbl> <dbl>
#1 NA NA
#2 1 3
#3 1 3

data

d1 <- tibble::tibble(var1 = c(NA, 1, NA), var2 = c(NA, 3, NA))  # data_frame() is deprecated

Filling missing values using forward and backward fill in pandas dataframe (ffill and bfill)

You can chain ffill and bfill to replace NaN values by forward filling first and then backward filling whatever is still missing at the start:

print (df)
A B
DateTime
01-01-2017 03:27 NaN NaN
01-01-2017 03:28 NaN NaN
01-01-2017 03:29 0.181277 -0.178836
01-01-2017 03:30 0.186923 -0.183261
01-01-2017 03:31 NaN NaN
01-01-2017 03:32 NaN NaN
01-01-2017 03:33 0.181277 -0.178836

data = df.ffill().bfill()
print (data)
A B
DateTime
01-01-2017 03:27 0.181277 -0.178836
01-01-2017 03:28 0.181277 -0.178836
01-01-2017 03:29 0.181277 -0.178836
01-01-2017 03:30 0.186923 -0.183261
01-01-2017 03:31 0.186923 -0.183261
01-01-2017 03:32 0.186923 -0.183261
01-01-2017 03:33 0.181277 -0.178836

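This is the same as calling fillna with the method parameter, although that parameter is deprecated in recent pandas versions, so prefer ffill and bfill: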

data = df.fillna(method='ffill').fillna(method='bfill')
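
The order of the two calls matters. The sketch below uses plain Python lists as a stand-in for column A (None plays the role of NaN, and the helpers are illustrative, not pandas functions) to show that forward-then-backward and backward-then-forward fill the interior gap differently.

```python
def ffill(seq):
    """Replace each None with the most recent non-None value (leading Nones stay)."""
    out, last = [], None
    for x in seq:
        if x is not None:
            last = x
        out.append(last)
    return out


def bfill(seq):
    """Backward fill is a forward fill on the reversed sequence."""
    return ffill(seq[::-1])[::-1]


# Column A from the example above
col_a = [None, None, 0.181277, 0.186923, None, None, 0.181277]

ffill_then_bfill = bfill(ffill(col_a))  # interior gap takes the value before it
bfill_then_ffill = ffill(bfill(col_a))  # interior gap takes the value after it
print(ffill_then_bfill)
print(bfill_then_ffill)
```

With ffill first, the 03:31 and 03:32 rows get 0.186923 as in the output above; with bfill first they would get 0.181277 instead.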

Fill missing values by rolling forward in each group using data.table

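You can use the na.locf() function from the zoo package, applied by group. Setting na.rm = FALSE keeps any leading NAs, so each group's result has the same length as its input:

DT[, VAL := zoo::na.locf(VAL, na.rm = FALSE), by = "CLASS"]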

How can I replace the NULL values in dataframe with Average of Forward and backward fill?

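We can use na.approx, which interpolates linearly between the neighbouring non-missing values. For an isolated NA this equals the average of the forward-filled and backward-filled values: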

library(zoo)
df1[-1] <- na.approx(df1[-1])
df1
# A B C
#1 1 1.0 2000
#2 2 2.5 3500
#3 3 4.0 5000
#4 4 5.5 6500
#5 5 7.0 8000

Or with lapply

df1[-1] <- lapply(df1[-1], na.approx)

Or with dplyr

library(dplyr)
df1 %>%
mutate_if(is.numeric, na.approx)

Or with data.table

library(data.table)
setDT(df1)[, (2:3) := lapply(.SD, na.approx), .SDcols = 2:3]

data

df1 <- structure(list(A = 1:5, B = c(1L, NA, 4L, NA, 7L), C = c(2000L, 
NA, 5000L, NA, 8000L)), class = "data.frame", row.names = c(NA,
-5L))
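
The relationship between linear interpolation and "average of forward and backward fill" can be checked with a short plain-Python sketch (None stands in for NA; mean_of_fills and linear_interpolate are hypothetical helpers, not zoo functions). The two agree on column B, where every gap is a single NA, and differ once a gap is longer than one value.

```python
def ffill(seq):
    """Replace each None with the most recent non-None value (leading Nones stay)."""
    out, last = [], None
    for x in seq:
        if x is not None:
            last = x
        out.append(last)
    return out


def bfill(seq):
    return ffill(seq[::-1])[::-1]


def mean_of_fills(seq):
    """Average of the forward-filled and backward-filled series."""
    pairs = zip(ffill(seq), bfill(seq))
    return [None if f is None or b is None else (f + b) / 2 for f, b in pairs]


def linear_interpolate(seq):
    """Fill interior gaps on a straight line between the surrounding known values."""
    out = list(seq)
    known = [i for i, x in enumerate(seq) if x is not None]
    for lo, hi in zip(known, known[1:]):
        step = (seq[hi] - seq[lo]) / (hi - lo)
        for i in range(lo + 1, hi):
            out[i] = seq[lo] + step * (i - lo)
    return out


col_b = [1, None, 4, None, 7]      # column B of df1: single-value gaps
long_gap = [0, None, None, 9]      # made-up series with a two-value gap

print(linear_interpolate(col_b), mean_of_fills(col_b))
print(linear_interpolate(long_gap), mean_of_fills(long_gap))
```

So if the goal really is the average of the two fills rather than a straight line, the approaches are interchangeable only when no two NAs are adjacent.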

