Add Missing Xts/Zoo Data with Linear Interpolation in R

You can merge your data with an empty zoo series indexed by every timestamp in the range. The merge introduces NA at the timestamps that were missing, and na.approx then fills those NAs by linear interpolation.

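library(zoo)

data1 <- read.table(text = "time, value
2012-11-30-10:28:00, 12.9
2012-11-30-10:29:00, 5.5
2012-11-30-10:30:00, 5.5
2012-11-30-10:31:00, 5.5
2012-11-30-10:32:00, 9
2012-11-30-10:35:00, 9
2012-11-30-10:36:00, 14.4
2012-11-30-10:38:00, 12.6", header = TRUE, sep = ",", as.is = TRUE)

# Parse the timestamps and build a zoo series from the observed values
times.init <- as.POSIXct(strptime(data1[, 1], "%Y-%m-%d-%H:%M:%S"))
data2 <- zoo(data1[, 2], times.init)

# Merge with an empty zoo series that has one index entry per minute
data3 <- merge(data2, zoo(, seq(min(times.init), max(times.init), by = "min")))

# Fill the NAs introduced by the merge using linear interpolation
data4 <- na.approx(data3)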

Interpolate zoo object with missing Dates

Merge with an "empty" object that has all the dates you want, then use na.approx (or na.spline, etc.) to fill in the missing values.

x <- merge(serie, zoo(,seq(start(serie),end(serie),by="day")), all=TRUE)
x <- na.approx(x)
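As a small self-contained sketch of the same idea, the following uses a made-up daily series (the dates and values are illustrative assumptions, not data from the question) with two days missing, then fills them with na.approx and, for comparison, na.spline.

```r
library(zoo)

# Hypothetical daily series with 2020-01-03 and 2020-01-04 missing
serie <- zoo(c(1, 2, 5, 6),
             as.Date(c("2020-01-01", "2020-01-02", "2020-01-05", "2020-01-06")))

# Empty zoo object carrying every calendar day in the range
all_days <- zoo(, seq(start(serie), end(serie), by = "day"))

# The merge introduces NA on the missing days
x <- merge(serie, all_days, all = TRUE)

x_linear <- na.approx(x)  # linear interpolation
x_spline <- na.spline(x)  # cubic spline interpolation

print(x_linear)
```

Because the interpolation uses the index as the x-axis, irregular spacing in the original dates is taken into account.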

Delete specific values in R with zoo/xts

It is not clear what you want to do, but I guess you want to remove some outliers from an xts object. If you want a solution like "na.rm", one idea is to replace the unwanted values with NA and then remove them using na.omit.

x <- read.zoo(text='
"2012-04-09 05:03:00",2
"2012-04-09 05:04:00",4
"2012-04-09 05:05:39",-10
"2012-04-09 05:09:00",0
"2012-04-09 05:10:00",1',sep=',',tz='')

x[x == -10] <- NA
x <- na.omit(x)   # assign the result, otherwise x still contains the NA

x
2012-04-09 05:03:00 2
2012-04-09 05:04:00 4
2012-04-09 05:09:00 0
2012-04-09 05:10:00 1

EDIT

To apply a condition to the timestamps themselves, you can look at the index and format it, for example (here dat is the original series, before the -10 was removed):

format(index(dat),'%S')
[1] "00" "00" "39" "00" "00"

But here I use the built-in .indexsec from xts (see also .indexmin, .indexhour, ...), which requires dat to be an xts object:

dat[.indexsec(dat) != 0]
2012-04-09 05:05:39 -10
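The format-based condition also works on a plain zoo object, without xts. The sketch below rebuilds the same five observations directly (the UTC time zone and the variable names are assumptions made for the example) and splits them by whether the timestamp falls on a whole minute.

```r
library(zoo)

# Same timestamps as the example above, built directly (UTC is an assumption)
times <- as.POSIXct(c("2012-04-09 05:03:00", "2012-04-09 05:04:00",
                      "2012-04-09 05:05:39", "2012-04-09 05:09:00",
                      "2012-04-09 05:10:00"), tz = "UTC")
x <- zoo(c(2, 4, -10, 0, 1), times)

# TRUE where the timestamp does not fall on a whole minute
off_minute <- format(index(x), "%S") != "00"

x[off_minute]    # the observation at 05:05:39
x[!off_minute]   # the remaining four observations
```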

Manipulating zoo object column after imputation

It sounds like you haven't converted the zoo object to a more generic R object (but you haven't given an error message or code that produces it, so I can't be 100% sure).

In that case, you can use the as.vector function (see https://www.rdocumentation.org/packages/zoo/versions/1.8-6/topics/as.zoo) to convert a zoo object into a vector, which you can then add to a data.frame.

The example code below drops imputeTS, as G. Grothendieck suggests in his comment, since zoo's na.approx already does linear interpolation.

# install.packages("zoo")
library("zoo")

DateTimes <- as.POSIXct(c(
"2009-01-01 00:00:00", "2009-01-01 01:00:00",
"2009-01-01 02:00:00", "2009-01-01 03:00:00",
"2009-01-01 04:00:00", "2009-01-01 05:00:00", "2009-01-01 06:00:00"))
MeanTemp <- c(0.8, 0.7, 0.7, NA, 0.8, 0.9, 1.1)
HourTemp <- data.frame(DateTimes, MeanTemp)
TempImp <- zoo(HourTemp$MeanTemp, HourTemp$DateTimes)

# use zoo's linear interpolation
HourTemp$airTempImp <- as.vector(na.approx(TempImp))
HourTemp$Imputed <- ifelse(is.na(HourTemp$MeanTemp), "Imputed", "Observed")

# heating degree days per hour: 0 if temp > 15.5 (no heating),
# else (15.5 - temp) / 24
HourTemp$HeatingDegreeDay <- ifelse(
  HourTemp$airTempImp > 15.5,
  0, # no heating
  (15.5 - HourTemp$airTempImp) / 24
)

which will output:

HourTemp
            DateTimes MeanTemp airTempImp  Imputed HeatingDegreeDay
1 2009-01-01 00:00:00      0.8       0.80 Observed        0.6125000
2 2009-01-01 01:00:00      0.7       0.70 Observed        0.6166667
3 2009-01-01 02:00:00      0.7       0.70 Observed        0.6166667
4 2009-01-01 03:00:00       NA       0.75  Imputed        0.6145833
5 2009-01-01 04:00:00      0.8       0.80 Observed        0.6125000
6 2009-01-01 05:00:00      0.9       0.90 Observed        0.6083333
7 2009-01-01 06:00:00      1.1       1.10 Observed        0.6000000

Linear Interpolation using dplyr

The solution I've gone with is based on the first comment from @docendodiscimus.

Rather than attempting to create a new data frame, as I'd been doing, this approach simply adds a column to the existing data frame by taking advantage of dplyr's mutate() function.

My code is now...

library(dplyr)
library(zoo)

df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(ip.value = na.approx(value, maxgap = 4, rule = 2))

maxgap = 4 fills runs of up to four consecutive NAs and leaves longer runs as NA, whilst rule = 2 (passed on to approx()) fills leading and trailing NAs with the nearest observed value instead of leaving them as NA.
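The effect of the two arguments is easier to see on a bare numeric vector. This is a minimal sketch with a made-up vector v (not data from the question). It uses maxgap = 2 rather than 4 to keep the example short.

```r
library(zoo)

v <- c(NA, 1, NA, 3, NA, NA, NA, 7, NA)

# Default rule = 1: interior NAs are interpolated; na.rm = FALSE keeps
# the leading and trailing NAs so the length is unchanged
interior_only <- na.approx(v, na.rm = FALSE)

# rule = 2: leading and trailing NAs take the nearest observed value
with_ends <- na.approx(v, rule = 2)

# maxgap = 2: the run of three NAs is longer than maxgap and stays NA
short_gaps <- na.approx(v, maxgap = 2, rule = 2)

print(rbind(v, interior_only, with_ends, short_gaps))
```

Inside mutate() the result must have the same length as the group, which is why na.rm = FALSE matters whenever rule = 2 is not used.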

Quotation marks when zoo to xts using as.xts() in R

xts is not "adding quotation marks". xts prints character data with quotation marks, but zoo does not. str(returns) and str(as.xts(returns)) should both show that the coredata of the objects is character. This is because "#N/A" cannot be converted to a number.

You don't say how you're reading from Excel (though the tags you added suggest you are), but there are usually ways to specify how NA values are represented. For example, read.csv has a na.strings argument you can set to "#N/A".
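As an illustration of the na.strings approach, here is a minimal sketch that reads a small made-up CSV (the column names date and ret are assumptions) with and without the argument.

```r
library(zoo)

# Hypothetical CSV export containing Excel's "#N/A" marker
csv_text <- "date,ret
2020-01-01,0.01
2020-01-02,#N/A
2020-01-03,-0.02"

# Without na.strings the column is read as text
# (character or factor, depending on the R version)
raw <- read.csv(text = csv_text)

# With na.strings = "#N/A" the marker becomes NA and the column is numeric
clean <- read.csv(text = csv_text, na.strings = "#N/A")

returns <- zoo(clean$ret, as.Date(clean$date))
str(coredata(returns))
```

With numeric coredata, converting with as.xts(returns) should no longer print quoted values.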

How can I alter a time series (XTS or ZOO) in R?

Try

index(master21) <- index(master21) + 60    # adds a minute

which adds 60 seconds (one minute) to a POSIXct time index. You can then use merge(), as the timestamps now align.

More generally, the vignettes of the zoo package will be useful for you too.
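A minimal sketch of the shift-then-merge idea, using two made-up zoo series (the names a, b and t0 are assumptions for the example) whose timestamps are one minute apart:

```r
library(zoo)

# Hypothetical series: 'a' is stamped one minute earlier than 'b'
t0 <- as.POSIXct("2020-01-01 10:00:00", tz = "UTC")
a <- zoo(1:3, t0 + c(0, 60, 120))
b <- zoo(4:6, t0 + c(60, 120, 180))

# POSIXct counts seconds, so adding 60 moves every timestamp by one minute
index(a) <- index(a) + 60

merged <- merge(a, b)
print(merged)
```

Without the shift, merge() would produce four rows with NAs where only one of the two series has a value.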

Addition over zoo/xts objects in R

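Not the prettiest solution, but off the top of my head...

# merge() keeps every timestamp from both objects, with NA where one is missing
test <- merge(ob1, ob2)
# sum across columns, treating the NAs as zero
test <- xts(rowSums(test, na.rm = TRUE), order.by = time(test))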
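The same pattern can be tried without xts. The sketch below swaps in plain zoo objects with made-up values (an assumption for illustration; the answer above uses xts) and dates that only partly overlap.

```r
library(zoo)

# Hypothetical series with partly overlapping dates
d <- as.Date("2020-01-01")
ob1 <- zoo(c(1, 2, 3), d + 0:2)
ob2 <- zoo(c(10, 20, 30), d + 1:3)

# Outer join: dates present in only one series get NA in the other column
both <- merge(ob1, ob2)

# Sum across columns, treating the NAs as zero
total <- zoo(rowSums(coredata(both), na.rm = TRUE), order.by = time(both))
print(total)
```

By contrast, ob1 + ob2 on zoo objects only returns values for the dates present in both series, which is why the merge step is needed here.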

