How to implement linear interpolation?
As I understand your question, you want to write some function y = interpolate(x_values, y_values, x), which will give you the y value at some x? The basic idea then follows these steps:
- Find the indices of the values in x_values which define an interval containing x. For instance, for x=3 with your example lists, the containing interval would be [x1,x2]=[2.5,3.4], and the indices would be i1=1, i2=2.
- Calculate the slope on this interval by (y_values[i2]-y_values[i1])/(x_values[i2]-x_values[i1]) (i.e. dy/dx).
- The value at x is now the value at x1 plus the slope multiplied by the distance from x1.
You will additionally need to decide what happens if x is outside the interval of x_values: either it's an error, or you could interpolate "backwards" (that is, extrapolate), assuming the slope is the same as in the first/last interval.
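The steps above can be sketched in Python roughly as follows. This is only a minimal sketch: it assumes x_values is sorted in ascending order with no duplicates, it uses the standard-library bisect module to find the interval, and it takes the "extrapolate with the first/last slope" option for out-of-range x. The sample data in the usage lines is made up for illustration.

```python
from bisect import bisect_right


def interpolate(x_values, y_values, x):
    """Linearly interpolate y at x; x_values must be sorted ascending."""
    if len(x_values) != len(y_values) or len(x_values) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    # Step 1: find the interval [x_values[i1], x_values[i2]] containing x.
    # Clamping the index means out-of-range x reuses the first/last interval.
    i2 = min(max(bisect_right(x_values, x), 1), len(x_values) - 1)
    i1 = i2 - 1
    # Step 2: slope (dy/dx) on that interval.
    slope = (y_values[i2] - y_values[i1]) / (x_values[i2] - x_values[i1])
    # Step 3: value at x1 plus slope times the distance from x1.
    return y_values[i1] + slope * (x - x_values[i1])


# Hypothetical sample data, not the lists from the question.
xs = [0.0, 1.0, 2.0]
ys = [0.0, 10.0, 30.0]
print(interpolate(xs, ys, 1.5))   # 20.0
print(interpolate(xs, ys, 3.0))   # 50.0 (extrapolated with the last slope)
```

If out-of-range input should be an error instead, replace the clamping with a range check that raises ValueError.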
Did this help, or did you need more specific advice?
How to implement linear interpolation method in java array?
If you want to interpolate an interval into a different number of values, you can just add the count of output numbers as a function parameter.
Example:
/**
 * Interpolating method
 * @param start start of the interval
 * @param end end of the interval
 * @param count number of equal steps between start and end (at least 2)
 * @return array of count + 1 evenly spaced numbers, from start to end inclusive
 */
public static double[] interpolate(double start, double end, int count) {
    if (count < 2) {
        throw new IllegalArgumentException("interpolate: illegal count!");
    }
    // count steps produce count + 1 points, because both endpoints are included
    double[] array = new double[count + 1];
    for (int i = 0; i <= count; ++i) {
        array[i] = start + i * (end - start) / count;
    }
    return array;
}
Then you can just call interpolate(0, 6, 6); or interpolate(6, 12, 6); or interpolate(6, 12, 12); or whatever you want.
Linear Interpolation. How to implement this algorithm in C ? (Python version is given)
Interpolation in the sense of "signal sample rate increase"
... or, as I call it, "upsampling" (probably the wrong term; disclaimer: I have not read Lyons'). I just had to understand what the code does and then rewrite it for readability. As given, it has a couple of problems:
a) It is inefficient: two loops are OK, but it does a multiplication for every single output item; it also uses intermediary lists (hold) and generates the result with append (small beer).
b) It interpolates the first interval incorrectly: it generates fake data in front of the first element. Say we have multiplier=5 and seq=[20,30]: it will generate [0,4,8,12,16,20,22,24,26,28,30] instead of [20,22,24,26,28,30].
So here is the algorithm in form of a generator:
def upsampler(seq, multiplier):
    if seq:
        step = 1.0 / multiplier
        y0 = seq[0]
        yield y0
        for y in seq[1:]:
            dY = (y - y0) * step
            for i in range(multiplier - 1):
                y0 += dY
                yield y0
            y0 = y
            yield y0
Ok and now for some tests:
>>> list(upsampler([], 3)) # this is just the same as [Y for Y in upsampler([], 3)]
[]
>>> list(upsampler([1], 3))
[1]
>>> list(upsampler([1,2], 3))
[1, 1.3333333333333333, 1.6666666666666665, 2]
>>> from math import sin, pi
>>> seq = [sin(2.0*pi * i/10) for i in range(20)]
>>> seq
[0.0, 0.58778525229247314, 0.95105651629515353, 0.95105651629515364, 0.58778525229247325, 1.2246063538223773e-016, -0.58778525229247303, -0.95105651629515353, -0.95105651629515364, -0.58778525229247336, -2.4492127076447545e-016, 0.58778525229247214, 0.95105651629515353, 0.95105651629515364, 0.58778525229247336, 3.6738190614671318e-016, -0.5877852522924728, -0.95105651629515342, -0.95105651629515375, -0.58778525229247347]
>>> list(upsampler(seq, 2))
[0.0, 0.29389262614623657, 0.58778525229247314, 0.76942088429381328, 0.95105651629515353, 0.95105651629515364, 0.95105651629515364, 0.7694208842938135, 0.58778525229247325, 0.29389262614623668, 1.2246063538223773e-016, -0.29389262614623646, -0.58778525229247303, -0.76942088429381328, -0.95105651629515353, -0.95105651629515364, -0.95105651629515364, -0.7694208842938135, -0.58778525229247336, -0.29389262614623679, -2.4492127076447545e-016, 0.29389262614623596, 0.58778525229247214, 0.76942088429381283, 0.95105651629515353, 0.95105651629515364, 0.95105651629515364, 0.7694208842938135, 0.58778525229247336, 0.29389262614623685, 3.6738190614671318e-016, -0.29389262614623618, -0.5877852522924728, -0.76942088429381306, -0.95105651629515342, -0.95105651629515364, -0.95105651629515375, -0.76942088429381361, -0.58778525229247347]
And here is my translation to C, fit into Kratz's fn template:
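/**
 *
 * @param src caller supplied array with data
 * @param src_len len of src
 * @param steps to interpolate
 * @param dst output param will be filled with (src_len - 1) * steps + 1 samples
 * @return dst
 */
float* linearInterpolation(float* src, int src_len, int steps, float* dst)
{
    float step, y0, dy;
    float *src_end;
    float *out = dst;   /* write cursor; dst itself is returned unchanged */
    if (src_len > 0) {
        step = 1.0f / steps;
        /* copy each source sample, then stop once the last one is written */
        for (src_end = src + src_len; *out++ = y0 = *src++, src < src_end; ) {
            dy = (*src - y0) * step;
            /* steps - 1 intermediate samples; the next source sample closes the interval */
            for (int i = steps - 1; i > 0; i--) {
                *out++ = y0 += dy;
            }
        }
    }
    return dst;
}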
Please note the C snippet is "typed but never compiled or run", so there might be syntax errors, off-by-1 errors etc. But overall the idea is there.
Floating point linear interpolation
As Jason C points out in the comments, the version you posted is most likely the best choice, due to its superior precision near the edge cases:
float lerp(float a, float b, float f)
{
return a * (1.0 - f) + (b * f);
}
If we disregard precision for a moment, we can simplify the expression as follows:
a(1 − f) + bf
= a − af + bf
= a + f(b − a)
Which means we could write it like this:
float lerp(float a, float b, float f)
{
return a + f * (b - a);
}
In this version we've gotten rid of one multiplication, but lost some precision.
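To make the precision remark concrete, here is a small Python sketch of both forms. Note the assumptions: Python floats are double precision rather than C float, the names lerp_precise and lerp_fast are made up for this illustration, and the inputs are deliberately extreme so that the rounding of (b - a) becomes visible.

```python
def lerp_precise(a, b, f):
    # Weighted-sum form: yields exactly a at f == 0 and exactly b at f == 1
    # (for finite a and b).
    return a * (1.0 - f) + b * f


def lerp_fast(a, b, f):
    # One multiplication fewer, but (b - a) is rounded before it is scaled,
    # so the result at f == 1 is not guaranteed to equal b.
    return a + f * (b - a)


a, b = 1e20, 1.0
print(lerp_precise(a, b, 1.0))  # 1.0
print(lerp_fast(a, b, 1.0))     # 0.0, because 1.0 - 1e20 rounds to -1e20
```

For inputs of similar magnitude the two forms usually agree to within a few units in the last place, so which one to pick depends on whether exact endpoints matter.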
Having trouble conceptualising how to implement a linear interpolation in R
Not the most elegant solution, but this uses dplyr and magrittr. First, I define your data frame.
# Data frame called df
#                  Date Irradiance    AirMass0
# 1 2013-01-04 10:43:00    1055.64  0.02798423
# 2 2013-01-12 20:16:00       0.00  0.01952277
# 3 2013-01-12 11:48:00     975.22  0.01946854
# 4 2013-01-08 07:19:00       0.25  0.03845099
# 5 2013-01-04 08:19:00     953.33 -0.14285513
# 6 2013-01-14 10:11:00    1017.62  0.03227589
Next, I load the relevant libraries.
# Load libraries
library(dplyr)
library(magrittr)
Here, I create a function that takes an air mass and a given x value (i.e., 0.9804), creates a reference data frame (i.e., x equal to 0.5, 0.75, and 1 and the corresponding y values), then creates a function that will estimate y based on x through linear interpolation.
# Calculate pressure correction based on air mass and x
pres_cor <- function(m, x){
  # Create reference data frame
  ref_df <- data.frame(x_ref = c(0.5, 0.75, 1),
                       y_ref = c(1.68219 - 0.03059 * m + 0.000890 * m^2,
                                 1.248274 - 0.011997 * m + 0.000370 * m^2,
                                 1))
  # Create function for interpolation
  int_fun <- with(ref_df, approxfun(x_ref, y_ref))
  # Return value at given x value
  int_fun(x)
}
Finally, I apply this to each row of your data frame using the pipe operator (%>%), rowwise, and mutate from dplyr, and the compound assignment pipe (%<>%) from magrittr.
# Use function for each row
df %<>%
  rowwise %>%
  mutate(y = pres_cor(AirMass0, 0.9804))
This gives the following:
# # A tibble: 6 x 4
#   Date                Irradiance AirMass0     y
#   <dttm>                   <dbl>    <dbl> <dbl>
# 1 2013-01-04 10:43:00    1056.     0.0280  1.02
# 2 2013-01-12 20:16:00       0      0.0195  1.02
# 3 2013-01-12 11:48:00     975.     0.0195  1.02
# 4 2013-01-08 07:19:00       0.25   0.0385  1.02
# 5 2013-01-04 08:19:00     953.    -0.143   1.02
# 6 2013-01-14 10:11:00    1018.     0.0323  1.02
Note that the y values look the same due to rounding, but upon closer inspection they are not.
# df$y
# [1] 1.019438 1.019446 1.019446 1.019429 1.019600 1.019434
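As a cross-check of the arithmetic, the same calculation can be sketched in plain Python. This is not a translation of the dplyr pipeline, only the per-row interpolation with the same reference points; rejecting x outside 0.5 to 1 is a choice made for this sketch.

```python
def pres_cor(m, x):
    """Interpolate the pressure correction at x for air mass m."""
    # Reference points: x = 0.5, 0.75, 1 and the y values computed from m.
    x_ref = [0.5, 0.75, 1.0]
    y_ref = [
        1.68219 - 0.03059 * m + 0.000890 * m ** 2,
        1.248274 - 0.011997 * m + 0.000370 * m ** 2,
        1.0,
    ]
    if not x_ref[0] <= x <= x_ref[-1]:
        raise ValueError("x must lie between 0.5 and 1")
    # Pick the reference interval containing x, then interpolate linearly.
    i = 0 if x <= x_ref[1] else 1
    t = (x - x_ref[i]) / (x_ref[i + 1] - x_ref[i])
    return y_ref[i] + t * (y_ref[i + 1] - y_ref[i])


# First row of the data frame above (AirMass0 = 0.02798423).
print(round(pres_cor(0.02798423, 0.9804), 6))  # 1.019438
```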
How is linear interpolation of data sets usually implemented?
The way I usually implement O(1) interpolation is by means of an additional data structure, which I call IntervalSelector, that in O(1) time will give the two surrounding values of the sequence that have to be interpolated.
An IntervalSelector is a class that, when given a sequence of n abscissas, builds and remembers a table that will map any given value of x to the index i such that sequence[i] <= x < sequence[i+1] in O(1) time.
Note: In what follows arrays are 1 based.
The algorithm that builds the table proceeds as follows:
1. Find delta to be the minimum distance between two consecutive elements in the input sequence of abscissas.
2. Set count := (b-a)/delta + 1, where a and b are respectively the first and last elements of the (ascending) sequence and / stands for the integer quotient of the division.
3. Define table to be an Array of count elements.
4. For j between 1 and n, set table[(sequence[j]-a)/delta + 1] := j.
5. Copy every entry of table visited in step 4 into the unvisited positions that come right after it.
On output, table maps j to i if (j-1)*delta <= sequence[i] - a < j*delta.
Here is an example:
Since the 3rd and 4th elements are the closest ones, we divide the interval into subintervals of this smallest length. Now, we remember in the table the positions of the left end of each of these delta-intervals. Later on, when an input x is given, we compute the delta-interval of such x as (x-a)/delta + 1 and use the table to deduce the corresponding interval in the sequence. If x falls to the left of the i-th sequence element, we choose the (i-1)-th.
More precisely:
Given any input x between a and b, calculate j := (x-a)/delta + 1 and i := table[j]. If x < sequence[i], put i := i - 1. Then the index i satisfies sequence[i] <= x < sequence[i+1]; otherwise the distance between these two consecutive elements would be smaller than delta, which it is not.
Remark: Be aware that if the minimum distance delta between consecutive elements in sequence is too small, the table will have too many entries. The simple description I've presented here ignores these pathological cases, which require additional work.
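The description above could be sketched in Python along the following lines. Treat it as an illustration under stated assumptions, not a finished implementation: indices are 0-based here (the text uses 1-based arrays), the method name find and the helper interpolate_with_selector are made up for this sketch, the returned index is clamped so that x == b still has a right neighbour, and floating-point rounding exactly at bucket boundaries is ignored, just like the pathological cases mentioned in the remark.

```python
class IntervalSelector:
    """Maps x to the index i with sequence[i] <= x < sequence[i+1]."""

    def __init__(self, sequence):
        seq = list(sequence)
        if len(seq) < 2:
            raise ValueError("need at least two abscissas")
        # Step 1: delta is the smallest gap between consecutive abscissas.
        delta = min(q - p for p, q in zip(seq, seq[1:]))
        if delta <= 0:
            raise ValueError("abscissas must be strictly ascending")
        self.seq, self.a, self.delta = seq, seq[0], delta
        # Steps 2 and 3: one table slot per delta-interval.
        count = int((seq[-1] - seq[0]) / delta) + 1
        table = [None] * count
        # Step 4: record each element's index in the slot it falls into.
        for j, s in enumerate(seq):
            table[self._bucket(s)] = j
        # Step 5: copy each visited entry into the unvisited slots after it.
        for k in range(1, count):
            if table[k] is None:
                table[k] = table[k - 1]
        self.table = table

    def _bucket(self, x):
        # Integer quotient of (x - a) / delta, 0-based.
        return int((x - self.a) / self.delta)

    def find(self, x):
        if not self.seq[0] <= x <= self.seq[-1]:
            raise ValueError("x outside the range of the sequence")
        i = self.table[self._bucket(x)]
        if x < self.seq[i]:
            i -= 1  # x lies to the left of the element stored in this slot
        return min(i, len(self.seq) - 2)  # keep i + 1 valid when x == b


def interpolate_with_selector(selector, ys, x):
    # Linear interpolation between the two neighbours chosen by the selector.
    i = selector.find(x)
    x0, x1 = selector.seq[i], selector.seq[i + 1]
    return ys[i] + (x - x0) * (ys[i + 1] - ys[i]) / (x1 - x0)


sel = IntervalSelector([0.0, 1.0, 1.75, 4.0])  # hypothetical abscissas
print(sel.table)       # [0, 1, 2, 2, 2, 3]
print(sel.find(1.6))   # 1
```

Building the table costs time and memory proportional to (b-a)/delta, which is the trade-off the remark warns about.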
Pandas linear interpolation for geometrical X-Y data seems to ignore points
Most likely the problem is that the timestamps in the original and resampled DataFrames are not aligned, so when resampling we need to specify how to deal with that.
Since the original is at 50 Hz and the resampled is at 2500 Hz, simply taking mean should fix it:
upsampled = new_df.resample('0.4ms').mean().interpolate(method='linear')
Unfortunately, without having any sample data, I cannot verify that it works. Please let me know if it helps.