interpolate in a 3 dimensional spline in R
One good option is the mgcv package which comes with all versions of R. It has isotropic penalised regression splines of two or more variables via s()
and anisotropic penalised regression splines of two or more variables via tensor products and te()
.
If you don't want penalised regression splines, you can use argument fx = TRUE
to fix known degree of freedom splines.
Here is an example from ?te
# following shows how tensor pruduct deals nicely with
# badly scaled covariates (range of x 5% of range of z )
require(mgcv)
test1 <- function(x, z ,sx=0.3, sz=0.4) {
x <- x*20
(pi ** sx * sz) * (1.2 * exp(-(x - 0.2)^2 / sx^2 - ( z - 0.3)^2 / sz^2) +
0.8 * exp(-(x - 0.7)^2 / sx^2 -(z - 0.8)^2 / sz^2))
}
n <- 500
old.par<-par(mfrow=c(2,2))
x <- runif(n) / 20
z<-runif(n)
xs <- seq(0, 1, length=30) / 20
zs <- seq(0, 1, length=30)
pr <- data.frame(x=rep(xs, 30), z=rep(zs, rep(30, 30)))
truth <- matrix(test1(pr$x, pr$z), 30, 30)
f <- test1(x, z)
y <- f + rnorm(n) * 0.2
## model 1 with s() smooths
b1 <- gam(y ~ s(x,z))
persp(xs, zs, truth)
title("truth")
vis.gam(b1)
title("t.p.r.s")
## model 2 with te() smooths
b2 <- gam(y ~ te(x, z))
vis.gam(b2)
title("tensor product")
## model 3 te() smooths specifying margin bases
b3 <- gam(y ~ te(x, z, bs=c("tp", "tp")))
vis.gam(b3)
title("tensor product")
par(old.par)
Multivariate regression splines in R
You can't use splines::bs
in this case, as it is strictly for construction of a univariate spline. If you do bs(mat)
where mat
is a matrix, it is just doing bs(c(mat))
. For example,
mat <- matrix(runif(8), 4, 2)
identical(bs(mat), bs(c(mat)))
# [1] TRUE
This explains why you get double number of rows, when doing bs(cbind(mtcars$wt,mtcars$hp)
.
To create a 2D spline, the simplest way is to create additive spline:
lm(mpg ~ factor(gear) + bs(wt, knots = 5) + bs(hp, knots = 4), mtcars)
but this may not be what you want. Then consider interaction:
model <- lm(mpg ~ factor(gear) + bs(wt, knots = 5):bs(hp, knots = 4), mtcars)
The bs(wt, knots = 5):bs(hp, knots = 4)
is forming row-wise Kronecker product between the two design matrices. Since bs(wt, knots = 5)
is a matrix of 4 columns, and bs(hp, knots = 4)
is a matrix of 3 columns, the interaction has 4 * 3 = 12 columns.
Alternatively, consider using mgcv
package. In mgcv
, multivariate splines can be constructed in two ways:
- isotropic thin-plate splines;
- scale invariant tensor product splines.
Clearly you want the second here, as wt
and hp
have different units. To construct tensor product splines, we can use:
library(mgcv)
fit <- gam(mpg ~ factor(gear)
+ s(wt, bs = 'cr', k = 4, fx = TRUE)
+ s(hp, bs = 'cr', k = 4, fx = TRUE)
+ ti(wt, hp, bs = 'cr', k = c(4, 4), d = c(1, 1), fx = TRUE),
data = mtcars)
Here, I specially set fx = TRUE
to disable penalized regression.
I don't want to write an extensive answer to introduce mgcv
. For how s
, ti
and gam
work, just read documentation. If you need to bridge the gap in theory, grab Simon Wood's book published in 2006: Generalized Additive Models: an introduction with R.
A practical example of
mgcv
usage?
I had an answer Cubic spline method for longitudinal series data which might help you get familiar with mgcv
. But as an introductory example, it only shows how to work with univariate spline. Fortunately, that is also the key. Tensor product spline is constructed from univariate spline.
My other answers related to mgcv
is more of theoretical aspect; while not all my answers related to spline
is making reference to mgcv
. So that question & answer is the best I could give you at this stage.
Would the scale invariant tensor product splines be equivalent to radial smoothing or would that be the isotropic thin-place splines?
Radial smoothing is equivalent to thin-plate spline, as the basis function for a thin-plate spline is radial. That is why it is isotropic and can be used in spatial regression.
Tensor product spline is scale invariant, as it is constructed as (pairwise) multiplication of univariate spline basis.
10-fold CV on cubic spline regression without using cv.glm()
The part where you set up folds gives you folds with roughly equal sizes but i guess it's ok:
folds = sample( 1:k, nrow(Wagetr), replace=TRUE )
You can try:
folds = sample((1:nrow(Wagetr) %% k))+1
We can look at the fit and data:
COLS = c("#FF6F00FF","#C71000FF","#008EA0FF","#8A4198FF","#5A9599FF","#FF6348FF")
with(Wagetr,plot(age,wage))
for(i in 3:8){
cubicfit = glm( wage ~ bs( age, df=i), data=Wagetr)
lines(seq(20,80,by=2),predict(cubicfit,data.frame(age=seq(20,80,by=2))),col = COLS[i-2])
}
legend("topright",fill=COLS,paste0("df=",3:8))
The degree of freedom for splines decides how to bin your independent variable to construct the respective polynomials in each interval. If you agree with me, you can just fit a cubic fit over the whole data? Basically there is no where in the data where you expect a "twist" that requires a knot.
Related Topics
Retain Attributes When Using Gather from Tidyr (Attributes Are Not Identical)
Getting File Path from Shiny UI (Not Just Directory) Using Browse Button Without Uploading the File
Package Domc Not Available for R Version 3.0.0 Warning in Install.Packages
Find Elements Not in Smaller Character Vector List But in Big List
Converting a Data.Frame to a List of Lists
Downloading Files from Ftp with R
List and Description of All Packages in Cran from Within R
Generate Random Integers Between Two Values with a Given Probability Using R
Shiny Dashboard Mainpanel Height Issue
Plot Multiple Datasets with Ggplot
Alignment of Numbers on the Individual Bars with Ggplot2
Generate All Combinations, of All Lengths, in R, from a Vector
Convert Time Object to Categorical (Morning, Afternoon, Evening, Night) Variable in R
R: Scatter Plot Matrix Using Ggplot2 with Themes That Vary by Facet Panel
Unique.Data.Table Select Last Row in Place of the First