R Predict Function Returning Too Many Values

R predict() function returning wrong/too many values

newdata must contain a variable with the same name as the one used in the formula that originally fitted the model.

You have two errors:

  1. You don't use a variable in newdata with the same name as the covariate used to fit the model, and
  2. You make the problem much more difficult to resolve because you abuse the formula interface.

Don't fit your model like this:

mod <- lm(log(Standards[['Abs550nm']])~Standards[['ng_mL']])

Fit your model like this:

mod <- lm(log(Abs550nm) ~ ng_mL, data = Standards)

Isn't that so much more readable?

To predict you would need a data frame with a variable ng_mL:

predict(mod, newdata = data.frame(ng_mL = c(0.5, 1.2)))
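To see why the name matters, here is a minimal sketch. It uses the built-in mtcars data as a stand-in for your Standards data (that substitution is my assumption, as are the object names bad, good and new), and compares the two ways of writing the formula:

```r
# mtcars ships with base R; it stands in for the Standards data here.
# Formula built from mtcars[["..."]]: the model only knows the name "mtcars".
bad <- lm(mtcars[["mpg"]] ~ mtcars[["wt"]])

# Formula built from column names, with the data frame passed separately.
good <- lm(mpg ~ wt, data = mtcars)

new <- data.frame(wt = c(2.5, 3.0))

# newdata has no variable called "mtcars", so R falls back to the original
# data (with a warning, suppressed here) and returns one value per original row.
p_bad <- suppressWarnings(predict(bad, newdata = new))

# newdata has a column named wt, so only the two new rows are predicted.
p_good <- predict(good, newdata = new)

length(p_bad)   # 32, the number of rows in mtcars
length(p_good)  # 2
```

The first call is the "too many values" symptom; the second is what you get once the names line up.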

Now you may have a third error. You appear to be trying to predict with new values of Absorbance, but the way you fitted the model, Absorbance is the response variable. You would need to supply new values for ng_mL.

The behaviour you are seeing is what happens when R can't find a correctly-named variable in newdata; it returns the fitted values from the model or the predictions at the observed data.

This makes me think you have the formula back to front. Did you mean:

mod2 <- lm(ng_mL ~ log(Abs550nm), data = Standards)

If so, you'd need something like

predict(mod2, newdata = data.frame(Abs550nm = c(1.7812, 1.7309)))

Note that you don't need to include the log() bit in the variable name. R recognises log() as a function and applies it to the variable Abs550nm for you.

If the model really is log(Abs550nm) ~ ng_mL and you want to find values of ng_mL for new values of Abs550nm you'll need to invert the fitted model in some way.
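One simple way to invert a straight-line fit is to solve the fitted equation for ng_mL by hand. The sketch below assumes exactly that; the calibration data are made up for illustration, invert_fit() is a hypothetical helper rather than anything from a package, and the result is a point estimate only, with no uncertainty attached.

```r
# Hypothetical calibration data, made up for illustration only.
set.seed(1)
Standards <- data.frame(ng_mL = c(1, 2, 4, 8, 16, 32))
Standards$Abs550nm <- exp(0.1 + 0.05 * Standards$ng_mL + rnorm(6, sd = 0.01))

mod <- lm(log(Abs550nm) ~ ng_mL, data = Standards)

# Algebraic inversion of the fitted line:
#   log(A) = b0 + b1 * ng   =>   ng = (log(A) - b0) / b1
invert_fit <- function(model, abs_new) {
  b <- coef(model)
  (log(abs_new) - b[["(Intercept)"]]) / b[["ng_mL"]]
}

new_abs <- c(1.7812, 1.7309)
est_ng <- invert_fit(mod, new_abs)
est_ng
```

Feeding est_ng back through predict(mod, newdata = data.frame(ng_mL = est_ng)) and exponentiating should return the absorbances you started with, which is a handy sanity check.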

R predict() function returning too many values

First, I wouldn't name your result predict; save that name for the function. You need

predicted_data <- predict(fit2, newdata = data.frame(PKWH = 9, MDT = 75, MDT2 = 5625))

It's not throwing an error because predict has a catch-all (...) at the end of its argument list, which silently absorbs whatever you pass as data. With newdata missing, it gives you the predictions for the data you fit the model with.

Predict function returns more values than those required

The column names of the data frame sent to predict must match the column names of the data frame used to create the model. If you create x as you show above the names will not be the same and predict will instead use the original data (the frame you call data).

Try this instead

fit <- lm(y ~ ., data[1:40,])
predict(fit, data[41:60,])
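If you want to catch a name mismatch before calling predict, you can compare the predictor names stored in the model with the columns of the new data frame. This is only a rough sketch: missing_predictors() is a hypothetical helper name, not a base R function, mtcars stands in for your data, and the check will also report any non-column symbols that happen to appear in the formula.

```r
# Fit on the first 20 rows of the built-in mtcars data (a stand-in dataset).
fit <- lm(mpg ~ wt + hp, data = mtcars[1:20, ])

# Hypothetical helper: names the model needs that newdata does not provide.
missing_predictors <- function(model, newdata) {
  needed <- all.vars(delete.response(terms(model)))
  setdiff(needed, names(newdata))
}

# Wrong column name ("weight" instead of "wt"): the mismatch is reported.
missing_predictors(fit, data.frame(weight = 3, hp = 110))

# Same column names as the training data: nothing is missing.
missing_predictors(fit, mtcars[21:32, ])
```

An empty result means every predictor was found by name, so predict(fit, newdata) should return one value per row of the new data.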

The predict()-function is returning unexpected output

There was just a typo: you wrote newData = predictionData instead of newdata = predictionData. Argument names in R are case-sensitive.

Different number of predictions than expecting in linear regression

The second argument to predict is newdata, not data.

Also, you don't need multiple calls to poly in your model formula; a polynomial of degree N already contains all the lower-degree terms, so poly(N) will be collinear with poly(N-1) and all the others.

Also^2, to generate a sequence of predictions using xv, you have to put it in a data frame with the appropriate name: data.frame(x=xv).
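Putting those three points together, a minimal sketch could look like the following. The data are simulated purely for illustration, and the degree of 2 and the names d and fit are my assumptions rather than anything taken from your code.

```r
# Simulated data, for illustration only.
set.seed(42)
d <- data.frame(x = seq(0, 10, length.out = 30))
d$y <- 1 + 0.5 * d$x - 0.2 * d$x^2 + rnorm(30, sd = 0.5)

# A single poly() call of the highest degree covers the lower-degree terms.
fit <- lm(y ~ poly(x, 2), data = d)

# The grid of new values goes into a data frame whose column is named x,
# and that data frame is passed as newdata (not data).
xv <- seq(0, 10, by = 0.5)
pred <- predict(fit, newdata = data.frame(x = xv))

length(pred) == length(xv)
```

With the column named x and the argument named newdata, you get one prediction per element of xv instead of one per row of the original data.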

Why does R segmented package's predict function return an error?

You have to update one line of code to make it work:

new <- data.frame(Year = c(min(spring.plot$Year), seg.n.spring$psi[, "Est."], max(spring.plot$Year)))

The data frame needs a proper column name, one that matches the variable used in the model.

R multiple regression predict output has more values than contained in the test set

Try lm2 <- predict(lm1, newdata = Taxitest) instead.

Check how this command works using ?predict.lm. If you don't use newdata= it will predict on the dataset you used to train your model.

As an example see below:

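# train and test sets
dt1 <- mtcars[1:15, ]
dt2 <- mtcars[20:23, ]

# build the model (named fit so it doesn't mask the lm() function)
fit <- lm(disp ~ drat, data = dt1)

# check the differences / similarities
predict(fit, data = dt2)     # data is not an argument of predict: 15 fitted values for dt1
predict(fit, newdata = dt2)  # 4 predictions, one per row of dt2
predict(fit, dt2)            # same as above: newdata is the second positional argument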

