Getting Warning: 'newdata' had 1 row but variables found have 32 rows on predict.lm

The problem is a mismatch between the variable names in your model and the column names in your newdata. It has nothing to do with using vectors versus data frames.

When you fit a model with lm and then call predict, predict looks in newdata for variables with the same names as the terms in the model formula. In your first case the column name x does not match the model term mtcars$wt, and hence you get the warning.

Here is an illustration.

This is what you did when you did not get a warning:

a <- mtcars$mpg
x <- mtcars$wt

#here you use x as a name
fitCar <- lm(a ~ x)
#here you use x again as a name in newdata.
predict(fitCar, data.frame(x = mean(x)), interval = "confidence")

       fit      lwr      upr
1 20.09062 18.99098 21.19027

In this case you fit the model using the name x and also predict using the name x in your newdata. The names match, so you get no warning and a single prediction, which is what you expect.

Let's see what happens when I change the name to something else when I fit the model:

a <- mtcars$mpg
#name it b this time
b <- mtcars$wt

fitCar <- lm(a ~ b)
#here I am using name x as previously
predict(fitCar, data.frame(x = mean(x)), interval = "confidence")

        fit       lwr      upr
1 23.282611 21.988668 24.57655
2 21.919770 20.752751 23.08679
3 24.885952 23.383008 26.38890
4 20.102650 19.003004 21.20230
5 18.900144 17.771469 20.02882
Warning message:
'newdata' had 1 row but variables found have 32 rows

The only thing I changed was the name used when fitting the model (b instead of x), while still predicting with the name x in newdata. As you can see, I got the same warning as in your question, and predictions for all the training rows instead of one.
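One way to avoid the mismatch altogether, sketched here on the same mtcars data, is to pass the data frame through the data argument so that the model terms are plain column names. The object names fit_wt, new_wt and pred are only illustrative.

```r
# Fit with bare column names and a data argument, so the predictor is called "wt".
fit_wt <- lm(mpg ~ wt, data = mtcars)

# newdata needs a column with that same name: "wt".
new_wt <- data.frame(wt = mean(mtcars$wt))
pred <- predict(fit_wt, newdata = new_wt, interval = "confidence")

# One row of predictions, matching nrow(new_wt).
nrow(pred)
```

With this form there is no need to keep separate vectors such as a and x in the workspace.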

Hope this is clear now!

Warning message 'newdata' had 1 row but variables found have 16 rows in R

Your model is fjbjor ~ amagn, where fjbjor is response and amagn is covariate. Then your newdata is data.frame(fjbjor=5.5).

newdata should supply covariates, not the response. predict drops the response and looks for the covariate amagn in newdata. Your newdata has no such column, so amagn is picked up from the original data instead, and what you get back are the fitted values.

The warning message describes exactly this. predict takes the expected number of predictions from nrow(newdata), which is 1, but 16 fitted values are returned. That mismatch produces the warning.


Looks like the model you really want is: amagn ~ fjbjor.
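A minimal sketch of that corrected setup follows. The data frame d and its values are made up for illustration (the original data are not shown here); only the variable names fjbjor and amagn come from the question.

```r
# Hypothetical stand-in for the original data: 16 rows, two numeric columns.
set.seed(1)
d <- data.frame(fjbjor = runif(16, 3, 8))
d$amagn <- 2 + 0.5 * d$fjbjor + rnorm(16, sd = 0.3)

# Response on the left, covariate on the right.
fit <- lm(amagn ~ fjbjor, data = d)

# newdata now supplies the covariate, so exactly one prediction comes back.
p <- predict(fit, newdata = data.frame(fjbjor = 5.5))
length(p)
```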

'newdata' had 1 row but variables found have 10 rows

When you fit a model with the betareg function and then use predict to make predictions, predict tries to find the same names on newdata (not new_data your variable but newdata the parameter of the predict function). In your first case name new_data conflicts with X and hence you get the warning.

To solve your problems you should run this instead:

library(betareg)
library(openxlsx)

input_data <- read.xlsx("dati.xlsx")
Y <- input_data[,1]
X <- input_data[,2:ncol(input_data)]

beta_reg_fit <- betareg(formula = Y ~ data.matrix(X), link = "logit", link.phi = NULL, model = TRUE, y = TRUE, x = FALSE)

X <- data.frame(cbind(1.1, 1.2, 1.4, 1.3))
predictions <- predict(beta_reg_fit, X)

Getting Warning: «'newdata' had 150 rows but variables found have 350 rows» on LDA Predict in R

As a suggestion: if you have separate train and test sets, keep each in its own data frame and pass them through the data and newdata arguments. That avoids this kind of pitfall. Try this:

library(MASS)
#Data
N <- 500
x1 <- rnorm(N,0,1)
x2 <- rnorm(N,1,5)
y <- round(runif(N,0,1),0)
xx = data.frame(x1,x2,y)
x_train = xx[1:350,]
x_test = xx[351:N,]
#Models
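# y is a column of x_train, so the formula refers to it by name
modelo.lda = lda(y~x1+x2,data = x_train)
# named pred.lda to avoid confusion with the predict.lda method
pred.lda = predict(modelo.lda, newdata=x_test)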

No warnings will be produced.
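To see the difference side by side, here is a small self-contained sketch on simulated data. It assumes the warning in the question came from a model whose formula referred to the training data frame directly; the names bad_fit, good_fit, bad_pred and good_pred are illustrative.

```r
library(MASS)

set.seed(42)
N <- 500
xx <- data.frame(x1 = rnorm(N, 0, 1),
                 x2 = rnorm(N, 1, 5),
                 y  = round(runif(N, 0, 1), 0))
x_train <- xx[1:350, ]
x_test  <- xx[351:N, ]

# Terms named "x_train$x1" and "x_train$x2" are not columns of x_test,
# so predict evaluates them on the 350 training rows and warns
# (the warning is suppressed here only to keep the output quiet).
bad_fit  <- lda(x_train$y ~ x_train$x1 + x_train$x2)
bad_pred <- suppressWarnings(predict(bad_fit, newdata = x_test))
length(bad_pred$class)   # 350, not 150

# Terms named "x1" and "x2" are columns of x_test, so 150 predictions come back.
good_fit  <- lda(y ~ x1 + x2, data = x_train)
good_pred <- predict(good_fit, newdata = x_test)
length(good_pred$class)  # 150
```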

R Warning: 'newdata' had 15 rows but variables found have 22 rows

Solution:

lm_colors <- lm(avg_impressions ~ poly(ad_position, 13), data=colors_train_agg)

Reason:
predict() builds the design matrix for newdata from the names of the terms in the model (via model.frame() and model.matrix()). When the model is fitted as lm(df$var1 ~ df$var2), the term is named df$var2, so predict() evaluates df$var2 rather than looking for a column var2 in newdata. The resulting matrix therefore has the dimensions of the training data (df). This is the same problem of having different names in the model and in newdata.

If you are interested in seeing the cause for yourself, go through the steps below:

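# df is the training data and df1 the new data
model1 <- lm(var1 ~ var2, data = df)
model2 <- lm(df$var1 ~ df$var2)
# debug the lm method directly, since predict itself only dispatches to it
debug(predict.lm)
predict(model1, newdata = df1)
predict(model2, newdata = df1)
undebug(predict.lm)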
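For a version that runs without stepping through the debugger, the comparison can be sketched on mtcars. The split into df and df1 and the use of mpg and wt in place of var1 and var2 are assumptions made for illustration.

```r
# Illustrative split of mtcars; df and df1 stand in for training and new data.
df  <- mtcars[1:22, ]
df1 <- mtcars[23:32, ]

model1 <- lm(mpg ~ wt, data = df)   # term label: "wt"
model2 <- lm(df$mpg ~ df$wt)        # term label: "df$wt"

labels(terms(model1))
labels(terms(model2))

# model1 finds "wt" in df1 and returns one prediction per row of df1.
length(predict(model1, newdata = df1))

# model2 cannot find "df$wt" in df1, evaluates it against df itself,
# and returns one value per training row (the warning is suppressed here).
length(suppressWarnings(predict(model2, newdata = df1)))
```

The term labels show directly which names predict will look for in newdata.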

Predict Warning on newdata

apl$grp is a vector, but predict requires the newdata argument to be a data frame.* This data frame must contain columns with the same names as the predictor variables used to fit the model (though it can contain other columns as well). So, the following code should work:

predict(mdl, newdata = apl)

You can use predict rather than predict.lm. mdl is an object of class lm, which causes predict to "dispatch" the predict.lm method automatically.
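The requirement that newdata carry columns named like the predictors can be checked before calling predict. The helper below, missing_predictors, is hypothetical (it is not part of base R or of any package mentioned here) and is shown on mtcars rather than on the apl data, which is not available here.

```r
# Hypothetical helper: list variables used by the model's predictors
# that newdata does not provide as columns.
missing_predictors <- function(model, newdata) {
  needed <- all.vars(delete.response(terms(model)))
  setdiff(needed, names(newdata))
}

fit_ok  <- lm(mpg ~ wt, data = mtcars)
fit_bad <- lm(mtcars$mpg ~ mtcars$wt)
nd <- data.frame(wt = 3)

missing_predictors(fit_ok, nd)   # character(0): every predictor is present
missing_predictors(fit_bad, nd)  # "mtcars": the model refers to mtcars itself
```

A non-empty result is a hint that predict will fall back to the data the model was fitted on.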


* Strictly speaking, since this is an lm model, the predict "method" that gets dispatched is predict.lm and that method requires that newdata be a data frame. predict.glm also requires a data frame. However, there are some predict methods that can take other types of arguments. For example:

  • The randomForest package has a predict method for randomForest models that can take a data frame or matrix as the newdata argument.
  • The glmnet package has a predict method for glmnet models that requires a matrix, although the argument is called newx rather than newdata in that case.

