I have trained an OLS model with 10 input variables and 1 response variable. I also added a constant manually using add_constant, as provided by Statsmodels.
However, when I try to predict using this model, it raises the following error:
ValueError: shapes (1,10) and (11,) not aligned: 10 (dim 1) != 11 (dim 0)
My initial thought was that this happens because I introduced the constant. But since it is a constant after all, am I still required to include it in the prediction input, i.e., something like predictions = lr.get_prediction(sm.add_constant(out_of_sample_data))? That seems counterintuitive, but I don't know what else could explain this error. Besides, using add_constant this way turns out to raise the same error again.
I did some research on Stack Overflow, and most of the questions use predict() instead of get_prediction like I did, so those are not that helpful. Is one of them better than the other?
Check out my full code below. Any help will be appreciated, thank you!
import statsmodels.api as sm
lr = sm.OLS(y_train, sm.add_constant(x_train)).fit()
predictions = lr.get_prediction(out_of_sample)  # the line where the error was raised
frame = predictions.summary_frame(alpha=0.05)
You also need to add the constant to the out-of-sample data.