First time asker so let me know if I can clear anything up.
I am trying to predict the winning probabilities of horses in horse racing, using the mlogit package. My current method of testing is as follows: I have ~50 engineered features, I select 5 at random, and I train the model on those. I then compare each horse's predicted winning probability to the market odds. If the market odds are higher than what my model predicts, I use the Kelly criterion to size a bet on that horse. For a certain set of 5 features I can get a profit of ~15% against the test data (50/50 split), but when I run the model on the training data that was used to fit the mlogit model, it gives -20% profit.
Is this expected? Why would it perform well on test data but not on training data?
Cheers
library(mlogit)

# Raceno identifies each race; the test set also needs the choice column
# ("Wincol") if actual winners are to be scored against predictions later
y <- mlogit.data(dsTest, choice = "Wincol", shape = "long", id.var = "Raceno")
x <- mlogit.data(dsTrain_after, choice = "Wincol", shape = "long", id.var = "Raceno")

mymod <- mlogit(Wincol ~ Start_Price_Standardized + Top3_All_From_3_Races_Standardized +
                  Number_Standardized + length_model_no_price_pred + length_model_price_pred +
                  place_model_price_pred - 1, data = x)
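For reference, here is a minimal sketch of the betting step I described above (Kelly staking when the model's probability beats the market's implied probability). The data frame `bets` and its columns `pred_prob` and `dec_odds` are illustrative names, not from my actual code:

# Hypothetical inputs: pred_prob = model win probability per horse,
# dec_odds = market decimal odds (total payout per unit staked)
b <- bets$dec_odds - 1                            # net odds received on a win
p <- bets$pred_prob
edge <- p * bets$dec_odds - 1                     # > 0 when model prob exceeds implied prob (1/dec_odds)
f <- ifelse(edge > 0, (b * p - (1 - p)) / b, 0)   # Kelly fraction f* = (bp - q)/b; stake 0 otherwise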
I think this only shows that you have a misalignment between what your model does and how you are evaluating its success. Your model predicts the probability of winning a race, and you haven't shared its train/test performance in those terms. Your model wasn't developed to turn a profit, yet you've shown that using it this way leads to poor, or at least erratic, performance in the profit domain.
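One way to do that check is to compare the model in probability space on both sets before any betting enters the picture. A minimal sketch, assuming `mymod`, `x`, and `y` from your code and that `y` was built with the same `choice = "Wincol"` column:

# In-sample: probability the model assigned to each race's actual winner
p_train <- fitted(mymod, outcome = TRUE)
mean(log(p_train))                        # mean train log-likelihood per race

# Out-of-sample: matrix of win probabilities, one row per test race
p_test <- predict(mymod, newdata = y)
# Extract each test race's winner from p_test and compare mean(log(...)) across
# the two sets: similar values suggest the probabilities generalize, and the
# profit gap then points at the betting/evaluation step, not the model itself.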