Multiple regressions on one dataset?

James Marques · 03-27-14 02:18 PM

I'm not sure if I'm phrasing this question correctly, but here goes:

My theory behind modeling to beat the spread has always been to use the historical accuracy of Vegas lines against them. As many are probably aware, the relationship between point spread and favorite winning percentage for college football/basketball and NFL/NBA is estimated very accurately by logarithmic or power regression. However, at the low end (50-55% win percentage) and the high end (large point spreads), these regressions break down to a degree. Essentially, this means the regression is only accurate on "average" games, meaning not close games, and not games with big spreads.

However, what if you were to break your regression into, say, 3 parts? Is this valid, either statistically or analytically? Would this constitute overfitting? If I model a game using a power equation y=C*X^B for win percentages over, say, 55%, but a linear fit y=mx + b for games of 50-55% win percentage (and, of course, a third equation to model the high end)... would this make sense? I've never really considered it before, but I have a model that works pretty accurately in a lot of games and really blows it in the close ones. Just curious if anyone has any insight.

Thanks

Miz · 03-27-14 08:19 PM

I think that is a pretty good idea overall. People break down complex relationships into linear approximations all the time. I am an engineer, and we do this a lot over various portions of a curve, for example. Sounds like you are doing the same thing. The best thing to do is just test it on out-of-sample data. Good luck.
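A minimal Python sketch of that out-of-sample check, under loud assumptions: the data below is synthetic (generated from a made-up piecewise curve, not from real betting lines), the 0.55 cut point is the arbitrary one from the question, and the helper names (`fit_power`, `fit_piecewise`, `predict_piecewise`, `rmse`) are hypothetical. Here x is favorite win percentage as a fraction and y is the point spread. The piecewise fit is compared against a single straight line on held-out games only.

```python
import numpy as np

def fit_power(x, y):
    """Fit y = C * x**B via a straight-line fit in log-log space (needs x, y > 0)."""
    B, log_C = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(log_C), B

def fit_piecewise(x, y, cut):
    """Linear fit y = m*x + b below `cut`, power fit y = C*x**B at or above it."""
    low = x < cut
    m, b = np.polyfit(x[low], y[low], 1)
    C, B = fit_power(x[~low], y[~low])
    return {"cut": cut, "m": m, "b": b, "C": C, "B": B}

def predict_piecewise(p, x):
    x = np.asarray(x, dtype=float)
    return np.where(x < p["cut"], p["m"] * x + p["b"], p["C"] * x ** p["B"])

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic data only: a made-up curve plus noise, NOT real lines or results.
rng = np.random.default_rng(0)
cut = 0.55
x = rng.uniform(0.50, 0.90, 400)
y = np.where(
    x < cut,
    30.0 * (x - 0.50) + rng.normal(0.0, 0.05, x.size),            # close games
    29.8 * x ** 5.0 * np.exp(rng.normal(0.0, 0.03, x.size)),      # everything else
)

# Hold out 30% of the games; fit only on the remaining 70%.
idx = rng.permutation(x.size)
train, test = idx[:280], idx[280:]

params = fit_piecewise(x[train], y[train], cut)
line = np.polyfit(x[train], y[train], 1)  # single-line baseline for comparison

test_rmse_piecewise = rmse(y[test], predict_piecewise(params, x[test]))
test_rmse_line = rmse(y[test], np.polyval(line, x[test]))
print("held-out RMSE, piecewise:", test_rmse_piecewise)
print("held-out RMSE, one line: ", test_rmse_line)
```

If the piecewise model only looks better on the games it was fitted to and not on the held-out ones, that would be the overfitting signal the original question asks about.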

James Marques · 03-29-14 01:38 PM

Thanks!

a4u2fear · 03-29-14 02:09 PM

Regressions can have multiple inputs (X) and a single output (Y). I'm not sure if this is what you are referring to with regard to 3 parts. When you perform the regression, you can view the "t" and "p" values to find which inputs are the most relevant.
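For reference, a rough sketch of what those "t" and "p" values look like for a regression with two inputs, written with NumPy only. Everything here is an assumption for illustration: the data is synthetic, the inputs `x1` and `x2` are hypothetical, the function name `ols_t_stats` is made up, and the p-values use a normal approximation to the t distribution (reasonable for a few hundred rows, not exact).

```python
import math
import numpy as np

def ols_t_stats(X, y):
    """Ordinary least squares with an intercept; returns (beta, t, p)."""
    X = np.column_stack([np.ones(len(X)), X])       # prepend intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]                   # rows minus fitted coefficients
    sigma2 = resid @ resid / dof                    # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    t = beta / se
    # Two-sided p-values from a normal approximation (assumption: large sample).
    p = np.array([math.erfc(abs(ti) / math.sqrt(2.0)) for ti in t])
    return beta, t, p

# Synthetic example: y depends on x1 but not on x2.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 + rng.normal(size=n)

beta, t, p = ols_t_stats(np.column_stack([x1, x2]), y)
for name, b_i, t_i, p_i in zip(["intercept", "x1", "x2"], beta, t, p):
    print(f"{name:9s} coef={b_i: .3f}  t={t_i: .2f}  p~{p_i:.4f}")
```

A large |t| with a small p suggests the input carries real signal; this is a different question from splitting one input's range into pieces, which is what the thread goes on to discuss.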

James Marques · 03-29-14 03:57 PM

Originally Posted by a4u2fear

Regressions can have multiple inputs (X) and a single output (Y). I'm not sure if this is what you are referring to with regard to 3 parts. When you perform the regression, you can view the "t" and "p" values to find which inputs are the most relevant.

More like something like this:
[Image attachment: CodeCogsEqn.gif (equation image, not recoverable here)]

Except with favorite winning percentage as the independent variable, and point spread as the dependent variable. Essentially, solving all those equations for S. Follow me?

Note: those domains are just arbitrary. Just for the example.
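A small sketch of what "solving those equations for S" could look like in Python. The attached equation is not available, so the form below is an assumption: a linear piece for small spreads and a power piece above a breakpoint, with made-up coefficients (`S0`, `M`, `B0`, `K`) chosen only so the two pieces meet at the breakpoint. `win_pct` and `spread` are hypothetical names.

```python
import numpy as np

# Hypothetical coefficients, for illustration only.
S0 = 3.0                      # breakpoint in points (arbitrary)
M, B0 = 0.02, 0.50            # linear piece: P = M*S + B0   for 0 <= S < S0
K = 0.2                       # power piece:  P = C*S**K     for S >= S0
C = (M * S0 + B0) / S0 ** K   # C is set so both pieces agree at S0
P0 = M * S0 + B0              # win percentage at the breakpoint

def win_pct(s):
    """Forward direction: favorite win percentage as a function of spread."""
    s = np.asarray(s, dtype=float)
    return np.where(s < S0, M * s + B0, C * s ** K)

def spread(p):
    """Inverse direction: each piece solved for S, switching at P0."""
    p = np.asarray(p, dtype=float)
    return np.where(p < P0, (p - B0) / M, (p / C) ** (1.0 / K))

s_grid = np.linspace(0.0, 20.0, 50)
round_trip = spread(win_pct(s_grid))
print("max round-trip error:", float(np.max(np.abs(round_trip - s_grid))))
```

The inverse is only well defined if each piece is monotonic and the pieces join without a jump; otherwise one win percentage could map to two spreads, or to none.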

Miz · 03-29-14 08:13 PM

I follow you. I don't see any problem with doing this, in principle.
