1. ## Testing the statistical significance of a strategy

Hi all,

I have a data set containing over 5,000 soccer matches (1X2 odds and results). I would like to test some strategies, but I'm not quite sure how to test whether the strategies are statistically significant. I hope somebody can guide me through this simple example so that I can get the hang of it.

I have calculated returns of various simple strategies where one places one unit (1\$) bets on the outcome that fulfills certain conditions. A simple example is a strategy of always backing the home team.

Example of how I have calculated the returns:

12.8.2007 Manchester U – Reading 0-0 - 1.22-6.80-23.00 => return on 1\$ bet: -1\$ or -100%.
16.9.2007 Manchester C – Aston Villa 1-0 – 2.92-3.15-2.92 => return on 1\$ bet: 1.92\$ or 192%

Then I have calculated the historical average of these returns (over all matches).

Now, is there a way to test whether the average return is due to chance or whether it is statistically significant?

I googled and searched this forum and noticed that people suggest z-test.

Can I for example use the Central Limit Theorem to calculate the normal test statistic Z? At least one problem though is that I do not know the mean. I think the relevant benchmark should be a mean return of 0%. So in this particular case, can I do as follows:

average return ~ N(µ, s^2/n)

H0: µ=0

Z= [ 0 - (realized mean return) ] / (realized standard deviation of the mean returns) ?
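The test asked about above can be sketched in Python as follows. This is only a minimal sketch: the list of returns is made-up illustration data (not the poster's data set), and `z_test_mean` is a hypothetical helper name. The statistic is written in the conventional direction, (sample mean - benchmark) divided by the standard error s/sqrt(n), so a positive Z means returns above the benchmark.

```python
import math

def z_test_mean(returns, benchmark=0.0):
    """One-sample z-test of the mean return against a benchmark (CLT approximation)."""
    n = len(returns)
    mean = sum(returns) / n
    # Sample variance with the n - 1 denominator
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    std_err = math.sqrt(var / n)  # standard deviation of the sample mean
    z = (mean - benchmark) / std_err
    # One-sided p-value P(Z >= z) under H0, via the complementary error function
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    return mean, std_err, z, p_one_sided

# Hypothetical per-bet returns on 1-unit bets: -1 for a loss, (odds - 1) for a win
returns = [-1.0, 1.92, 0.22, -1.0, 0.85, -1.0, 1.10, -1.0, 0.50, -1.0]
mean, std_err, z, p = z_test_mean(returns)
print(f"mean={mean:.4f} se={std_err:.4f} z={z:.3f} p={p:.3f}")
```

With only a handful of bets the normal approximation is crude; it becomes more reasonable with thousands of matches, as in the data set described.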

2. You need to repost this in the Handicapper Think Tank

3. Ok, thanks.

4. Yeah, most of us use z-scores to get some idea of how good something is. I use it with the Binomial or Beta distribution to determine "how valid" a win% ATS is. I'm sure there is some way to jigger it for your ML example; maybe one of the frequentists will pop up and tell us "the right way" to do it.

5. profit in units/SQRT(samples) will get you Z score
1-NORM.S.DIST(Zscore;1) will get you the % it happens by chance

these numbers will not be accurate, but will do for starters to see where you're at
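As a rough sketch, the two spreadsheet steps above translate to Python like this. The function names and the example figures (+30 units over 400 bets) are hypothetical. The shortcut assumes each 1-unit bet has a standard deviation of about 1, which is roughly true for near-even-money bets; for long-odds 1X2 bets the per-bet standard deviation can be much larger, so the Z score would be overstated.

```python
import math

def quick_z(profit_units, n_bets):
    """Shortcut from the post: profit / sqrt(samples), assuming per-bet sd of about 1."""
    return profit_units / math.sqrt(n_bets)

def chance_probability(z):
    """Equivalent of Excel's 1 - NORM.S.DIST(z, TRUE): upper tail of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical example: +30 units over 400 one-unit bets
z = quick_z(30, 400)
print(z, chance_probability(z))
```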

6. I think the major question is can you simply apply CLT to sports odds/probability the same way you do it to a roll of the dice, for example.
For that you have to assume that the probability of each result in your data set, as implied by the odds, is in fact as true as the probability of seeing 6:6 on a dice roll being 1 in 36.
1 in 36 is a mathematical fact, as is the probability of any other roll, and results will be normally distributed.
Whether or not the probability of a team winning being 62% (or whatever the odds in your DB make it out to be) is a mathematical fact is a big question.
Because if it is not a true probability, then the bell curve will be as useless in sports betting as it is in stock market investing.

7. This is a great question, and one that I have struggled with for several years without coming to a final solution, how to definitively show that a trend\system\method is profitable enough to bet with.

Z-scores are an approach, but keep in mind that individual betting returns do not follow a recognized distribution. If you picture a curve showing the outcomes of a series of discrete bets, there will be a beta-looking curve with a mean near 1.0 and a large bulge exactly on -1.0. So there are several transformations needed to use the data in this way.

A time-series look is also an approach, looking at the performance of the system over time, using dates on the x-axis and a running balance on the y-axis. I've looked at the data this way a few times, but never felt comfortable enough to rely on the results.

The hardest part, though, and ultimately where all of my attempts to use trend\system\method approaches have failed, is that it's not enough to prove that the return from your system is statistically significant when backtested; you need to show that the result will remain statistically significant as you make bets going forward. To do that you need to exclude a time period from your analysis, say the last year, and compare the result. I've often found that even when I find trends that show a high degree of statistical significance over, say, four years, in the fifth year that advantage disappears (the performance falls back to the expected mean). Frustrating.
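The holdout comparison described above could look roughly like this. It is a sketch under assumptions: the bet records, the cutoff date, and the helper names `z_score` and `split_by_date` are all hypothetical, and the z-score here is measured against a zero-return benchmark.

```python
import math
from datetime import date

def z_score(returns):
    """Mean return divided by its standard error (benchmark of zero)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var / n)

def split_by_date(bets, cutoff):
    """bets: list of (date, return) pairs. Returns (in_sample, holdout) return lists."""
    in_sample = [r for d, r in bets if d < cutoff]
    holdout = [r for d, r in bets if d >= cutoff]
    return in_sample, holdout

# Hypothetical records: four "years" used for fitting, the last one held out
bets = [
    (date(2009, 3, 1), 0.91), (date(2009, 9, 12), -1.0),
    (date(2010, 4, 3), 0.91), (date(2011, 5, 20), 0.91),
    (date(2012, 2, 11), -1.0), (date(2012, 8, 30), 0.91),
    (date(2013, 1, 5), -1.0), (date(2013, 3, 9), 0.91),
    (date(2013, 6, 1), -1.0), (date(2013, 10, 19), -1.0),
]
in_sample, holdout = split_by_date(bets, cutoff=date(2013, 1, 1))
print("in-sample z:", z_score(in_sample))
print("holdout z:  ", z_score(holdout))
```

The system is chosen using only the in-sample bets; the holdout figure is then computed once, after the choice is made, so it is not contaminated by the search.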

But I welcome other approaches to doing this. I will give points to someone who comes up with such a solution!

8. Originally Posted by juggeri
Can I for example use the Central Limit Theorem to calculate the normal test statistic Z? At least one problem though is that I do not know the mean. I think the relevant benchmark should be a mean return of 0%.
Oh, and just to point out that the mean return will not be 0% unless you are making no-vig bets; it would be -0.045455 per unit (-4.5455%) for -110 spread bets, and you need to Monte Carlo a series of random bets to get a mean for money-line bets.

The -0.045455 is calculated as:

If you assume that each team is 50% likely to win the game at a given spread with a cost of -110, the expected mean return for any series of bets is equal to the expected return from betting both sides of the same game: you bet \$11+\$11 = \$22 and receive \$10+\$11 = \$21 back from the side that wins, a profit of -\$1 on a \$22 bet, or -0.045455 per unit (-4.5455%).
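The arithmetic above can be checked with a short Python sketch. Note that this swaps the Monte Carlo approach mentioned for money-line bets for a closed-form calculation, and it rests on a loud assumption: that the true probabilities are the bookmaker's implied probabilities normalized to sum to 1. The function names are hypothetical.

```python
def us_to_decimal(american):
    """Convert American odds to decimal odds (stake included in the payout)."""
    if american < 0:
        return 1 + 100 / abs(american)
    return 1 + american / 100

def random_bet_return(decimal_odds):
    """Expected return per unit from backing a randomly chosen outcome.

    ASSUMPTION: true probabilities = implied probabilities (1/odds),
    rescaled so they sum to 1.
    """
    implied = [1 / d for d in decimal_odds]
    overround = sum(implied)
    fair = [q / overround for q in implied]
    # Each outcome is picked with equal chance; its expected return is p * d - 1
    return sum(p * d - 1 for p, d in zip(fair, decimal_odds)) / len(decimal_odds)

# -110 on both sides of a spread
print(random_bet_return([us_to_decimal(-110), us_to_decimal(-110)]))
# 1X2 odds from the first example match in the thread
print(random_bet_return([1.22, 6.80, 23.00]))
```

Under this assumption every outcome has the same expected return, 1/overround - 1, so the benchmark for a 1X2 strategy varies from match to match with the bookmaker's margin.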

9. Interesting question that I think all punters struggle with. I'm not sure that there would ever be a satisfactory answer.

10. Thank you for your contributions.

First, I know only basics of maths and stats so my theoretical understanding is not at a high level. Yet, I've read a couple of interesting books/articles about statistics and the real world. What I've learned is that if your distribution assumptions are even slightly off, the results will most likely be bullshit.

11. Originally Posted by juggeri
What I've learned is that if your distribution assumptions are even slightly off, the results will most likely be bullshit.

12. Originally Posted by juggeri

First, I know only basics of maths and stats so my theoretical understanding is not at a high level. Yet, I've read a couple of interesting books/articles about statistics and the real world. What I've learned is that if your distribution assumptions are even slightly off, the results will most likely be bullshit.
Making assumptions about the underlying distribution is the prime reason why Frequentists are poor modelers.

13. OK, what is a frequentist?

14. z-score is a reward to variability ratio and its simple formula is return/standard deviation
an ATS bet standard deviation is close to 1, that's a fact
so how can one be slightly off here and get "bullshit" ?

if you have a reasonable sample size, z-score is a fine way to compare different models imo

15. Originally Posted by tukkk
z-score is a reward to variability ratio and its simple formula is return/standard deviation
an ATS bet standard deviation is close to 1, that's a fact
so how can one be slightly off here and get "bullshit" ?

if you have a reasonable sample size, z-score is a fine way to compare different models imo

That's not quite right, unless I misunderstood you.

I wrote a quick Monte Carlo simulation of making sequences of 11/10 (-110 or 1.91) bets of varying lengths and used Excel to plot a curve of the standard deviations. A power curve fit perfectly.

For any sequence of x bets, the expected return from those bets is -0.045455 with a standard deviation of 0.9545455 * x^-0.5

Interestingly, I think you can extend this for any sequence of bets of x length with probability of payoff p and decimal odds of d. I backtested a few different equations and they seem to look like:

Code:
```
Expected (mean) return = d * p - 1
Standard deviation of the expected return = (d * p) * x ^ (p - 1)```
so as a check: x=100, p=0.5, d=1.90909

Code:
```
Expected return = 0.5 * 1.90909 - 1 = 0.95455 - 1 = -0.045455
Standard deviation = (0.5 * 1.90909) * 100 ^ (0.5 - 1) =  0.95455 * 100 ^ -0.5 = 0.095455```
checked by simulation to be correct...
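A minimal sketch for re-running this kind of check, assuming independent 1-unit bets that each pay decimal odds d with probability p. It compares the expression from the post with the textbook binomial result, d * sqrt(p * (1 - p) / x), and with a small simulation. The two formulas coincide at p = 0.5 (where d * p = d * sqrt(p * (1 - p)) and x^(p - 1) = x^-0.5), which is the case checked above; for other values of p they differ, so the simulation is a useful tie-breaker. The second test case (d = 3.0, p = 0.3) and all function names are hypothetical.

```python
import math
import random

def sd_post(d, p, x):
    """Expression from the post: (d * p) * x ** (p - 1)."""
    return (d * p) * x ** (p - 1)

def sd_binomial(d, p, x):
    """Sd of the mean return of x independent bets: d * sqrt(p * (1 - p) / x)."""
    return d * math.sqrt(p * (1 - p) / x)

def sd_simulated(d, p, x, trials=5000, seed=1):
    """Monte Carlo estimate of the sd of the mean return over x bets."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        wins = sum(rng.random() < p for _ in range(x))
        means.append(wins * d / x - 1)  # mean return per unit staked
    avg = sum(means) / trials
    return math.sqrt(sum((m - avg) ** 2 for m in means) / (trials - 1))

for d, p in [(1.90909, 0.5), (3.0, 0.3)]:
    print(d, p, sd_post(d, p, 100), sd_binomial(d, p, 100), sd_simulated(d, p, 100))
```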

16. podonne, the sd is 0.95455
and it gets closer to 1 the better you are
here's a 1000 bet example:
1) for 1000 bets the sd is 30.19, so for 1u bets on every game, the Z-score would be -45.455/30.19 = -1.506
2) if you used 1 instead you'd get sd 31.62, and the Z-score would be -45.455/sqrt(1000) = -1.437

Are those scores far apart?

17. Originally Posted by ChuckyTheGoat
OK, what is a frequentist?
http://www.statisticalengineering.co..._bayesians.htm

18. Originally Posted by tukkk
podonne, the sd is 0.95455
and it gets closer to 1 the better you are
heres a 1000 bet example:
1)for 1000 bets the sd is 30.19 , so for 1u bets on every game, the Z-score would be -45.455/30.19= -1.506
2)if you used 1 instead youd get sd 31.62, the Z-score would -45.455/sqrt(1000)= -1.437

Are those scores far apart?
I'm not sure what you mean. The standard deviation does not gravitate towards any particular number. If you mean the p-value (derived from the Z-score) that represents the probability that the result you have is significantly different from the distribution modeled by the mean and standard deviation, then yes, the closer to 1.0 you get, the more confident you are that the result you have is better.

19. what I meant is that the standard deviation is 0.95455 when you're flipping a coin
of course the sd doesn't gravitate towards 1, but if you have an edge, it is closer to 1
the point is that it is close to 1
the debate here is whether one can use z-score to compare models

20. I have a question very similar to the original poster's, i.e. can I say something to the effect that "my results are due to random luck x% of the time"? And ultimately jump to the conclusion "I am playing with an edge"?

I used the binomial distribution function in Excel and came up with an opinion: "my results can be achieved by a breakeven player 12% of the time..." May I ask whether, at first glance, the opinion sounds completely way off? And if so, could you please advise what probably went wrong?

A summary of my results is listed below; for the detailed breakdown please refer to the attached doc:

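Summary

| Sport | Type | G | W | L | P | Hit | Unit |
|---|---|---|---|---|---|---|---|
| MLB | 1H side | 11 | 5 | 6 | 0 | 45% | -0.25 |
| MLB | Side | 24 | 11 | 12 | 1 | 46% | 0.44 |
| MLB | 1H total | 23 | 12 | 6 | 5 | 52% | 7.68 |
| MLB | Total | 32 | 19 | 9 | 4 | 59% | 11.99 |
| WNBA | Side | 4 | 2 | 2 | 0 | 50% | 0.1 |
| WNBA | Total | 3 | 3 | 0 | 0 | 100% | 3.06 |
| Total | | 97 | 52 | 35 | 10 | 54% | 23.02 |

Commission paid: 2.425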
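One way to reproduce this kind of binomial check outside Excel is sketched below. It uses the total wins and losses from the summary and excludes pushes. The two benchmark win rates are assumptions, since the thread does not say what prices the bets were placed at: 0.5 for a coin flip, and 11/21 for the break-even rate if every bet were at -110. The function name is hypothetical.

```python
from math import comb

def binomial_tail(wins, n, p):
    """P(X >= wins) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(wins, n + 1))

wins, losses = 52, 35  # totals from the summary; the 10 pushes are excluded
n = wins + losses

print(binomial_tail(wins, n, 0.5))      # benchmark: coin flip
print(binomial_tail(wins, n, 11 / 21))  # benchmark: break-even at -110 (assumed pricing)
```

The answer is sensitive to the benchmark chosen, and a win-count test ignores the units won or lost per bet, so it is only a first look for a record that mixes sides and totals at different prices.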

21. This webpage does a better job than I could ever do of describing the divide in a laconic manner, but I think it does include an over-simplification: it implies that parameters and other random variates are one and the same in the eyes of a Bayesian, which is not true.