  1. #1

    Back-testing (soccer) model with limited information

    Hi guys,

    (This is my first post, and I'm very glad to have found a forum/community with so much quality information and so many good responses.)

    So here's the deal:

    I want to test a model based on a match-index system that combines several factors such as form, home/away strength, etc. However, the only input data I have are end-of-season stats. So the stats I have are 'too' accurate, because they come from a bigger sample (the full season) than was available when each match was played.

    I could generate the stats I need for every game round individually, but this would be a LOT of work...

    Is there an easier way to back-test this model's accuracy?

    Cheers!

  2. #2

    You need to get stats for each game. Compute your power rankings by date... so if you were analyzing a game played on June 5th, you would only use games played on or before June 4th.
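    The date cutoff described above can be sketched roughly as follows. This is only an illustration: the results list, the team names and the choice of points per game as the "rating" are all made up for the example, and any real model would plug in its own stats.

```python
from datetime import date

# Hypothetical results: (match date, home, away, home goals, away goals)
RESULTS = [
    (date(2013, 5, 20), "A", "B", 2, 0),
    (date(2013, 5, 27), "B", "A", 1, 1),
    (date(2013, 6, 4), "A", "C", 0, 1),
    (date(2013, 6, 5), "C", "B", 3, 0),
]

def points_per_game(team, as_of, results):
    """Points per game for `team` using only matches played before `as_of`."""
    points = games = 0
    for played, home, away, hg, ag in results:
        if played >= as_of or team not in (home, away):
            continue  # skip same-day/future games and games without the team
        scored, conceded = (hg, ag) if team == home else (ag, hg)
        points += 3 if scored > conceded else 1 if scored == conceded else 0
        games += 1
    return points / games if games else None

def backtest_inputs(results):
    """Yield each match with both teams' ratings as known before kickoff."""
    for played, home, away, hg, ag in sorted(results):
        yield (played, home, away,
               points_per_game(home, played, results),
               points_per_game(away, played, results))
```

    For the June 5th game in this toy data, the ratings only see games up to June 4th, so nothing from the match being predicted (or later) leaks into the inputs.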

    Realistically though, a purely numbers-based soccer model has virtually no chance of succeeding. Changes in lineups (due to injury, discipline and coaching decisions) will move the fair price too much for a pure stats model to overcome. Believe me, I have tried.
    Points Awarded:

    Juret gave Justin7 5 SBR Point(s) for this post.

    SBR
    Bash 2012
    Attendee 8/17/2012


  3. #3

    I have used something similar for a short time as well: looking at previous games, previous games between the two teams, etc. It doesn't work, as Justin said. Too many factors can change and do change.
    600pts

    SBR POKER TOURNEY1st Place 6/12/2013

    100pts

    SBR POKER TOURNEY10th Place 6/14/2013

    60pts

    SBR POKER TOURNEY11th Place 6/17/2013

    150pts

    SBR POKER TOURNEY7th Place 6/18/2013

    300pts

    SBR POKER TOURNEY3rd Place 6/5/2013


  4. #4

    Thanks guys. I'm sure Justin is right that a purely (historical) stats-driven model isn't gonna work. In the future it will be updated with real-time information (injuries, derbies, importance of the match, etc.). But I figured these things aren't necessary for back-testing, because the sample I want to test it on is big enough to even out these 'micro' stats.
    I expect the model to be slightly -EV after testing, but I want to see whether additional (future) information could make up for that.

    Btw, I found a way to get these game-day-specific stats. Haven't figured out how to use them as ez as possible tho.

    @easyliving: I think history between 2 teams is vastly over-rated.

  5. #5

    Quote Originally Posted by PuffPaffy View Post
    Btw, I found a way to get these game-day-specific stats. Haven't figured out how to use them as ez as possible tho.
    Would you mind sharing how I can get these stats? I will look further into it as well.


  6. #6

    http://www.standbundesliga.nl/

    Here you can select stats per game round for 4 European leagues. I have no idea how to link to or use these. I almost think it's easier to make a spreadsheet to calculate these things yourself, because you can use more accessible info.
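    As a rough idea of what "calculating it yourself" could look like outside a spreadsheet, here is a small sketch that builds a cumulative points table after every round from raw results. The match list and team names are hypothetical, and points are just one example of a per-round stat.

```python
from collections import defaultdict

# Hypothetical results: (round number, home, away, home goals, away goals)
MATCHES = [
    (1, "A", "B", 2, 0),
    (1, "C", "D", 1, 1),
    (2, "B", "C", 0, 3),
    (2, "D", "A", 2, 1),
]

def tables_by_round(matches):
    """Return {round: {team: points}} with cumulative points after each round."""
    running = defaultdict(int)
    tables = {}
    for rnd, home, away, hg, ag in sorted(matches):
        if hg > ag:
            home_pts, away_pts = 3, 0
        elif hg == ag:
            home_pts, away_pts = 1, 1
        else:
            home_pts, away_pts = 0, 3
        running[home] += home_pts
        running[away] += away_pts
        tables[rnd] = dict(running)  # snapshot, overwritten until the round's last match
    return tables
```

    To rate a fixture in round N without peeking, you would read the snapshot for round N-1, e.g. `tables_by_round(MATCHES)[1]` when looking at round 2.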

  7. #7

    Quote Originally Posted by Justin7 View Post
    You need to get stats for each game. Compute your power rankings by date... so if you were analyzing a game played on June 5th, you would only use games played on or before June 4th.

    Realistically though, a purely numbers-based soccer model has virtually no chance of succeeding. Changes in lineups (due to injury, discipline and coaching decisions) will move the fair price too much for a pure stats model to overcome. Believe me, I have tried.
    This advice is accurate, apart from limiting it to soccer: it would hold true for all sports.

  8. #8

    Quote Originally Posted by chunk View Post
    This advice is accurate, apart from limiting it to soccer: it would hold true for all sports.
    Some smaller sports can be beaten while ignoring injury information. If max limits are under $1k, you can probably beat it just looking at a league summary page.



  9. #9

    Quote Originally Posted by Justin7 View Post
    Some smaller sports can be beaten while ignoring injury information. If max limits are under $1k, you can probably beat it just looking at a league summary page.
    With a simple regression model for these markets, what z-score over one or two seasons would make you confident of having an edge?
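    For reference, one common way to put a z-score on a betting record is to compare the number of wins with what the break-even win rate would produce. The sketch below assumes flat stakes, the same decimal odds on every bet and independent outcomes, which are all simplifications; the function name is made up for this example.

```python
import math

def record_z_score(wins, bets, decimal_odds):
    """Z-score of a flat-stake record against the break-even win rate.

    Assumes every bet was placed at the same decimal odds and that
    outcomes are independent; both are simplifications.
    """
    break_even = 1.0 / decimal_odds  # win rate needed for zero profit
    expected = bets * break_even
    std_dev = math.sqrt(bets * break_even * (1.0 - break_even))
    return (wins - expected) / std_dev
```

    For instance, `record_z_score(550, 1000, 2.0)` comes out at roughly 3.16. What value is "enough" is exactly the open question here; the code only shows how the number is formed.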
    175 pts

    3-QUESTION
    SBR TRIVIA WINNER 06/13/2013


  10. #10

    Quote Originally Posted by Juret View Post
    With a simple regression model for these markets, what z-score over one or two seasons would make you confident of having an edge?
    Even in the smaller markets, I'd rather see consistent line movement in the correct direction.
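    One possible way to track this is to log the odds you took and the closing odds for the same side, then count how often the line ended up shorter than your price. This is only a sketch of that bookkeeping; the function name and the (odds taken, closing odds) layout are assumptions for the example.

```python
def beat_close_rate(bets):
    """Share of bets whose closing odds were shorter than the odds taken.

    `bets` is a list of (odds_taken, closing_odds) pairs in decimal odds
    for the side that was bet.
    """
    if not bets:
        return None
    beaten = sum(1 for taken, close in bets if close < taken)
    return beaten / len(bets)
```

    A rate that stays well above one half over many bets would mean the line tends to move toward your side after you bet; ties count as not beating the close here.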


