  1. #1

    Scraping Data from Covers, Creating an NHL Database

    I'm looking for a way to quickly create a database from the NHL information on Covers from the last 5-10 years. For now I guess I am just interested in the scores of each period as well as the moneyline and U/O.

    I do have quite a bit of programming experience, mostly in Java and VB but I haven't done any web-based programming such as scraping or data mining before.

    Any Java libraries you've used to scrape data that might be useful or links to websites with easier information to parse would be greatly appreciated.
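One way the data described above (period scores, moneyline, and U/O) could be laid out is sketched below with SQLite from the Python standard library. The table and column names are illustrative assumptions, not anything Covers publishes, and the same layout would carry over to MySQL or a Java/JDBC setup.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption
# chosen for illustration, not a format used by Covers or any sportsbook.
SCHEMA = """
CREATE TABLE IF NOT EXISTS games (
    game_id     INTEGER PRIMARY KEY,
    game_date   TEXT NOT NULL,      -- ISO date string
    home_team   TEXT NOT NULL,
    away_team   TEXT NOT NULL,
    home_ml     INTEGER,            -- American moneyline on the home side
    away_ml     INTEGER,            -- American moneyline on the away side
    total_line  REAL                -- over/under line
);
CREATE TABLE IF NOT EXISTS period_scores (
    game_id     INTEGER NOT NULL REFERENCES games(game_id),
    period      INTEGER NOT NULL,   -- 1, 2, 3; higher values for overtime
    home_goals  INTEGER NOT NULL,
    away_goals  INTEGER NOT NULL,
    PRIMARY KEY (game_id, period)
);
"""


def create_db(path=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def final_score(conn, game_id):
    """Return (home_total, away_total) by summing the stored period rows."""
    return conn.execute(
        "SELECT SUM(home_goals), SUM(away_goals) "
        "FROM period_scores WHERE game_id = ?",
        (game_id,),
    ).fetchone()
```

Keeping one row per period, rather than three fixed columns, leaves room for overtime without changing the schema.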

  2. #2

    Don't go back further than the lockout, and you might want to skip a year or two after it, when teams were still adjusting to changes in strategy. There are enough NHL games in a season that sample size shouldn't be an issue.

  3. #3

  4. #4

    If you know VB.NET, you can use a WebBrowser control, WebClient, or the HttpWebRequest class.
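The fetch-then-parse pattern behind those classes looks roughly like this when sketched in Python with only the standard library. `fetch_html` and `parse_rows` are hypothetical helper names, the UTF-8 decoding is an assumption about the target page, and real pages will usually need site-specific handling on top of this.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    """Return every table row in the HTML string as a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_html(url, timeout=10):
    """Download a page; assumes the response body is UTF-8 text."""
    # Some sites reject the default urllib User-Agent, so one is set here.
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Typical use would be `parse_rows(fetch_html(some_url))`, followed by picking out the rows that hold scores or odds.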

    Personally, I wouldn't scrape from Covers. I'd use a combination of several line services and the NHL website.


  5. #5

    I do have quite a bit of programming experience, mostly in Java and VB but I haven't done any web-based programming such as scraping or data mining before.
    Data mining has nothing to do with web-based programming.
    For scraping I recommend PHP + MySQL; it's especially useful if you want some of the output to be accessible online (plus it can run on shared hosting).

    As MonkeyF0cker said, Covers isn't a good source to scrape from. I'm actually shocked to see odds for just one side, like here: http://www.covers.com/sports/odds/li...7344&sport=nhl. Each sportsbook has a different vig, so you don't know what the price on the other side was.
    Covers is too focused on US sports.
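To show why a one-sided price loses information, here is a minimal sketch of how the vig shows up when both sides are known. The function names are made up for illustration; the conversion is the standard American-odds formula.

```python
def implied_prob(ml):
    """Implied win probability of an American moneyline (e.g. -150 or +130)."""
    if ml < 0:
        return -ml / (-ml + 100)
    return 100 / (ml + 100)


def overround(ml_a, ml_b):
    """Book margin: the two implied probabilities sum to more than 1."""
    return implied_prob(ml_a) + implied_prob(ml_b) - 1


def no_vig_probs(ml_a, ml_b):
    """Rescale both implied probabilities so they sum to exactly 1."""
    pa, pb = implied_prob(ml_a), implied_prob(ml_b)
    total = pa + pb
    return pa / total, pb / total
```

With only one of the two moneylines, `overround` cannot be computed, so the fair probability is not recoverable without an assumption about the book's margin.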

  6. #6

    Quote Originally Posted by strixee
    Data mining has nothing to do with web-based programming.
    For scraping I recommend PHP + MySQL; it's especially useful if you want some of the output to be accessible online (plus it can run on shared hosting).

    As MonkeyF0cker said, Covers isn't a good source to scrape from. I'm actually shocked to see odds for just one side, like here: http://www.covers.com/sports/odds/li...7344&sport=nhl. Each sportsbook has a different vig, so you don't know what the price on the other side was.
    Covers is too focused on US sports.
    If you look historically at where they're getting their numbers, you can infer the other side with reasonable accuracy.
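A rough sketch of that inference, assuming the source book's typical overround has already been estimated from its historical two-sided prices. `assumed_overround` is exactly that, an assumption, and the helper names are hypothetical; the result is an approximation, not the price the book actually posted.

```python
def implied_prob(ml):
    """Implied win probability of an American moneyline."""
    if ml < 0:
        return -ml / (-ml + 100)
    return 100 / (ml + 100)


def prob_to_moneyline(p):
    """Convert an implied probability back to a rounded American moneyline."""
    if not 0 < p < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if p >= 0.5:
        return -round(100 * p / (1 - p))
    return round(100 * (1 - p) / p)


def infer_other_side(known_ml, assumed_overround):
    """Estimate the missing moneyline from the known side.

    Relies on implied_prob(known) + implied_prob(other) = 1 + overround,
    where the overround is an assumed, book-specific margin.
    """
    p_other = 1 + assumed_overround - implied_prob(known_ml)
    return prob_to_moneyline(p_other)
```

The estimate is only as good as the margin assumption: if the book shades its vig differently on favorites and underdogs, a single overround figure will be off for some games.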

  7. #7

    You can get regular-season ML odds from Pinnacle going back to 2009 here: http://strixee.mysbrforum.com/blog/1...-pinnacle.html


  8. #8

    Quote Originally Posted by mathdotcom
    If you look historically at where they're getting their numbers, you can infer the other side with reasonable accuracy.
    Where does Covers get its numbers from?

  9. #9

    Thanks, everyone, for the replies. I will look into WebHarvest and those VB.NET classes. And thanks for the GREAT link, strixee; that, along with the scores of each period, should definitely start me in the right direction.

  10. #10

  11. #11

    How come? You don't think NHL is worth the time? Or you don't think I'm worth the time?

  12. #12

    Quote Originally Posted by KennyPowers
    How come? You don't think NHL is worth the time? Or you don't think I'm worth the time?
    Probably doesn't even understand what you are trying to do.

  13. #13

    I've got all odds, results, win streaks going into each game, and goals for/against going into each game for the 2011 season.

  14. #14

    Quote Originally Posted by a4u2fear
    I've got all odds, results, win streaks going into each game, and goals for/against going into each game for the 2011 season.
    This includes home/away odds and over/under odds.
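For anyone building the same "going into the game" columns from raw results, a minimal sketch follows. The dict keys and the function name are assumptions for illustration; it also assumes the games are already sorted by date and that every game has a winner (no tied final scores).

```python
from collections import defaultdict


def add_pregame_features(games):
    """Attach each team's streak and goal totals as they stood before the game.

    `games` is a date-sorted list of dicts with the (assumed) keys
    'home', 'away', 'home_goals', 'away_goals'. A positive streak is a
    run of wins, a negative streak a run of losses.
    """
    streak = defaultdict(int)
    goals_for = defaultdict(int)
    goals_against = defaultdict(int)
    out = []
    for g in games:
        h, a = g["home"], g["away"]
        # Record the state BEFORE this game is played.
        out.append({
            **g,
            "home_streak": streak[h], "away_streak": streak[a],
            "home_gf": goals_for[h], "home_ga": goals_against[h],
            "away_gf": goals_for[a], "away_ga": goals_against[a],
        })
        # Then fold this game's result into the running totals.
        hg, ag = g["home_goals"], g["away_goals"]
        goals_for[h] += hg
        goals_against[h] += ag
        goals_for[a] += ag
        goals_against[a] += hg
        winner, loser = (h, a) if hg > ag else (a, h)
        streak[winner] = max(streak[winner], 0) + 1
        streak[loser] = min(streak[loser], 0) - 1
    return out
```

Recording the features before updating the totals is what keeps the current game's result from leaking into its own row.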

  15. #15

    Has anyone scraped the Jeff Sagarin USA Today pitching data? I'm looking for a little help gathering data from that site. I have no clue how to scrape it; I just know how to use the data.

  16. #16

    Quote Originally Posted by a4u2fear
    This includes home/away odds and over/under odds.
    I am new to this, so pardon me if this is an ignorant question. Is this data from some sort of manual or semi-automated daily collection during the season? Or from an automated data collection approach?

  17. #17

    Quote Originally Posted by newbottles
    I am new to this, so pardon me if this is an ignorant question. Is this data from some sort of manual or semi-automated daily collection during the season? Or from an automated data collection approach?
    I collected it manually and then manipulated it. I did, however, find out that a web query through Excel is a much easier way to do it.

  18. #18

  19. #19

    OutWit Hub is very good for beginners.
