SBR Forum - Free Picks & Sports Handicapping

07-22-2008, 01:41 AM   #1
Rufus
SBR Rookie
Join Date: 03-28-08
Location: Las Vegas, NV
Posts: 40
data/programming/updating model question

Hi all,

I have been working on modeling MLB for most of a year and have a (primarily econometric) model that I can confidently say should be a long-term winner. Problem is, I'm not much of a programmer. I did take an Intro Programming course (in Java) in my last semester of school (last spring) and I can program pretty well in Stata (the statistical program I use), but the real problem is updating the model every day with new data. I've written code to generate a prediction (with the two teams and pitchers as inputs), but I'll need to add new game data every day.

My question is this: Have any of you faced this issue? What is your solution? Do I pretty much need to learn to write a web crawler in Perl to scrape the data off the web?

Any help/advice would be greatly appreciated.

Last edited by Rufus : 07-22-2008 at 03:51 AM. Reason: specification

07-22-2008, 02:10 AM   #2
Sinister Cat
SBR Wise Guy
Join Date: 06-03-08
Posts: 743

Yes, scrape the data. I use Tcl to do it-- a lot easier than Perl. Python or Ruby would be other choices. Perl is popular for this kind of thing too but probably more difficult for a novice programmer.
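
For anyone going the Python route, here is a minimal sketch of the parsing half of a scraper, using only the standard library. The sample page, team codes, and column names are made up for illustration; a real scraper would first download the page (for example with `urllib.request`) and would need adjusting to the actual layout of whatever site is being scraped.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being built, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return all table rows found in an HTML string as lists of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample page standing in for a downloaded box-score table.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Team</th><th>Runs</th></tr>
  <tr><td>2008-07-21</td><td>AAA</td><td>5</td></tr>
  <tr><td>2008-07-21</td><td>BBB</td><td>3</td></tr>
</table>
"""

rows = parse_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

The parser assumes a single, simple table per page; pages with nested tables or several tables would need extra bookkeeping.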

07-22-2008, 04:33 AM   #3
Ganchrow
Moderator
Join Date: 08-28-05
Location: Forest Hills, NY, Home of the Blitzkrieg Bop
Posts: 4,746

I'm personally partial to Perl, in which I probably do close to 90% of my programming. If you have experience with Java you should have absolutely no problem with Perl.

You might also want to look into hiring programming help off of rentacoder.com or a similar site.

07-22-2008, 12:02 PM   #4
Rufus
SBR Rookie
Join Date: 03-28-08
Location: Las Vegas, NV
Posts: 40

Any good book you would recommend to learn Perl?

07-22-2008, 12:05 PM   #5
Ganchrow
Moderator
Join Date: 08-28-05
Location: Forest Hills, NY, Home of the Blitzkrieg Bop
Posts: 4,746

Quote:
Originally Posted by Rufus
Any good book you would recommend to learn Perl?
The O'Reilly Learning Perl and Programming Perl books are very user-friendly.

07-22-2008, 12:06 PM   #6
durito
SBR Hall of Famer
Join Date: 07-03-06
Location: La Selva Lacandona
Posts: 5,005

I have a programmer I hired through rentacoder.com.

He's pretty cheap, but the work isn't quite what I want. If I can ever get my brain to work again, I'm going to try to learn again myself.

07-22-2008, 12:25 PM   #7
durito
SBR Hall of Famer
Join Date: 07-03-06
Location: La Selva Lacandona
Posts: 5,005

Quote:
Originally Posted by Ganchrow
The O'Reilly Learning Perl and Programming Perl books are very user-friendly.
Ordered. Finding out that Amazon can deliver to Colombia has not been good for my spending habits.

07-22-2008, 01:09 PM   #8
Data
SBR Wise Guy
Join Date: 11-27-07
Location: U.S.S. Enterprise NCC-1701-E
Posts: 986

Quote:
Originally Posted by durito
Ordered. Finding out that Amazon can deliver to Colombia has not been good for my spending habits.
This is a cheaper way:
http://proquest.safaribooksonline.com/

You read the books online (or save them on your computer as PDFs). A great time-saving benefit: you get all the sample code in downloadable files.

07-22-2008, 01:18 PM   #9
MrX
SBR MVP
Join Date: 01-10-06
Location: Kakapoopoopeepeeshire
Posts: 1,269

If you have the time, I'd definitely recommend learning enough to write your own scrapers.

Occasionally the site you're scraping from will make a slight change to its format, or some aspect of a report will be different enough from the norm to throw off your scraper, and it's sure nice to be able to make changes on the fly instead of waiting for your programmer.

As a side note, I scrape most of my data from MLB.com and they have remained blessedly consistent for a couple of years.
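
One way to notice a format change right away, rather than after bad numbers have reached the model, is to check every scraped table against the layout the scraper was written for. This is only a sketch in Python: the column names, the `LayoutChanged` exception, and `validate_rows` are all hypothetical, not taken from any particular site or library.

```python
EXPECTED_HEADER = ["Date", "Team", "Runs"]  # hypothetical column layout


class LayoutChanged(Exception):
    """Raised when scraped rows no longer match the expected layout."""


def validate_rows(rows, expected_header=EXPECTED_HEADER):
    """Check the header and the width of every row; return the data rows."""
    if not rows:
        raise LayoutChanged("no rows found")
    header, body = rows[0], rows[1:]
    if header != expected_header:
        raise LayoutChanged(f"header is {header!r}, expected {expected_header!r}")
    for i, row in enumerate(body, start=1):
        if len(row) != len(expected_header):
            raise LayoutChanged(
                f"row {i} has {len(row)} cells, expected {len(expected_header)}"
            )
    return body


# A table in the expected layout passes through untouched.
good = [["Date", "Team", "Runs"], ["2008-07-21", "AAA", "5"]]
body = validate_rows(good)

# A table whose columns were renamed or added is rejected loudly.
changed = [["Date", "Club", "R", "H"], ["2008-07-21", "AAA", "5", "9"]]
try:
    validate_rows(changed)
    layout_ok = True
except LayoutChanged as err:
    layout_ok = False
    print(f"scraper needs attention: {err}")
```

Stopping the daily update on a mismatch is a design choice; the alternative of silently skipping bad rows hides exactly the kind of change described above.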
__________________
I'm completely in favor of the separation of Church and State. My idea is that these two institutions screw us up enough on their own, so both of them together is certain death. --George Carlin

07-22-2008, 01:52 PM   #10
Rufus
SBR Rookie
Join Date: 03-28-08
Location: Las Vegas, NV
Posts: 40

Thanks everyone! I really appreciate the help.

07-22-2008, 01:55 PM   #11
Justin7
Moderator
Join Date: 07-31-06
Posts: 2,341

I paid a programmer to write a scraper in Perl. It would automatically download stats from USAToday every day.

07-22-2008, 02:56 PM   #12
Rufus
SBR Rookie
Join Date: 03-28-08
Location: Las Vegas, NV
Posts: 40

Quote:
Originally Posted by Justin7
I paid a programmer to write a scraper in Perl. It would automatically download stats from USAToday every day.
How much would that sort of thing cost?

07-22-2008, 04:38 PM   #13
rsigley
SBR Hustler
Join Date: 02-23-08
Posts: 70

I use PHP. It works pretty well, and I've never had a problem.

I use the Windows scheduler to run it once a day at 7am and load the results into a MySQL db.

Also, for MLB you can just use dougstats. He updates once a day, though I notice he's missing a couple of players (like E. Gonzalez from the Padres).
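
The "run once a day and load into a database" step could look roughly like the Python sketch below. It uses the standard library's `sqlite3` as a stand-in for MySQL so the example is self-contained; the table name, columns, and sample rows are assumptions for illustration. The key idea is that the load is keyed on the game, so running the job twice in one day does not create duplicate rows.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) stats table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS game_stats (
            game_date   TEXT    NOT NULL,
            game_number INTEGER NOT NULL,  -- 1, or 2 for a doubleheader nightcap
            team        TEXT    NOT NULL,
            runs        INTEGER NOT NULL,
            PRIMARY KEY (game_date, game_number, team)
        )
        """
    )
    return conn


def load_rows(conn, rows):
    """Insert rows, overwriting any row already stored for the same game."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO game_stats "
            "(game_date, game_number, team, runs) VALUES (?, ?, ?, ?)",
            rows,
        )


conn = open_db()
todays_rows = [
    ("2008-07-21", 1, "AAA", 5),
    ("2008-07-21", 1, "BBB", 3),
]
load_rows(conn, todays_rows)
load_rows(conn, todays_rows)  # a second run of the daily job adds nothing new

row_count = conn.execute("SELECT COUNT(*) FROM game_stats").fetchone()[0]
print(row_count)
```

`INSERT OR REPLACE` is SQLite syntax; MySQL has its own equivalents, so the statement would need changing for a MySQL setup like the one described above.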

07-22-2008, 06:02 PM   #14
Rufus
SBR Rookie
Join Date: 03-28-08
Location: Las Vegas, NV
Posts: 40

I just looked at dougstats. It seems pretty good, except I need the game-by-game stats since I don't use uniform weights. I normally get them from baseball-reference (I have a subscription so I can use the Play Index), but it's a pain in the ass to copy and paste it all.
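
To illustrate why season totals are not enough when the weights are non-uniform, here is a small Python sketch of one possible scheme: an exponentially decaying average in which recent games count more. The post does not say which weights are actually used, so the decay rule, the `half_life` parameter, and the sample numbers are all assumptions.

```python
def decay_weighted_mean(values, half_life):
    """Weighted mean of per-game values, ordered oldest to newest.

    The most recent game has weight 1; a game played `half_life` games
    earlier has weight 0.5, and so on.
    """
    if not values:
        raise ValueError("values must not be empty")
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    n = len(values)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical runs scored over the last five games, oldest first.
runs_by_game = [2, 7, 3, 4, 6]
print(decay_weighted_mean(runs_by_game, half_life=3))
print(sum(runs_by_game) / len(runs_by_game))  # uniform-weight average, for comparison
```

Computing this needs each game's value separately, which is why a per-game source is required instead of cumulative season statistics.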

07-22-2008, 06:52 PM   #15
Rufus
SBR Rookie
Join Date: 03-28-08
Location: Las Vegas, NV
Posts: 40

What database/statistical software do other people use? Being an econ major in college, I learned Stata, which works well for me once I get data into it. I can do all the regressions, statistical analysis, and data management. Anybody else have other database preferences?

07-22-2008, 07:02 PM   #16
MrX
SBR MVP
Join Date: 01-10-06
Location: Kakapoopoopeepeeshire
Posts: 1,269

Quote:
Originally Posted by Rufus
What database/statistical software do other people use? Being an econ major in college, I learned Stata, which works well for me once I get data into it. I can do all the regressions, statistical analysis, and data management. Anybody else have other database preferences?
I find that the statistical functions in Excel 2007 meet most of my needs. I've dabbled in a couple other programs for regression analysis, but not lately.

MySQL for database needs.

07-22-2008, 08:12 PM   #17
Ganchrow
Moderator
Join Date: 08-28-05
Location: Forest Hills, NY, Home of the Blitzkrieg Bop
Posts: 4,746