1. ## Question about Regression Analysis

Good afternoon.

As most of us know, regression analysis is used when you want to predict a continuous dependent variable from a number of independent variables.

In addition, the two data matrices involved in regression are usually denoted X and Y, and the purpose of regression is to build a model Y = f(X). Such a model tries to explain, or predict, the variations in the Y-variable(s) from the variations in the X-variable(s). The link between X and Y is achieved through a common set of samples for which both X- and Y-values have been collected.

Hypothetically speaking, let's say I was able to gather data, run the regression analysis, and get my "answers". What do these "answers", in the form of numbers, tell you in understandable language?

I am unsure if I am even asking the right question, but any enlightenment would be much appreciated.

Also, I am looking for an example of regression analysis used in baseball. Any direction on where I can go to see an example would also be much appreciated.

Thank you.

2. 1. You definitely need to take a course on it or get a good book if you have to answer this question.

2. A linear regression is Y = f(X) = b0 + b1*X1 + b2*X2 + ... + bn*Xn + error, where the assumptions about the behavior of the error term depend on the model. 'Running a regression' will give you estimates of the b's, where each bi is interpreted as the partial derivative of Y with respect to Xi: the change in Y you expect from a one-unit change in Xi with the other X's held fixed.

Anyone can do this, but you need training to know when to believe your estimate of b.
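To make the "estimates of the b's" concrete, here is a minimal sketch for the simplest case, one X and an intercept, fitted by ordinary least squares in plain Python. The data are made up purely for illustration, and `fit_line` is a hypothetical helper name, not something from a library.

```python
# Made-up illustrative data: one predictor x and one response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = fit_line(x, y)
print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
```

In understandable language, the slope of roughly 1.99 says that each one-unit increase in x goes with an increase of about 1.99 in y, and the intercept of roughly 0.05 is the predicted y when x is zero. Whether those numbers deserve to be believed is the separate question raised above.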

3. You are right. Thanks for steering me in the right direction.

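4. The relationship is one thing, but the R^2 (R squared) value is another. It is closely related to the correlation coefficient (with a single X it is exactly its square) and tells you how "close" the relationship is: the fraction of the variation in Y that the model accounts for, on a scale from 0 to 1. As a rough rule of thumb, values above about 0.6 or 0.7 indicate a stronger relationship, while lower values can be hard to tell apart from randomness, depending on how many data points are in the regression.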
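As a rough sketch of where that number comes from, the snippet below computes R^2 as one minus the ratio of unexplained variation to total variation, and also checks it against the squared correlation coefficient for the one-X case. The data are made up for illustration, and all variable names are my own.

```python
import math

# Made-up illustrative data: one predictor x and one response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total variation in y
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares line y = b0 + b1*x and its fitted values.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]

# R^2 = 1 - (variation left unexplained) / (total variation).
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
r2 = 1 - ss_res / syy

# Pearson correlation coefficient; with one X, r squared equals R^2.
r = sxy / math.sqrt(sxx * syy)

print(f"R^2 = {r2:.4f}, r^2 = {r * r:.4f}")
```

Here the line accounts for nearly all of the variation in y, which is unsurprising for five hand-picked points that sit almost on a straight line. Real data will rarely be this tidy.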

The next thing to pay attention to is the residuals. They should scatter more or less uniformly around the regression line. If they show a systematic pattern instead, such as a curve, then your relationship is not linear and another type of regression should be used.
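A small sketch of what a "bad" residual pattern looks like: the made-up data below follow a curve (y = x squared), and a straight line is deliberately fitted to them anyway. The variable names are my own, and the sign check at the end is only a crude stand-in for looking at a proper residual plot.

```python
# Made-up curved data: y = x**2, so a straight line is the wrong model.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [xi ** 2 for xi in x]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)

# Least-squares straight line through the curved data.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Residual = observed y minus the value predicted by the line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Crude pattern check: the sign of each residual, in order of x.
signs = "".join("+" if r > 0 else "-" for r in residuals)

print(residuals)
print(signs)
```

The residuals come out positive at both ends and negative in the middle, a U shape rather than an even scatter. That pattern is the hint that a straight line is missing curvature in the data.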