2.1. Linear Regression

For most of this book, we'll be working on applications that do linear regression, a simple but informative statistic. Suppose that you have a series of data pairs, such as the quarterly sales figures for a particular department, shown in Table 2.1.
A regression line is the straight line that passes nearest all the data points (see Figure 2.1). The formula for such a line is y = mx + b; that is, the sales (y) for a given quarter (x) rise at a quarterly rate (m) from a base at "quarter zero" (b). We have the x and y values; we'd like to determine m and b.

Figure 2.1. The sales figures from Table 2.1, plotted in a graph. The line drawn through the data points is the closest straight-line fit for the data.

The formulas for linear regression are as follows:

    m = (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²)
    b = (Σy − m·Σx) / n
    r = (n·Σxy − Σx·Σy) / √[(n·Σx² − (Σx)²) · (n·Σy² − (Σy)²)]

The value r is the correlation coefficient, a figure showing how well the regression line models the data. A value of 0 means that the x and y values have no detectable relation to each other; ±1 indicates that the regression line fits the data perfectly.

Linear regression is used frequently in business and in the physical and social sciences. When x represents time, lines derived from regressions are trends from which past and future values can be estimated. When x is volume of sales and y is costs, you can interpret b as fixed cost and m as marginal cost. Correlation coefficients, good and bad, form the quantitative heart of serious arguments about marketing preferences and social injustice.

The demands on a program for turning a series of x and y values into a slope, intercept, and correlation coefficient are not great: keep a running total of x, x², y, y², and xy; keep note of the count (n); and run the formulas when all the data has been seen.
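The running-total approach described above can be sketched as follows. This is a minimal illustration, not code from the book; the class and method names are my own, but the five sums, the count, and the closed-form formulas for m, b, and r match the description in the text.

```python
import math

class Regression:
    """Accumulate running totals for x, y, x², y², and xy, plus the count n."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = 0.0    # Σx, Σy
        self.sxx = self.syy = 0.0  # Σx², Σy²
        self.sxy = 0.0             # Σxy

    def add(self, x, y):
        """Fold one (x, y) data pair into the running totals."""
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y

    def slope(self):
        # m = (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²)
        return ((self.n * self.sxy - self.sx * self.sy)
                / (self.n * self.sxx - self.sx ** 2))

    def intercept(self):
        # b = (Σy − m·Σx) / n
        return (self.sy - self.slope() * self.sx) / self.n

    def correlation(self):
        # r = (n·Σxy − Σx·Σy) / √[(n·Σx² − (Σx)²)(n·Σy² − (Σy)²)]
        num = self.n * self.sxy - self.sx * self.sy
        den = math.sqrt((self.n * self.sxx - self.sx ** 2)
                        * (self.n * self.syy - self.sy ** 2))
        return num / den
```

Feeding it points that lie exactly on a line, say y = 2x + 1, yields a slope of 2, an intercept of 1, and a correlation of 1. Note that nothing is stored per data point: the totals alone suffice, which is what makes this formulation suitable for data that arrives one pair at a time.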