Monday, July 27, 2009

School Distance and Home Prices

The purpose of this study is to test whether property values are negatively correlated with distance from public schools. Many determinants are used when evaluating the price of a home. This study uses stepwise linear regression to establish school distance as one of those determinants. The questions it addresses are: Does distance from a public school affect the value of a home? By how much, in dollar terms, is the value affected? Is the statistical evidence strong enough to indicate a negative correlation between house prices and school distance? These questions can be, and are, answered through this statistical study.

The results of the stepwise linear regression analysis confirm the hypothesis that home values are negatively correlated with distance from schools. The variables entered into the model because they met the significance threshold were area, number of bathrooms, four miles, and days on market. The fourth model includes all four as predictors of sales price. The R square for the fourth model was .705 and the adjusted R square was .702. This means that the fourth model explains 70.5% of the variance in sales price. The adjusted R square is only .003 lower than the R square, which suggests the fit is not being inflated by superfluous predictors. In the ANOVA table, the F statistic for the fourth model is 202.420 with a reported p-value of .000. An F statistic this high, with a p-value this low, indicates that the model as a whole is statistically significant.

The t-statistic for the independent variable four miles in the fourth model is 3.175, which indicates that its coefficient is statistically significant. The coefficients of the fourth model show that a home located three to four miles away from the appropriate high school has a $24,281.52 discount capitalized into its price.

In the collinearity statistics, the variance inflation factor (VIF) for the fourth model’s four miles variable is 1.040. This is very low and indicates that the variable is nearly uncorrelated with the other independent variables. All four statistically significant independent variables have VIFs of 2.396 or less. The rule of thumb when determining multi-collinearity between independent variables is that VIF’s less than 10 show that multi-collinearity is not an issue in the model. Any VIF of 10 or more indicates multi-collinearity. Therefore, multi-collinearity is not an issue in the model.

The variables that were excluded from the fourth model because they did not meet the level of statistical significance were number of bedrooms, name of high school, one mile, two mile, and three mile.
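
For readers who want to check the arithmetic behind these statistics, here is a minimal Python sketch of the standard textbook formulas for adjusted R square, the overall F statistic, and the VIF. It is not the statistical package output used for the study. The post does not state the sample size, so n = 344 below is purely an assumption, chosen because it is roughly consistent with the reported figures; the function names are also illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R square for a model with k predictors and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F statistic of the regression, computed from R square."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def vif_from_r2(r2_j: float) -> float:
    """VIF of predictor j, where r2_j is the R square obtained by
    regressing predictor j on all the other predictors."""
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif: float) -> float:
    """Invert the VIF: share of a predictor's variance explained
    by the other predictors."""
    return 1.0 - 1.0 / vif


if __name__ == "__main__":
    R2 = 0.705   # reported R square of the fourth model
    K = 4        # predictors in the fourth model
    N = 344      # ASSUMPTION: sample size is not given in the post

    print("adjusted R square:", round(adjusted_r2(R2, N, K), 3))
    print("F statistic:", round(f_statistic(R2, N, K), 3))

    # What the reported VIFs, and the cutoff of 10, mean in R square terms.
    for vif in (1.040, 2.396, 10.0):
        print(f"VIF {vif}: auxiliary R square = {r2_from_vif(vif):.3f}")
```

Under the assumed n = 344, the adjusted R square rounds to .702 and the F statistic comes out near 202.5, close to the reported 202.420 (the small gap is what one would expect from using the rounded R square). The last loop shows that a VIF of 10 corresponds to 90% of a predictor's variance being explained by the other predictors, while the reported VIFs of 1.040 and 2.396 correspond to roughly 4% and 58%.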

1 comment:

Will Dwinnell said...

"The rule of thumb when determining multi-collinearity between independent variables is that VIF’s less than 10 show that multi-collinearity is not an issue in the model. Any VIF of 10 or more indicates multi-collinearity."

Why 10, specifically?