Call:
lm(formula = median_house_value ~ average_household_income +
    average_house_age + ocean_proximity, data = housing_clean)

Residuals:
    Min      1Q  Median      3Q     Max
-721614 -139275  -34274   92332 1458432

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
(Intercept)               -3.344e+05  9.244e+03 -36.180  < 2e-16 ***
average_household_income   6.998e+00  5.602e-02 124.919  < 2e-16 ***
average_house_age          1.066e+04  2.141e+02  49.818  < 2e-16 ***
ocean_proximityINLAND     -6.458e+04  3.782e+03 -17.076  < 2e-16 ***
ocean_proximityNEAR BAY    1.284e+05  6.397e+03  20.080  < 2e-16 ***
ocean_proximityNEAR OCEAN  3.286e+04  4.977e+03   6.603 4.13e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 221800 on 21429 degrees of freedom
Multiple R-squared:  0.5083,	Adjusted R-squared:  0.5082
F-statistic:  4431 on 5 and 21429 DF,  p-value: < 2.2e-16
Model
Final Regression Equation
The fitted linear model is:
\[ \text{HouseValue} = -334{,}429.74 + 7.00 \cdot \text{Income} + 10{,}664.29 \cdot \text{Age} - 64{,}580.69 \cdot \text{Inland} + 128{,}444.13 \cdot \text{NearBay} + 32{,}864.60 \cdot \text{NearOcean} \]
Where:
- Income = average household income (in dollars)
- Age = average house age (in years)
- Inland, NearBay, and NearOcean are binary variables:
  - 1 if the house is in that region, else 0
  - The baseline is <1H OCEAN
Example Prediction
Suppose a house has:
- Income = $90,000
- Age = 30 years
- Ocean proximity = “NEAR BAY” (i.e., NearBay = 1)
Then:
\[ \text{HouseValue} = -334{,}429.74 + (7.00 \cdot 90{,}000) + (10{,}664.29 \cdot 30) + 128{,}444.13 = 743{,}943.09 \]
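The same calculation can be reproduced in R. The sketch below hard-codes the coefficients as they appear in the fitted equation. The helper name `predict_house_value` and its argument names are illustrative assumptions, not part of the original analysis. With the fitted model object at hand, calling `predict()` on a one-row data frame would be the more usual route.

```r
# Coefficients copied from the fitted equation (rounded as shown there).
coefs <- c(
  intercept  = -334429.74,
  income     = 7.00,
  age        = 10664.29,
  inland     = -64580.69,
  near_bay   = 128444.13,
  near_ocean = 32864.60
)

# Hypothetical helper: evaluates the fitted equation for a single house.
# `region` must be one of the four ocean_proximity levels in the model;
# "<1H OCEAN" is the baseline, so it adds nothing to the intercept.
predict_house_value <- function(income, age, region = "<1H OCEAN") {
  region <- match.arg(region, c("<1H OCEAN", "INLAND", "NEAR BAY", "NEAR OCEAN"))
  unname(
    coefs["intercept"] +
      coefs["income"] * income +
      coefs["age"] * age +
      coefs["inland"] * (region == "INLAND") +
      coefs["near_bay"] * (region == "NEAR BAY") +
      coefs["near_ocean"] * (region == "NEAR OCEAN")
  )
}

# The example house: $90,000 income, 30 years old, NEAR BAY.
example_value <- predict_house_value(90000, 30, "NEAR BAY")
cat(sprintf("%.2f\n", example_value))
```

Because the income coefficient is rounded to 7.00 here, results can differ slightly from those obtained with the unrounded estimate (6.998) in the summary table.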
Interpretation
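- Every $1 increase in income → about +$7 in predicted house value, holding age and ocean proximity fixed
- Every additional year of age → +$10,664.29 in predicted value, holding income and ocean proximity fixed
- Compared to <1H OCEAN homes with the same income and age:
  - INLAND homes are worth ~$64.6K less
  - NEAR BAY homes are worth ~$128.4K more
  - NEAR OCEAN homes are worth ~$32.9K more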
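Each region coefficient is measured against the <1H OCEAN baseline, so the gap between two non-baseline regions is the difference of their coefficients. As a worked illustration, take two homes with identical income and age, one NEAR BAY and one INLAND:

\[ \text{HouseValue}_{\text{NearBay}} - \text{HouseValue}_{\text{Inland}} = 128{,}444.13 - (-64{,}580.69) = 193{,}024.82 \]

Under this model, the NEAR BAY home is predicted to be worth about $193.0K more than the otherwise identical INLAND home.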