Use the attached data to answer the following questions in Excel.
(1) Eliminate duplexes and properties with prices over \$850,000 from the data. Eliminate non-numeric variables and redundant variables from the data.
(2) Which variable correlates most strongly with price?
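One way to read "correlates most strongly" is the Pearson correlation coefficient with the largest absolute value, which is what Excel's CORREL function returns for a pair of columns. The sketch below shows that computation in plain Python. The column names and numbers are made up for illustration and are not taken from the attached data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def strongest_correlate(columns, target="price"):
    """Return the column whose |r| with the target is largest, plus all r values."""
    scores = {
        name: pearson_r(values, columns[target])
        for name, values in columns.items()
        if name != target
    }
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores

# Hypothetical toy columns (NOT the attached data), prices in thousands.
toy = {
    "price": [200, 250, 300, 400, 500],
    "sqft": [1000, 1300, 1500, 2100, 2500],
    "bedrooms": [2, 3, 3, 4, 3],
}
best, scores = strongest_correlate(toy)
print(best, scores)
```

Comparing absolute values matters because a strongly negative correlation is still a strong correlation.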
(3) Find the regression line $Y = \beta_0 + \beta_1 x$ with the variable chosen in the previous problem. [The lm function in R or the Analysis ToolPak add-in for Excel will do.]
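For reference, the least squares slope and intercept of a single-predictor line have a closed form, the same quantities Excel's SLOPE and INTERCEPT functions report. The following is a minimal sketch in plain Python; the function name `fit_line` and the sample numbers are illustrative assumptions, not part of the assignment.

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the means
    return b0, b1

# Hypothetical points lying exactly on y = 1 + 2x (NOT the attached data).
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```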
For the remaining problems, consider the following variables associated with each property.
$x_1$ = number of bedrooms
$x_2$ = number of bathrooms
$x_3$ = number of stories
$x_4$ = square footage
$x_5$ = house has pool?
(4) Construct the multivariable least squares model with predictors $x_1, x_2, x_3, x_4, x_5$. [First, convert $x_5$ to binary.]
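Written out, the model in (4) would presumably take the standard multiple regression form below, with the pool variable coded as an indicator. The error term $\varepsilon$, the design matrix $X$ (a column of ones followed by the five predictor columns), and the response vector $y$ of prices are notation assumed here, not given in the problem.

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon, \qquad x_5 = \begin{cases} 1 & \text{if the house has a pool} \\ 0 & \text{otherwise} \end{cases}$$

$$\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y$$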
(5) Use a hypothesis test to determine if the model is useful for predicting home values at a level $\alpha$. State the p-value and interpret.
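The usual test for overall model usefulness is the F-test, sketched below under the assumption that the model has the five predictors listed above and $n$ observations remain after cleaning. SSR and SSE denote the regression and residual sums of squares. In the Analysis ToolPak regression output, the corresponding p-value appears as "Significance F".

$$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0 \qquad \text{vs.} \qquad H_a: \text{at least one } \beta_j \neq 0$$

$$F = \frac{\mathrm{SSR}/5}{\mathrm{SSE}/(n - 6)}, \qquad p = P\left(F_{5,\,n-6} \geq F_{\text{obs}}\right)$$

The model is judged useful at level $\alpha$ when $p < \alpha$.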
(6) Are any variables not useful predictors of home price at significance level $\alpha = 0.05$? State the p-values of any rejected variables. What does this mean practically?
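Individual predictors are typically assessed with a two-sided t-test on each coefficient, sketched here assuming the same five-predictor model with $n$ observations. $\hat{\beta}_j$ and its standard error are the values reported in the coefficient table of the regression output.

$$H_0: \beta_j = 0 \qquad \text{vs.} \qquad H_a: \beta_j \neq 0$$

$$t_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}, \qquad p_j = 2\,P\left(T_{n-6} \geq |t_j|\right)$$

A predictor with $p_j \geq 0.05$ shows no significant linear contribution to price once the other four predictors are already in the model.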

