Housing Data File
This data set contains 25 variables that are described in the data dictionary tab and include
several categorical variables, some binary variables, and numerical data. You can approach this
data from a variety of perspectives using the techniques you learned in class to answer
questions. Use the tool of your choice but make sure you know how to use it correctly. There
are 2930 observations, more than enough to provide valid and reliable statistical analysis.
You have been hired by the local real estate broker to analyze activity in the local housing
market. You must conduct three ANOVA analyses, a correlation analysis, and three regression
analyses to answer six questions you believe will help the broker provide the best guidance to
both buyers and sellers. Categorical variables include building type, neighborhood, and house
style. If you think it is important to understand the age of the house when it sold, you will need to
create a new variable (year built and year sold are two variables provided). As with any project,
you will start with EDA to get a sense of your data. For categorical variables, using a pivot table
to get counts and proportions is an excellent way to get a better understanding of those
variables. Example questions could focus on house prices in different neighborhoods, those
sold in different years, impact of lot size or house square footage on price, etc.
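The two preparation steps described above (deriving the age of the house at sale, and tabulating counts and proportions for a categorical variable) can be sketched in a few lines. The assignment itself calls for Excel, so treat this Python version only as an illustration of the logic. The column names (neighborhood, year_built, year_sold) and the four toy rows are assumptions made for the example; they are not taken from the data dictionary or the real data.

```python
from collections import Counter

# Toy rows standing in for the real data set. The column names and
# values are hypothetical, chosen only to illustrate the two steps.
rows = [
    {"neighborhood": "A", "year_built": 1960, "year_sold": 2010},
    {"neighborhood": "A", "year_built": 1975, "year_sold": 2008},
    {"neighborhood": "A", "year_built": 2003, "year_sold": 2009},
    {"neighborhood": "B", "year_built": 2005, "year_sold": 2006},
]

# New variable: age of the house at the time it sold.
for row in rows:
    row["age_at_sale"] = row["year_sold"] - row["year_built"]

# Pivot-table-style summary: count and proportion for each category.
counts = Counter(row["neighborhood"] for row in rows)
total = sum(counts.values())
summary = {name: (n, n / total) for name, n in counts.items()}

for name, (n, share) in summary.items():
    print(f"{name}: count={n}, proportion={share:.2f}")
```

In Excel the same result would come from a helper column that subtracts year built from year sold, and from a pivot table with the categorical variable in the rows area and a count in the values area.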
Perform EDA and include information about the data in the report that helps the reader get an
understanding of the data set (useful graphics should be included). Develop six research
questions (ideas include a regression model to predict price, an ANOVA to compare a numerical
variable across the groups of a categorical variable, etc.). Perform the analysis and write a
detailed description of the results and what they mean (how you would use them). Be sure to
include appropriate graphics.
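As a rough check on what the ANOVA and regression output is reporting, the core calculations can be sketched by hand. This is a minimal illustration, not a replacement for the Excel analysis: the function names are made up for the example, the numbers are toy values rather than housing data, and only the F statistic and the least-squares line are computed (no p-values).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def simple_regression(x, y):
    """Return (slope, intercept, r_squared) for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Toy values: a numerical variable split into three categorical groups.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")

# Toy values: one predictor and one response.
slope, intercept, r2 = simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
```

The F statistic is the ratio of between-group to within-group mean squares, so a large value suggests the group means differ by more than the spread inside the groups would explain. In the regression, the square root of R^2 (with the sign of the slope) is the correlation coefficient between the two variables, which ties the correlation analysis to the simple regression.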
Create a 4-5 slide presentation that would support a very brief presentation (3 minutes) of your
Below is a sample paper in the uploaded files. The paper does not need to be 24 pages, but I have included it so you can get a sense of the overall flow of the document and what you should include.
The attachments consist of the required dataset and a sample paper. Please use Excel with the data analytics package for the assignment, and please provide the Excel files as well.