Before starting a statistical analysis, it is important to screen and clean the data and to test the specific assumptions of the statistical test you plan to conduct.
Assignment Instructions: Please use the Week 5 Assignment Template to complete this assignment. Use the Week5_DataSet.sav data set to complete the following.
Part 1:
Once you have Week5_DataSet.sav opened in SPSS, please review and screen the data for errors. You will want to assess whether there are outliers (abnormal data points) or missing values. You don’t need to use any techniques in SPSS to do this; you can simply look through the values. Next, write up a report on what you found (3-4 sentences). For example, “The participant with an ID number of XXXX had a missing value for the XXXX variable.”
How could you correct the errors that you found in the data? Please note, you do not need to actually correct the data; you only need to provide some suggestions. Provide one resource to support your explanation.
Discuss why data cleaning and screening are important. Provide one resource to support your explanation.
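Although Part 1 only asks you to look through the values by eye, it can help to see what "missing value" and "outlier" mean in concrete terms. The following is a minimal Python sketch, not part of the required SPSS workflow. The records, the variable name `height`, and the helper `screen` are made up for illustration and are not taken from Week5_DataSet.sav. The 1.5 × IQR rule used here is one common convention for flagging outliers, not the only one.

```python
from statistics import quantiles

# Hypothetical records for illustration only; NOT taken from Week5_DataSet.sav.
records = [
    {"id": 1, "height": 65.0},
    {"id": 2, "height": 70.0},
    {"id": 3, "height": None},   # missing value
    {"id": 4, "height": 68.0},
    {"id": 5, "height": 172.0},  # suspicious: possibly entered in a different unit
    {"id": 6, "height": 66.0},
    {"id": 7, "height": 71.0},
    {"id": 8, "height": 69.0},
]


def screen(records, variable):
    """Return (ids with a missing value, ids flagged by the 1.5 * IQR rule)."""
    missing = [r["id"] for r in records if r[variable] is None]
    observed = [r[variable] for r in records if r[variable] is not None]

    # Quartiles of the observed values, then the usual 1.5 * IQR fences.
    q1, _, q3 = quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    outliers = [
        r["id"]
        for r in records
        if r[variable] is not None and not (low <= r[variable] <= high)
    ]
    return missing, outliers


missing_ids, outlier_ids = screen(records, "height")
print("Missing value for participant IDs:", missing_ids)
print("Possible outliers for participant IDs:", outlier_ids)
```

With these made-up records, participant 3 is reported as missing and participant 5 is flagged as a possible outlier. A flag is only a prompt to investigate; whether a flagged value is a true error is a judgment you would explain in your report.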
Part 2:
Normality is a common assumption for many statistical tests. Please assess the normality of the “Height” variable. There are many ways to assess normality. However, for this assignment, please use these two approaches: 1) visual interpretation of a histogram and 2) interpretation of the Shapiro-Wilk test. Please include the following three SPSS outputs in your document for the “Height” variable: the histogram, the descriptive statistics table, and the Shapiro-Wilk table.
Next, for each of the three SPSS outputs, assess whether the distribution has met the assumption of normality. For example, “From the histogram, you can see that XXXX. This indicates that normality has/has not been met.”
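One common way to read a descriptive statistics table for normality is to look at skewness and kurtosis, both of which are close to zero for a normal distribution. The sketch below is a hedged illustration in Python, not a replacement for the SPSS output. It assumes the bias-adjusted sample formulas that statistical packages typically report; the function names and the sample values are hypothetical.

```python
from math import sqrt


def sample_skewness(values):
    """Bias-adjusted sample skewness (assumed to match typical package output)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    z3 = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * z3


def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    z4 = sum(((v - mean) / sd) ** 4 for v in values)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second


# Hypothetical values for illustration only; NOT taken from Week5_DataSet.sav.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print("Symmetric skewness:", sample_skewness(symmetric))
print("Symmetric excess kurtosis:", sample_excess_kurtosis(symmetric))
print("Right-skewed skewness:", sample_skewness(right_skewed))
```

A skewness near zero is consistent with a symmetric histogram, while a clearly positive value points to a long right tail. How far from zero is "too far" depends on the guideline you cite, so support your cutoff with a resource.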
Linearity is also a common assumption for many statistical tests. Linearity is assessed using a scattergram, which shows whether two variables are linearly related. Using the Week5_DataSet.sav data file, create two scattergrams. The first scattergram will be between the following two variables: “Height” and “Weight.” The second scattergram will be between “Age” and “Height.” Please include the two scattergrams in your document. Hint: use the “simple scatter” option.
Next, for each scattergram, assess whether the assumption of linearity has been met. Include an explanation for your decision. For example, “Height and Weight showed/did not show a linear relationship. This was determined due to X, Y, and Z.”
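To see why linearity is judged from a scattergram rather than from a single number, consider the following minimal Python sketch. It computes Pearson's correlation coefficient for a perfectly linear pattern and for a perfectly curved one. The function `pearson_r` and the data are hypothetical and are not taken from Week5_DataSet.sav.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up values for illustration only.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # points fall on a straight line
curved = [x ** 2 for x in xs]      # points fall on a U-shaped curve

print("r for the linear pattern:", pearson_r(xs, linear))
print("r for the curved pattern:", pearson_r(xs, curved))
```

The linear pattern gives a correlation of 1, while the U-shaped pattern gives a correlation of 0 even though the two variables are strongly related. The correlation alone would suggest "no relationship," whereas a scattergram shows the curve immediately, which is why your explanation should describe the shape of the points.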
Discuss why testing statistical assumptions is needed. Provide one resource to support your explanation.
Length: 4–5 pages (in addition to title page and references)
References: Include a minimum of 3 scholarly resources