Descriptive results, analysis, and conclusions in a Word document, using R
Excel file attached (NCBIRTH800), for Questions 1 and 2 only
Question 1
The attached file NCBIRTH800 contains a random sample of 800 subjects from the North Carolina birth registry.
1. Create a table that cross-tabulates the number of mothers by whether the baby was premature (PREMIE) and whether the mother admitted to smoking during pregnancy (SMOKE). Use the table to answer the following:
(a) Find the probability that a mother in this sample admitted to smoking.
(b) Find the probability that a mother in this sample had a premature baby.
(c) Find the probability that a mother in the sample had a premature baby given that the mother admitted to smoking.
(d) Find the probability that a mother in the sample had a premature baby given that the mother did not admit to smoking.
(e) Find the probability that a mother in the sample had a premature baby or that the mother did not admit to smoking.
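The steps in Question 1 can be sketched in R as follows. This is a minimal sketch, not a prescribed solution: the data frame `births` is a made-up stand-in for NCBIRTH800, and the 1 = yes, 0 = no coding of PREMIE and SMOKE is an assumption that should be checked against the real file (which could be loaded with a package such as readxl).

```r
# Toy stand-in for NCBIRTH800; values are made up for illustration only.
# Assumed coding (check the real file): 1 = yes, 0 = no.
births <- data.frame(
  PREMIE = c(1, 0, 0, 1, 0, 0, 0, 1, 0, 0),
  SMOKE  = c(1, 0, 0, 0, 1, 0, 0, 1, 0, 0)
)

# Cross-tabulate counts: rows are PREMIE, columns are SMOKE.
# table() drops rows with missing values by default.
counts <- table(PREMIE = births$PREMIE, SMOKE = births$SMOKE)
n <- sum(counts)

# (a) and (b): marginal probabilities from column and row totals.
p_smoke  <- sum(counts[, "1"]) / n
p_premie <- sum(counts["1", ]) / n

# (c) and (d): conditional probabilities, dividing by the column total
# of the group being conditioned on.
p_premie_given_smoke   <- counts["1", "1"] / sum(counts[, "1"])
p_premie_given_nosmoke <- counts["1", "0"] / sum(counts[, "0"])

# (e): addition rule, P(A or B) = P(A) + P(B) - P(A and B).
p_nosmoke <- sum(counts[, "0"]) / n
p_premie_or_nosmoke <- p_premie + p_nosmoke - counts["1", "0"] / n

# Show the table with row and column totals added.
print(addmargins(counts))
```

Because the probabilities are computed from the counts after missing values are dropped, the denominator `n` may be smaller than 800 on the real data if PREMIE or SMOKE has missing entries.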
Question 2
Using the data in the Excel file NCBIRTH800, answer the following questions:
a. For the continuous variables specified in the Unit 1 assignment, assume a Normal distribution and treat values within one standard deviation of the mean as typical (about 68.27% of the data). Give the lower and upper bounds of this range for each variable.
b. Continuing with part a, and assuming that values outside the central 95% of the data are considered unusual, give the lower and upper bounds of the central 95% range for each variable.
c. In the Unit 1 assignment you computed numeric and descriptive statistics for the variable TGRAMS. Repeat them for each level of SMOKE, compare the per-level values of TGRAMS with the overall values, and comment on the differences and similarities.
d. Using the results of part c, repeat the part a calculation (the typical range) for TGRAMS within each category of SMOKE and compare with the original range.
e. Using the results of part c, repeat the part b calculation (the central 95% range) for TGRAMS within each category of SMOKE and compare with the original range.
f. Optional: when a patient has bloodwork done and is given a low-to-high range for cholesterol or vitamin D, is that range based on any of the distributions just covered?
g. Generate two histograms of TGRAMS, one for each value of SMOKE, and two distribution plots, one for the part a distribution and one for the part b distribution.
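The range calculations in parts a to e can be sketched in R as follows. This is a sketch under stated assumptions: `births` is a made-up stand-in for NCBIRTH800, the helper `normal_bounds` is a hypothetical name introduced here, SMOKE is assumed to be coded 1 = yes, 0 = no, and the central 95% range is taken as the mean plus or minus `qnorm(0.975)` (about 1.96) standard deviations.

```r
# Toy stand-in for NCBIRTH800; values are made up for illustration only.
# Assumed coding for SMOKE (check the real file): 1 = yes, 0 = no.
births <- data.frame(
  TGRAMS = c(3200, 3500, 2900, 3100, 3800, 2600, 3300, 3000),
  SMOKE  = c(0, 0, 1, 0, 0, 1, 0, 1)
)

# Hypothetical helper: bounds of mean +/- k standard deviations,
# which is the range implied by an assumed Normal model.
normal_bounds <- function(x, k) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  c(lower = m - k * s, upper = m + k * s)
}

# Multiplier that leaves 2.5% in each tail of a Normal distribution.
k95 <- qnorm(0.975)

# Parts a and b: typical range and central 95% range, all mothers.
typical_all <- normal_bounds(births$TGRAMS, 1)
central_all <- normal_bounds(births$TGRAMS, k95)

# Part c: descriptive statistics of TGRAMS within each level of SMOKE.
summary_by_smoke <- tapply(births$TGRAMS, births$SMOKE, summary)

# Parts d and e: the same two ranges within each level of SMOKE.
# Each result is a matrix with rows lower/upper and one column per level.
groups <- split(births$TGRAMS, births$SMOKE)
typical_by_smoke <- sapply(groups, normal_bounds, k = 1)
central_by_smoke <- sapply(groups, normal_bounds, k = k95)

print(typical_all)
print(typical_by_smoke)
```

For part g, calling `hist()` on each element of `groups` would give one histogram per SMOKE level. The same helper applies to any other continuous variable from the Unit 1 assignment.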
Question 3
In the study of fingerprints, an important quantitative characteristic is the total ridge count for the 10 fingers of an individual. Suppose that the total ridge counts of individuals in a certain population are approximately normally distributed with a mean of 140 and a standard deviation of 50. Find the probability that an individual picked at random from this population will have a ridge count of:
(a) 200 or more
(b) Less than 100
(c) Between 100 and 200
(d) Between 200 and 250
(e) In a population of 10,000 people, how many would you expect to have a ridge count of 200 or more?
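The probabilities in Question 3 can be sketched in R with `pnorm()`, which returns the cumulative probability P(X <= x) for a Normal distribution. The variable names below are chosen for illustration. Because the Normal model is continuous, "200 or more" and "more than 200" have the same probability.

```r
# Parameters stated in the question.
mu <- 140
sigma <- 50

# (a) Upper-tail probability P(X >= 200).
p_200_or_more <- pnorm(200, mean = mu, sd = sigma, lower.tail = FALSE)

# (b) Lower-tail probability P(X < 100).
p_below_100 <- pnorm(100, mean = mu, sd = sigma)

# (c) and (d): an interval probability is the difference of two
# cumulative probabilities.
p_100_to_200 <- pnorm(200, mu, sigma) - pnorm(100, mu, sigma)
p_200_to_250 <- pnorm(250, mu, sigma) - pnorm(200, mu, sigma)

# (e) Expected count = population size times the probability from (a).
expected_200_plus <- 10000 * p_200_or_more

print(c(a = p_200_or_more, b = p_below_100,
        c = p_100_to_200, d = p_200_to_250, e = expected_200_plus))
```

The same results follow from standardizing by hand, for example z = (200 - 140) / 50 = 1.2 for part (a), and looking up the standard Normal table.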