SOPHIA PATHWAYS Introduction to Statistics milestone 5
Discipline: Statistics
Type of Paper: Question-Answer
Academic Level: Undergrad. (yrs 3-4)
Paper Format: APA
Question
The data below shows the grams of fat for a variety of snacks. Morris
wants to calculate the standard error of the sample mean for this set
of data.
Snack | Grams of Fat |
Snack 1 | 9 |
Snack 2 | 13 |
Snack 3 | 21 |
Snack 4 | 30 |
Snack 5 | 31 |
Snack 6 | 31 |
Snack 7 | 34 |
Snack 8 | 25 |
Snack 9 | 28 |
Snack 10 | 20 |
What is the standard error for this set of data?
RATIONALE
In order to get the standard error of the mean, we can use the following formula: $SE = \frac{s}{\sqrt{n}}$, where $s$ is the standard deviation and $n$ is the sample size.
Either calculate by hand or use Excel to find the standard deviation, which is 8.31. The sample size is 10 snacks.
The standard error is then: $SE = \frac{8.31}{\sqrt{10}} \approx 2.63$
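As a quick check on the arithmetic, here is a minimal Python sketch using only the standard library. The variable names are illustrative and not part of the original solution.

```python
import math
import statistics

# Grams of fat for the ten snacks in the table.
fat_grams = [9, 13, 21, 30, 31, 31, 34, 25, 28, 20]

s = statistics.stdev(fat_grams)      # sample standard deviation (n - 1 denominator)
n = len(fat_grams)                   # sample size
standard_error = s / math.sqrt(n)    # SE = s / sqrt(n)

print(round(s, 2), round(standard_error, 2))
```

Note that `statistics.stdev` computes the sample standard deviation, which is the one the rationale uses.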
A researcher has a table of data with 5 column variables and 4 row variables.
The value for the degrees of freedom in order to calculate the $\chi^2$ statistic is __________.
RATIONALE
Recall that to get the degrees of freedom we use df = (r-1)(c-1), where r and c are the number of rows and columns. This means df = (4-1)(5-1) = 3*4 = 12.
Rachel measures the lengths of a random sample of 100 screws. The mean length was 2.6 inches, with a standard deviation of 1.0 inches.
Using the alternative hypothesis (µ < µ0), Rachel found that a z-test statistic was equal to -1.25.
What is the p-value of the test statistic? Answer choices are rounded to the thousandths place.
RATIONALE
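If we go to the standard normal chart, find the row for z = -1.2, and then move to the column for 0.05, the value is 0.1056, or 0.106 rounded to the thousandths place. Because the alternative hypothesis is left-tailed, this lower-tail area is the p-value.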
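The table lookup can also be reproduced in code. The sketch below assumes Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = -1.25

# For a left-tailed test (mu < mu0), the p-value is the area to the left of z
# under the standard normal curve.
p_value = NormalDist().cdf(z)

print(round(p_value, 3))
```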
What do the symbols $\hat{p}$, $\bar{x}$, and $s$ represent?
RATIONALE
Recall that $\hat{p}$ is the sample proportion, $\bar{x}$ is the sample mean, and $s$ is the sample standard deviation. Since all of these come from samples, they are statistics.
Which of the following is an example of a parameter?
RATIONALE
Recall a parameter comes from the entire set of interest, the
population.
Since they are looking at all members of a community here, their
availability to volunteer would be an example of a parameter.
Amanda is the owner of a small chain of dental offices. She sent out
the yearly satisfaction survey to 600 randomly selected patients and
received 544 surveys back. When looking through the results, she noticed
that the downtown dental office staff had 84% of clients reporting
satisfaction with services, while the uptown dental office staff had 76%
of clients reporting satisfaction with services.
Which of the following sets shows Amanda's null hypothesis and alternative hypothesis?
RATIONALE
Recall that the null hypothesis is always a statement of no difference.
So the null hypothesis (H₀) is that the proportion of patients satisfied at the uptown clinic = the proportion satisfied at the downtown clinic. This would indicate no difference between the two groups.
The alternative hypothesis (Hₐ) is that there is a difference in the proportion of patients satisfied between the two groups.
For a left-tailed test, the critical value of z so that a
hypothesis test would reject the null hypothesis at 10% significance
level would be __________. Answer choices are rounded to the hundredths
place.
RATIONALE
Recall that in a left-tailed test we reject H₀ when the test statistic is smaller than the critical value. If we go to the standard normal chart and use 10% or 0.10, we search for the table value closest to 0.10.
0.1003 corresponds with a z-score of -1.28.
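As a cross-check on the table lookup, the critical value can be computed with the inverse normal CDF. This is a minimal sketch assuming Python 3.8+; the names `alpha` and `z_critical` are illustrative.

```python
from statistics import NormalDist

alpha = 0.10  # significance level for the left-tailed test

# The left-tail critical value is the z-score with area alpha to its left.
z_critical = NormalDist().inv_cdf(alpha)

print(round(z_critical, 2))
```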
Sukie interviewed 125 employees at her company and discovered that 21 of them planned to take an extended vacation next year.
What is the standard error of the sample proportion? Answer choices are rounded to the thousandths place.
RATIONALE
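We can note the SE of the proportion is $SE = \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
Here $\hat{p} = \frac{21}{125} = 0.168$, which means $\hat{q} = 1 - \hat{p} = 0.832$.
So if we take all this information we can note SE = $\sqrt{\frac{0.168(0.832)}{125}} \approx 0.033$.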
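The same calculation as a short Python sketch (standard library only; variable names are illustrative):

```python
import math

successes = 21   # employees planning an extended vacation
n = 125          # employees interviewed

p_hat = successes / n                              # sample proportion
q_hat = 1 - p_hat                                  # complement of the sample proportion
se_proportion = math.sqrt(p_hat * q_hat / n)       # SE = sqrt(p_hat * q_hat / n)

print(round(p_hat, 3), round(se_proportion, 3))
```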
One condition for performing a hypothesis test is that the observations are independent. Mary is going to take a sample from a population of 500 students.
How many students will Mary have to sample without replacement to treat the observations as independent?
RATIONALE
In general, the sample should be about 10% of the population or less to still assume independence.
So size = 0.1*N = 0.1(500) = 50
A sample of 50 or fewer students would be sufficient.
Select the statement that correctly describes a Type I error.
RATIONALE
Recall that a Type I error is when we incorrectly reject a true null hypothesis. So we would reject H₀ using sample evidence, when in fact H₀ was true.
Edwin conducted a survey to find the percentage of people in an area who smoked regularly. He defined the label “smoking regularly” for males smoking 30 or more cigarettes in a day and for females smoking 20 or more. Out of 635 persons who took part in the survey, 71 are labeled as people who smoke regularly.
What is the 90% confidence interval for this population proportion? Answer choices are rounded to the hundredths place.
RATIONALE
In order to get the CI we want to use the following form: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
First, we must determine the corresponding z*-score for a 90% confidence interval. Remember, this means that we have 10% for the tails, meaning 5%, or 0.05, for each tail. Using a z-table, we can find the upper z-score by finding (1 - 0.05) or 0.95 in the table.
This corresponding z-score is 1.645.
We also know $\hat{p} = \frac{71}{635} \approx 0.11$ and $n = 635$.
So putting it together: $0.11 \pm 1.645 \sqrt{\frac{0.11(0.89)}{635}} \approx 0.11 \pm 0.02$
The lower bound is:
0.11-0.02 = 0.09
The upper bound is:
0.11 + 0.02 = 0.13
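The interval can be checked with a short Python sketch. It assumes Python 3.8+ for `statistics.NormalDist` and computes z* from the inverse normal CDF rather than reading it from a table; the variable names are illustrative.

```python
import math
from statistics import NormalDist

smokers = 71
n = 635
confidence = 0.90

p_hat = smokers / n
# Upper-tail z* leaves (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)

lower = p_hat - margin
upper = p_hat + margin
print(round(lower, 2), round(upper, 2))
```

Carrying full precision for the sample proportion gives the same bounds after rounding to the hundredths place.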
Carl recorded the number of customers who visited his new store during the week:
Day | Customers |
Monday | 17 |
Tuesday | 13 |
Wednesday | 14 |
Thursday | 16 |
He expected to have 15 customers each day. To answer whether the
number of customers follows a uniform distribution, a chi-square test
for goodness of fit should be performed. (alpha = 0.10)
What is the chi-squared test statistic? Answers are rounded to the nearest hundredth.
RATIONALE
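Using the chi-square formula we can note the test statistic is $\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(17-15)^2}{15} + \frac{(13-15)^2}{15} + \frac{(14-15)^2}{15} + \frac{(16-15)^2}{15} = \frac{10}{15} \approx 0.67$.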
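A minimal Python sketch of the goodness-of-fit statistic, summing (observed - expected)² / expected over the four days. The variable names are illustrative.

```python
# Observed customers, Monday through Thursday.
observed = [17, 13, 14, 16]
expected_per_day = 15  # Carl's expected count under a uniform distribution

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - expected_per_day) ** 2 / expected_per_day for o in observed)

print(round(chi_square, 2))
```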
A ball is drawn from a bag of 26 red and 12 yellow balls. The process is repeated 50 times, replacing each ball that is drawn.
Which of the following statements about the distributions of counts and proportions is TRUE?
RATIONALE
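Recall that if we count the successes in a fixed number of independent trials that each have two outcomes (success and failure), the counts follow a binomial distribution. For this to be true here, success would be drawing a red ball and failure would be drawing a yellow ball. Because each ball is replaced, the 50 draws are independent with the same probability of success, 26/38.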
Edwin conducted a survey to find the percentage of people in an area who smoked regularly. He defined the label “smoking regularly” for males smoking 30 or more cigarettes in a day and for females smoking 20 or more. Out of 635 people who took part in the survey, 71 are labeled as people who smoke regularly.
Edwin wishes to construct a significance test for his data. He finds that the proportion of chain smokers nationally is 14.1%.
What is the z-statistic for this data? Answer choices are rounded to the hundredths place.
RATIONALE
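To make things a little easier, let's first note the denominator: $\sqrt{\frac{p_0(1-p_0)}{n}} = \sqrt{\frac{0.141(0.859)}{635}} \approx 0.01381$
We can now note that $\hat{p} = \frac{71}{635} \approx 0.1118$
Finally, subbing all in we find $z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.1118 - 0.141}{0.01381} \approx -2.11$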
*note that if you round, the values can be slightly different.
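To see the effect of rounding, the z-statistic can be computed at full precision. This is a minimal sketch with illustrative variable names; it uses the one-proportion z formula with the national proportion in the standard error.

```python
import math

smokers = 71
n = 635
p0 = 0.141  # national proportion stated in the question

p_hat = smokers / n
standard_error = math.sqrt(p0 * (1 - p0) / n)  # uses p0, not p_hat, under the null
z = (p_hat - p0) / standard_error

print(round(z, 2))
```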
Joe is measuring the widths of doors he bought to install in an
apartment complex. He measured 72 doors and found a mean width of
36.1 inches with a standard deviation of 0.3 inches. To test if the
doors differ significantly from the standard industry width of 36
inches, he computes a z-statistic.
What is the value of Joe's z-test statistic?
RATIONALE
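If we first note the denominator of the z-statistic: $\frac{s}{\sqrt{n}} = \frac{0.3}{\sqrt{72}} \approx 0.03536$
Then, getting the z-score we can note it is $z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{36.1 - 36}{0.03536} \approx 2.83$
This tells us that 36.1 is 2.83 standard errors above the value of 36.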
Note that when you round some values you may get slightly different
results, but the results should be relatively close to this final
calculated value.
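The door-width calculation as a short Python sketch (variable names are illustrative):

```python
import math

sample_mean = 36.1   # inches
mu_0 = 36.0          # standard industry width, inches
s = 0.3              # sample standard deviation, inches
n = 72               # doors measured

standard_error = s / math.sqrt(n)
z = (sample_mean - mu_0) / standard_error

print(round(z, 2))
```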
Adam tabulated the values for the average speeds on each day of his road trip as 60.5, 63.2, 54.7, 51.6, 72.3, 70.7, 67.2, and 65.4 mph. He wishes to construct a 98% confidence interval.
What value of t* should Adam use to construct the confidence interval? Answer choices are rounded to the thousandths place.
RATIONALE
Recall that we have n = 8, so df = n-1 = 7. A 98% confidence interval has 2% for the tails, so 1% for each tail. If we go to the row of the t-table where df = 7 and then the column for a tail probability of 0.01, this gives us a value of 2.998.
We can also use the last row of the table to find the column for the corresponding confidence level.
What value of z* should be used to construct a 98% confidence interval of a population mean? Answer choices are rounded to the thousandths place.
RATIONALE
Using the z-chart to construct a 98% CI, this means that there is 1% for each tail. The lower tail would be at 0.01 and the upper tail would be at (1 - 0.01) or 0.99. The closest to 0.99 on the z-table is between 0.9901 and 0.9898.
0.9898 corresponds with a z-score of 2.32.
0.9901 corresponds with a z-score of 2.33.
Taking the average of these two scores, we get a z-score of 2.325.
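For comparison, the inverse normal CDF gives z* directly. This sketch assumes Python 3.8+; the computed value is about 2.326, which differs in the last digit from the 2.325 obtained by averaging the two table entries.

```python
from statistics import NormalDist

confidence = 0.98

# Each tail holds (1 - confidence) / 2 = 0.01, so z* sits at the 0.99 quantile.
upper_area = 1 - (1 - confidence) / 2
z_star = NormalDist().inv_cdf(upper_area)

print(round(z_star, 3))
```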
Joe hypothesizes that the students of an elite school will score higher than the general population. He records a sample mean equal to 568 and states the hypothesis as μ = 568 vs μ > 568.
What type of test should Joe do?
RATIONALE
Since Hₐ uses a greater-than sign, this indicates he wants to run a one-tailed test where the rejection region is the upper or right tail. This can be called a right-tailed test.
The data below shows the grams of fat in a series of popular snacks.
Snack | Grams of Fat |
Snack 1 | 9 |
Snack 2 | 13 |
Snack 3 | 21 |
Snack 4 | 30 |
Snack 5 | 31 |
Snack 6 | 31 |
Snack 7 | 34 |
Snack 8 | 25 |
Snack 9 | 28 |
Snack 10 | 20 |
If Morris wanted to construct a one-sample t-statistic, what would the value for the degrees of freedom be?
RATIONALE
The degrees of freedom for a 1 sample t-test are df=n-1 where n is the sample size. In this case, n=10, then df = n-1 = 10-1 = 9.
A table represents the number of students who passed or failed an aptitude test at two different campuses.
 | East Campus | West Campus |
Passed | 48 | 37 |
Failed | 52 | 63 |
In order to determine if there is a significant difference
between campuses and pass rate, the chi-square test for association and
independence should be performed.
What is the expected frequency of West Campus and failed?
RATIONALE
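In order to get the expected counts we can note the formula is: $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$
The Failed row total is 52 + 63 = 115, the West Campus column total is 37 + 63 = 100, and the grand total is 200, so $E = \frac{115 \times 100}{200} = 57.5$.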
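A minimal Python sketch of the expected-count calculation for the West Campus / Failed cell, computed as row total times column total divided by the grand total. The dictionary layout and names are illustrative.

```python
# Observed counts from the table (rows: outcome, columns: campus).
observed = {
    "Passed": {"East": 48, "West": 37},
    "Failed": {"East": 52, "West": 63},
}

row_total = sum(observed["Failed"].values())                        # Failed row
column_total = sum(row["West"] for row in observed.values())        # West column
grand_total = sum(sum(row.values()) for row in observed.values())   # all students

expected_west_failed = row_total * column_total / grand_total
print(expected_west_failed)
```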
Which of the statements about one-way ANOVA is FALSE?
RATIONALE
When we do the one-way ANOVA, we are trying to examine whether the means of multiple groups are equal or not. We aren't testing independence of the variables; that is what we do with a chi-square test for independence.
Mike tabulated the following values for heights in inches of seven of
his friends: 65, 71, 74, 61, 66, 70, and 72. The sample standard
deviation is 4.577.
Select the 95% confidence interval for Mike's set of data.
RATIONALE
In order to get the 95% CI, we first need to find the critical t-score. Using a t-table, we need to find (n-1) degrees of freedom, or (7-1) = 6 df and the corresponding CI.
Using the 95% CI in the bottom row and 6 df on the far left column, we get a t-critical score of 2.447.
We also need to calculate the mean: $\bar{x} = \frac{65+71+74+61+66+70+72}{7} = \frac{479}{7} \approx 68.43$
So we use the formula to find the confidence interval: $\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 68.43 \pm 2.447 \cdot \frac{4.577}{\sqrt{7}} \approx 68.43 \pm 4.23$
The lower bound is:
68.43 - 4.23 = 64.20
The upper bound is:
68.43 + 4.23 = 72.66
*note that when rounding you can get values that might be slightly different, but the values should be very close to what is calculated here.
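The interval can be checked with the sketch below. Python's standard library has no t-distribution, so the critical value 2.447 is hard-coded from the t-table as in the rationale; that constant and the variable names are assumptions of this example.

```python
import math
import statistics

heights = [65, 71, 74, 61, 66, 70, 72]  # inches

n = len(heights)
mean = statistics.mean(heights)
s = statistics.stdev(heights)   # sample standard deviation, about 4.577
t_star = 2.447                  # t-table value for 95% confidence, df = 6 (assumed from the text)

margin = t_star * s / math.sqrt(n)
lower = mean - margin
upper = mean + margin

print(round(lower, 2), round(upper, 2))
```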
Adam tabulated the values for the average speeds on each day of his road trip as 60.5, 63.2, 54.7, 51.6, 72.3, 70.7, 67.2, and 65.4 mph. The sample standard deviation is 7.309.
Adam reads that the average speed that cars drive on the highway is 65 mph.
The t-test statistic for a two-sided test would be __________. Answer choices are rounded to the hundredths place.
RATIONALE
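Using the information given, we need to find the sample mean: $\bar{x} = \frac{60.5+63.2+54.7+51.6+72.3+70.7+67.2+65.4}{8} = \frac{505.6}{8} = 63.2$
We now know the following information: $\bar{x} = 63.2$, $\mu_0 = 65$, $s = 7.309$, $n = 8$
Let's plug in the values into the formula: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{63.2 - 65}{7.309/\sqrt{8}} \approx -0.70$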
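A short Python sketch of the one-sample t-statistic for Adam's data (standard library only; variable names are illustrative):

```python
import math
import statistics

speeds = [60.5, 63.2, 54.7, 51.6, 72.3, 70.7, 67.2, 65.4]  # mph
mu_0 = 65  # hypothesized highway average, mph

n = len(speeds)
mean = statistics.mean(speeds)
s = statistics.stdev(speeds)    # sample standard deviation, about 7.309

t_statistic = (mean - mu_0) / (s / math.sqrt(n))

print(round(mean, 1), round(t_statistic, 2))
```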
The table below shows the results of a customer satisfaction survey at a particular restaurant broken down by males and females.
 | Male | Female |
Extremely Satisfied | 25 | 7 |
Satisfied | 21 | 13 |
Neutral | 13 | 16 |
Dissatisfied | 9 | 14 |
Extremely Dissatisfied | 2 | 5 |
Assuming all 5 choices are equally likely, select the observed and expected frequency for male customers that are dissatisfied.
RATIONALE
If we simply go to the chart then we can directly see the observed frequency for male customers who are dissatisfied is 9.
To find the expected frequency, we need to find the number of occurrences if the null hypothesis is true, which in this case is that the five options are equally likely, that is, evenly distributed.
First, add up all the options in the Male column: 25 + 21 + 13 + 9 + 2 = 70
If each of these five options were evenly distributed among the 70 males, we would need to divide the total evenly between the five options: 70 / 5 = 14
This means we would expect 14 men in each category: extremely satisfied, satisfied, neutral, dissatisfied, and extremely dissatisfied. So the observed frequency is 9 and the expected frequency is 14.
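The observed and expected frequencies can be read off with a minimal Python sketch. The dictionary and variable names are illustrative.

```python
# Male column of the satisfaction table.
male_counts = {
    "Extremely Satisfied": 25,
    "Satisfied": 21,
    "Neutral": 13,
    "Dissatisfied": 9,
    "Extremely Dissatisfied": 2,
}

observed_dissatisfied = male_counts["Dissatisfied"]

# Under the assumption that all five choices are equally likely,
# each category is expected to hold an equal share of the male total.
total_males = sum(male_counts.values())
expected_per_category = total_males / len(male_counts)

print(observed_dissatisfied, expected_per_category)
```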