STAT 1100 - Midterm Exam 1
Sample Problems
1. The Cable News Network (CNN) is interested in finding out how many U.S. citizens are planning on voting in the upcoming election. They post a poll on their website for 2 weeks. At the end of 2 weeks, the results show that, out of the 9,423 people who responded to the poll, 8,112 are planning on voting in the midterm election.
a. What is the population of interest?
i. CNN viewers
ii. CNN employees
iii. U.S. citizens
iv. Midterm election candidates
v. 9,423 people who responded to the poll
b. What is the sample?
i. CNN viewers
ii. CNN employees
iii. U.S. citizens
iv. Election candidates
v. 9,423 people who responded to the poll
c. Is the variable quantitative or categorical?
i. Quantitative variable
ii. Categorical variable
d. What is the best type of graph to use to display this variable?
i. Scatterplot
ii. Boxplot
iii. Bar graph
iv. Side-by-side bar graph
v. Side-by-side boxplot
e. What type of study design was used to collect the data?
i. Retrospective observational study
ii. Prospective observational study
iii. Experiment
iv. Sampling survey
f. How would you describe the statistic that was collected from the poll? Calculate it. Round your answer to 2 decimal places.
g. Identify any potential bias with this survey.
i. Complicated questions
ii. Vague concepts
iii. Leading questions
iv. Central tendency bias
v. Error prone response options
vi. Voluntary response bias
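For part f, the statistic is a sample proportion. The arithmetic can be sketched in Python using only the two counts given in the problem; the variable names are illustrative and not part of the exam.

```python
# Counts taken directly from the problem statement.
respondents = 9423
planning_to_vote = 8112

# Sample proportion (p-hat): share of poll respondents who plan to vote.
p_hat = planning_to_vote / respondents

# The problem asks for 2 decimal places.
p_hat_rounded = round(p_hat, 2)
print(p_hat_rounded)
```

Because the respondents chose to take part, this proportion describes the sample only and may not estimate the population proportion well.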
2. The Internal Revenue Service (IRS) decides that it is going to randomly sample tax returns this year to choose taxpayers to audit. It picks a random, representative sample from the following groups: single person tax filers, married filers, married with children filers, small business owners, and large business owners. What type of sampling method is used?
a. Simple random sample.
b. Systematic sample.
c. Cluster sample.
d. Stratified random sample.
e. Convenience sample.
f. Voluntary sample.
3. At the end of the STAT 1100 course, the professor asks students to answer the following question on their final, “What significant changes have you seen in your analytical skills after taking this course?” What type of bias is present?
a. Sampling bias.
b. Nonresponse bias.
c. Response bias.
4. The Acme Drug company needs to test their new diet drug, Meltaway, in order for the drug to gain Food and Drug Administration (FDA) approval. They recruit 100 volunteers for their study. First, the volunteers are split by sex, then each sex is randomly assigned to either the treatment group or the control group which receives a placebo sugar pill. The subjects’ diets are otherwise monitored and identical in all 4 groups for 1 month. At the end of one month, the weight losses or gains are recorded for each group.
a. Out of the following possible experiment characteristics, which ones are present in this study?
i. Random sampling
ii. Blocking
iii. Randomization to treatments
iv. Control Group
v. Double-Blind
vi. Replication
b. In the study, what variable would be the explanatory variable and what is the variable that would be measured as the response variable?
5. The Pittsburgh Post-Gazette reports: “The male vs female gap has widened when it comes to hygiene, according to the latest stakeout by the ‘hand washing police.’ One-third of men didn’t bother to wash after using the bathroom, compared with 12 percent of women, said the researchers who spy on people in public restrooms... Two years ago, the last time the survey was done, only one-quarter of men didn’t wash, compared with 10 percent of women... The latest study was based on observations last month of more than 6,000 people in four cities.”
a. Which two of these are the explanatory variables?
i. Sex
ii. Washing hands
iii. City
iv. Year
v. Month
vi. Number of people
b. Which one is the response variable?
i. Sex
ii. Washing hands
iii. City
iv. Year
v. Month
vi. Number of people
c. For the purpose of this study, researchers most likely positioned themselves in the designated restrooms and observed a group of individuals who used the facilities over a certain period of time. What type of sampling is this?
i. Simple random sample.
ii. Systematic sample.
iii. Cluster sample.
iv. Stratified random sample.
v. Convenience sample.
vi. Voluntary sample.
6. Spur lengths, in centimeters, of native orchid species in Madagascar averaged 15.9 for one group of 10 orchids, 4.5 for a second group of 10, 13.5 for a third group of 10, and 15.6 for a fourth group of 10. Find the average spur length of all 40 orchids.
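One way to check the combined average is to weight each group mean by its group size. The sketch below uses the four means and sizes from the problem; the variable names are illustrative.

```python
# Group means (cm) and group sizes from the problem statement.
group_means = [15.9, 4.5, 13.5, 15.6]
group_sizes = [10, 10, 10, 10]

# Overall mean = (sum of all 40 spur lengths) / 40.
# Each group's total is its mean times its size.
total_length = sum(m * n for m, n in zip(group_means, group_sizes))
overall_mean = total_length / sum(group_sizes)
print(round(overall_mean, 3))
```

Because all four groups have the same size, the weighted mean equals the simple average of the four group means. With unequal group sizes the two would differ.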
7. Molar lengths (in millimeters) for 13 specimens of early Homo sapiens have five number summary values 8.5, 8.9, 9.1, 9.65, 10.7. A histogram would show a gap separating the highest value, 10.7, from the rest of the data. Use the IQR Rule to determine if 10.7 should technically be considered a high outlier.
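The IQR Rule flags a value as a high outlier when it exceeds Q3 + 1.5(IQR). A minimal sketch, assuming the five values are listed in the usual order (minimum, Q1, median, Q3, maximum):

```python
# Five number summary from the problem, in the usual order.
minimum, q1, median, q3, maximum = 8.5, 8.9, 9.1, 9.65, 10.7

# Interquartile range and the upper cutoff used by the IQR Rule.
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr

# A value is a high outlier only if it lies above the upper fence.
is_high_outlier = maximum > upper_fence
print(round(iqr, 2), round(upper_fence, 3), is_high_outlier)
```

The comparison is between 10.7 and the upper fence. A visible gap in the histogram does not by itself settle the question.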
8. The average height of a man in the United States is 70 inches, with a standard deviation of 2.65 inches, while the average height of a woman in the United States is 64 inches, with a standard deviation of 2.5 inches.
a. Calculate the z-score of each to find out who is taller, relative to their group: a man who is 76 inches tall, or a woman who is 72 inches tall?
b. Is either of these people considered to be an outlier?
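A standardized score is (value - mean) / standard deviation. The sketch below applies that formula to both people; the helper name `z_score` is illustrative. It does not decide part b, since the outlier cutoff depends on the rule used in the course.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

# Means and standard deviations (inches) from the problem statement.
z_man = z_score(76, mean=70, sd=2.65)
z_woman = z_score(72, mean=64, sd=2.5)

print(round(z_man, 2), round(z_woman, 2))
```

The person with the larger standardized score is taller relative to their own group, even though the man is taller in absolute terms.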
9. Here are side-by-side boxplots for ages of men-seeking-women and women-seeking-men advertised on Pittsburgh’s Craig’s List.
a. For which of these would 9 be a reasonable guess? (You can circle anywhere from none to all four)
i. minimum value for men
ii. minimum value for women
iii. IQR for men
iv. IQR for women
b. The shapes are best described as:
i. both skewed left
ii. men skewed left, and women skewed right
iii. men skewed right, and women skewed left
iv. both skewed right
c. Apparently, Q3 + 1.5(IQR) for the women is approximately (estimate to the nearest 5 years): ______________
d. Suppose we’d like to use the data to draw conclusions about the mean age of men-seeking-women and women-seeking-men singles who are seeking a partner. One problem is that people tend to lie about their age. This suggests we may have
i. A non-representative sample
ii. An inaccurate assessment of the sampled values
10. The paired data below consists of heights and weights of six randomly sampled students:
Height in inches (x): 68 69 69 71 71 74
Weight in lbs. (y): 130 190 172 170 195 235
a. Make a scatterplot of the data.
b. Calculate the correlation coefficient. Round to 4 decimal places.
c. Find the line for regressing weight on height. Round to 1 decimal place.
d. Predict the weight for a person of height 72 inches.
e. Find r^2 and state what it means in this problem. Round to 3 decimal places.
f. Explain the meaning of the slope in this context.
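Parts b through e can be checked with the standard sums-of-squares formulas. The sketch below is one possible way to do the arithmetic, taking height as the explanatory variable and weight as the response; the variable names are illustrative and no plotting is included.

```python
from math import sqrt

# Paired data from the problem statement.
heights = [68, 69, 69, 71, 71, 74]        # inches (explanatory)
weights = [130, 190, 172, 170, 195, 235]  # lbs (response)

def mean(values):
    return sum(values) / len(values)

x_bar, y_bar = mean(heights), mean(weights)

# Sums of squared deviations and cross-products.
s_xx = sum((x - x_bar) ** 2 for x in heights)
s_yy = sum((y - y_bar) ** 2 for y in weights)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))

r = s_xy / sqrt(s_xx * s_yy)        # correlation coefficient (part b)
slope = s_xy / s_xx                 # least squares slope (part c)
intercept = y_bar - slope * x_bar   # least squares intercept (part c)
predicted_72 = intercept + slope * 72   # prediction at 72 inches (part d)
r_squared = r ** 2                  # proportion of variation explained (part e)

print(round(r, 4), round(slope, 1), round(intercept, 1))
print(round(predicted_72, 1), round(r_squared, 3))
```

The prediction here uses the unrounded slope and intercept. Plugging in coefficients already rounded to 1 decimal place gives a slightly different predicted weight.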
11. Do MBA graduates of a business school earn a higher starting salary if they have more prior work experience? The figure below is a scatterplot of salary (in thousands of dollars) vs. experience (in years) for 51 MBA graduates.
a. Say in words what a positive association between experience and salary would mean. Does the plot show a positive correlation?
b. What is the form of the relationship? Is it roughly linear? Is it strong? Explain your answers.
c. An obvious outlier corresponds to someone with a relatively low salary in spite of many years of experience. Tell the approximate salary and years of experience for that individual.
12. Using the data from the plot of the 51 experience-salary pairs, the summaries for x (experience) and y (salary) are: x̄ = 4.7, s_x = 3.6, ȳ = 47.5, s_y = 7.5, r = 0.7
a. What is the equation of the least squares line for predicting salary from experience?
b. What percent of the observed variation in salaries can be explained by the linear relationship between salary and experience?
c. One individual had a salary of 62 thousand dollars with only 7 years of experience. What is the predicted salary for this many years of experience? What is the residual for this individual?
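The least squares line can be built from summary statistics alone: slope = r(s_y / s_x) and intercept = ȳ - slope(x̄). The sketch below assumes the five summaries in problem 12 are, in order, the mean and standard deviation of experience, the mean and standard deviation of salary, and the correlation. That reading is an assumption about the notation, and the variable names are illustrative.

```python
# Summary statistics, read under the assumption stated above.
x_bar, s_x = 4.7, 3.6    # experience (years): mean, standard deviation
y_bar, s_y = 47.5, 7.5   # salary (thousands of dollars): mean, standard deviation
r = 0.7                  # correlation between experience and salary

# Least squares line for predicting salary from experience (part a).
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

# Percent of variation in salary explained by the line (part b).
percent_explained = 100 * r ** 2

# Prediction and residual for 7 years of experience, salary 62 (part c).
predicted = intercept + slope * 7
residual = 62 - predicted   # observed minus predicted

print(round(slope, 3), round(intercept, 3))
print(round(percent_explained, 1), round(predicted, 3), round(residual, 3))
```

A positive residual means the individual earned more than the line predicts for that amount of experience.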
13. Each of the following statements contains a mistake. Explain in each case what is wrong.
a. There is a high correlation between the sex of American workers and their income.
b. We found a high correlation (r = 1.02) between students' ratings of faculty teaching and ratings made by other faculty members.
c. The correlation between planting rate and yield of corn was r = 0.23 bushels.
14. A study about weight-gain during pregnancy included background information on race and marital status. A two-way table of the data can be used to explore the relationship between these two variables.
a. What is the probability that a woman in this sample is married?
b. What is the probability that a woman in this sample was married, given that she was Asian?
c. Are the events “Married” and “Asian” from this sample of pregnant women independent or dependent? Explain.
d. What is the probability that a woman was Caucasian and not married?
e. What is the probability that a woman was Caucasian or not married?
15. A coin is flipped 4 times. Find the following probabilities:
a. Flip one heads
b. Flip no heads
c. Flip at least one heads
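These probabilities follow the binomial formula P(k heads) = C(n, k) p^k (1 - p)^(n - k). The sketch below assumes a fair coin (p = 0.5, which the problem does not state explicitly) and reads part a as exactly one heads. The helper name `prob_heads` is illustrative.

```python
from math import comb

def prob_heads(k, n=4, p=0.5):
    """Probability of exactly k heads in n independent flips.

    p = 0.5 is an assumption (fair coin); the problem does not state it.
    """
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_one = prob_heads(1)          # exactly one heads
p_none = prob_heads(0)         # no heads
p_at_least_one = 1 - p_none    # complement of "no heads"

print(p_one, p_none, p_at_least_one)
```

Part c uses the complement rule, which avoids adding the separate probabilities of one, two, three, and four heads.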