PHP 2510 Take Home Final Exam



  • Please read the instructions carefully.
  • For all hypothesis tests, please state your level of significance.
  • Resources: You may consult the course web site, your notes, and the course textbook.
  • Questions: You may consult with the instructor (Dr. Dunsiger) and TE (Bill Nardi) for clarifications. The project is intended to be a real experience in the role of the biostatistician. Since science is a team sport, you have the option of completing this project in a small group (2-3 people). You are also more than welcome to complete it independently. There is not one right answer or one right approach. This is about the process.
  • Submission: Submit the complete exam through Canvas. If you are completing it in a group, please submit one copy for your group (with all team members' names on page 1).
    • This project is due on Friday, December 13, 2024, before 11:59 pm.

STUDY DESCRIPTION

Since 2003, the National Cancer Institute has administered the nationally representative Health Information National Trends Survey (HINTS) every few years. The HINTS target population is all adults aged 18 or older in the civilian noninstitutionalized population of the United States. The HINTS program collects data on the American public's need for, access to, and use of health-related information and their health-related behaviors, perceptions, and knowledge.

Data collection for HINTS 6 was conducted between March 7 and November 8, 2022. One adult within each sampled household was selected using the next-birthday method. In this method, the adult who would have the next birthday in the sampled household was asked to complete the questionnaire. All households received a $2 incentive to encourage participation. The overall household response rate using the next-birthday method was 28.1%.

The final HINTS 6 sample consisted of 6,252 respondents. The data set you use in this course is a modified version of this sample. You will find a codebook at the end of this document.

Source: National Cancer Institute Health Information National Trends Survey: Data

Research Questions

For the purpose of this project, you will answer one of the following research questions.

1. Are there sociodemographic differences in the use of electronic medical records among individuals with depression?

2. Are there sociodemographic differences in the use of electronic medical records among individuals with diabetes?

3. Are there sociodemographic differences in the use of electronic medical records among individuals with high blood pressure?

You need not choose all sociodemographic variables in this analysis. Decide which ones might be meaningful and pursue those. The only requirement is that you include at least 2.

INSTRUCTIONS:

1. Perform exploratory data analysis (EDA) to describe the information (variables) collected in this study. [Note: A complete EDA should include both suitable descriptive statistics and plots.] You need not summarize every variable in the data set; choose the ones you think are meaningful. EDA will likely include some combination of descriptives, graphs, and univariate and bivariate tests.

2. Choose the most appropriate statistical approach to answer the primary research question.

3. Summarize your quantitative findings from part 1 in one well-organized (meaningful and easy to understand structure of columns and rows), self-explanatory (properly labeled, etc.) table and up to two graphs.

4. Create one or two tables to summarize the results from the analysis you conducted in part 2 to answer the primary research question of this study. Again, make sure that each table is well-organized and self-explanatory. You will also summarize in words, but part of the challenge is creating meaningful tables.
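
As one illustration of how parts 1 and 2 might fit together (not the required approach, and only one of many reasonable ones), the sketch below simulates a small toy data set, restricts it to one condition subgroup, produces a cross-tabulation with a chi-square test, and fits a logistic regression. All variable names (has_diabetes, used_emr, age_group, education) and the simulated values are hypothetical placeholders; the real names and codes come from the codebook, and a different model may suit your chosen question better.

```r
# Toy data standing in for the course data set. Every variable name here is
# a hypothetical placeholder; replace with the names given in the codebook.
set.seed(2510)
n <- 500
toy <- data.frame(
  has_diabetes = rbinom(n, 1, 0.3),
  age_group    = factor(sample(c("18-34", "35-49", "50-64", "65+"), n, replace = TRUE)),
  education    = factor(sample(c("High school or less", "Some college", "College graduate"),
                               n, replace = TRUE))
)
toy$used_emr <- rbinom(n, 1, ifelse(toy$education == "College graduate", 0.7, 0.5))

# Restrict to the subgroup named in the research question.
dm <- subset(toy, has_diabetes == 1)

# Part 1 (EDA): counts and row percentages of the outcome by one covariate.
tab_edu <- table(dm$education, dm$used_emr, dnn = c("Education", "Used EMR"))
pct_edu <- round(100 * prop.table(tab_edu, margin = 1), 1)
print(tab_edu)
print(pct_edu)

# Bivariate test, compared against a significance level chosen in advance.
alpha   <- 0.05
chi_edu <- chisq.test(tab_edu)
print(chi_edu$p.value < alpha)

# One EDA graph: percentage reporting EMR use within each education level.
barplot(pct_edu[, "1"], ylab = "% who used EMR", main = "EMR use by education")

# Part 2: logistic regression with at least two sociodemographic predictors.
fit <- glm(used_emr ~ age_group + education, data = dm, family = binomial)
summary(fit)
```

The raw output printed here is for your own inspection; the tables in the report should be built from it rather than pasted.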

SUBMISSION:

1. Final Project Write-up:

Prepare a brief summary of all the relevant analyses conducted, results and conclusions. The write-up should:

a. be in the form of a PDF or a Word document. If you use R Markdown for the project, please be sure to knit to Word or PDF before submission.

b. not exceed 5 pages double spaced. Note: there is no need for it to be 5 pages – 1 or 2 is just fine as well. But the max is 5 pages. I know this can be hard, but part of good data analysis is finding ways to summarize your findings in brief. Note that this page limit includes tables/graphs.

c. be organized in four main sections, namely the “Introduction”, “Methods”, “Results”, and “Discussion”. There is no required length for any one of these sections. The goal is simply to help you organize your report. In all cases, paragraph form or bullet points is fine. Just make sure it explains to the reader (me!) in enough detail.

i. Introduction: Briefly describe the study and research question of interest. This should include some rationale as to why this research question is important. Be sure to state your hypothesis(es) as well.

ii. Methods: Description of the statistical analyses that you have performed. This is where you explain what you did. This includes the EDA, any exploratory analysis and the statistical plan you have chosen to answer the primary research question. Be sure to state your significance level if you are relying on hypothesis testing.

iii. Results: Overall description of the results including describing the study sample (tell me about the participants), results from EDA and other statistical analyses performed for the purposes of this project. Finally, be sure to answer the primary research question. You can embed any figures/tables in the results or have them all listed at the end of the report. In either case, be sure they are labelled and easy to find. Reminder: if you don’t write about a table/figure, it shouldn’t be there.

iv. Discussion: Discuss the main findings from your analysis with emphasis on the research question. In other words, what did you find, and did you expect these results? In the discussion, please specify any weaknesses or limitations of the analysis or the data collected.

Please also include a statement of future directions (namely, what follow-up analyses should be done in order to move the science forward).

2. R-code:

Please submit the R code you used to complete this project. Please make sure the code is properly annotated to help the reader understand which part of the code corresponds to which part of the data analysis.

GRADING:
1. The final project will be evaluated based on:

a. Correctness of the content and approaches used for answering the research questions.

b. Completeness.

c. Presentation of results (for example, a table that isn’t labeled would result in point deductions). It is tempting to try everything and include everything. This is strongly discouraged. Make a plan, run the analyses, report the results. Please be sure to include details of any assumptions you made or choices when it comes to cleaning the data. The idea is that the report can be reproducible.

d. Neatness. Part of the lesson in this final project is presentation of the results. This means creating (for example) readable tables that are easily accessible to your audience. Note that you will NOT receive full credit by directly copying and pasting any output from the software. You can summarize the output in the text or create a readable table to display the output.
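
To illustrate the difference between pasted software output and a readable table, here is a minimal sketch that pulls estimates out of a fitted logistic regression and arranges them as odds ratios with Wald 95% confidence intervals. The data are simulated and the variable names (sex, age_group, used_emr) are hypothetical; the choice of columns and rounding is a suggestion, not a required format.

```r
# Toy data; all variable names are hypothetical placeholders.
set.seed(1)
n <- 300
toy <- data.frame(
  sex       = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  age_group = factor(sample(c("18-49", "50-64", "65+"), n, replace = TRUE))
)
toy$used_emr <- rbinom(n, 1, ifelse(toy$sex == "Female", 0.65, 0.5))

fit <- glm(used_emr ~ sex + age_group, data = toy, family = binomial)

# Pull estimates out of the model object rather than pasting summary(fit).
est <- coef(summary(fit))   # columns: Estimate, Std. Error, z value, Pr(>|z|)
z   <- qnorm(0.975)         # multiplier for Wald 95% confidence limits

or_table <- data.frame(
  Term       = rownames(est),
  Odds_ratio = round(exp(est[, "Estimate"]), 2),
  CI_lower   = round(exp(est[, "Estimate"] - z * est[, "Std. Error"]), 2),
  CI_upper   = round(exp(est[, "Estimate"] + z * est[, "Std. Error"]), 2),
  P_value    = format.pval(est[, "Pr(>|z|)"], digits = 2, eps = 0.001),
  row.names  = NULL
)

# Drop the intercept row, which is usually not of interest to the reader.
or_table <- or_table[or_table$Term != "(Intercept)", ]
print(or_table, row.names = FALSE)
```

In the report itself, the term labels would still need to be replaced with plain-language names and the reference category for each variable stated.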

TOTAL 95 points: 10 introduction, 20 methods, 40 results, 20 discussion, 5 code

Remember – there is not one right way to go about this project (in fact, there are numerous correct approaches). The idea here is to take all you have learned this semester and apply it to “real” data in a way that makes sense to you! Do not forget the cleaning process – this is just as (if not more) important than the final analysis.
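
As a rough sketch of what a documented cleaning step could look like, the example below recodes missing-value codes, labels a categorical variable, and records how many rows are dropped. It assumes, purely for illustration, that nonresponse is stored as negative codes and that 1 = yes and 2 = no; the variable names and codes are hypothetical, so check the codebook for the actual scheme.

```r
# Toy raw data; names and codes are hypothetical, check the codebook.
raw <- data.frame(
  used_emr  = c(1, 2, -9, 1, 2, 1, -5, 2),   # assumed: 1 = yes, 2 = no, negative = missing
  education = c(1, 3, 2, -9, 3, 1, 2, 2)     # assumed: 1-3 = ordered categories
)

clean <- raw

# Treat negative codes as missing (an assumption about the coding scheme).
clean[] <- lapply(clean, function(x) replace(x, x < 0, NA))

# Recode to analysis-ready variables with readable labels.
clean$used_emr  <- as.integer(clean$used_emr == 1)
clean$education <- factor(clean$education, levels = 1:3,
                          labels = c("High school or less", "Some college",
                                     "College graduate"))

# Record how many rows each step drops so the report is reproducible.
n_start    <- nrow(clean)
analytic   <- clean[complete.cases(clean), ]
n_analytic <- nrow(analytic)
cat("Rows before:", n_start, " after dropping missing:", n_analytic, "\n")
```

Complete-case analysis is only one way to handle missing data; whichever choice you make, state it in the Methods section.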
