P8483:
Application of Epidemiologic Research Methods
Spring 2025 Syllabus
DRAFT and subject to change throughout the semester!
In person sessions
Tuesdays 5:30-6:50pm.
Hammer 401
INSTRUCTOR
Matthew Lamb, PhD, MPH, MS
Assistant Professor
Department of Epidemiology
ICAP- Columbia University
Office Hours: Check CourseWorks
COURSE OBJECTIVES
1. Think logically!
2. Use SAS to walk through the steps of an epidemiologic analysis from data entry through regression analysis.
3. Operationalize conceptual epidemiologic and biostatistical concepts covered in other classes through application with real data sets.
4. Learn to summarize your research findings with succinct scientific writing.
5. Develop skills in “prompt engineering” to use generative AI to assist with writing computer code.
The material builds on concepts introduced in Quantitative Foundations in Public Health and is intended to complement and serve as a bridge to the methods presented in more advanced epidemiology courses.
WEEKLY STRUCTURE AND EXPECTATIONS
This class uses a hybrid format.
Recorded lecture material is supplemented by weekly in-person small group exercises.
In-person attendance is expected.
Most weeks following Week 1 will have the following structure.
Tuesday 5:30-6:50 pm. Synchronous session.
· First 15 minutes: Questions on asynchronous material from last week
· Next 50 minutes: small group breakout sessions focusing on completing specific in-class assignments
· Last 15 minutes: review of small group exercise as a class
Tuesday following Synchronous session:
· Asynchronous material for the following week is released
o Slides with embedded recorded audio and video
o Reading assignments
o Homework assignment
Wednesday am – Monday pm
· Attend office hours as needed
· Watch asynchronous material
· Do readings
· Work on homework
All homework assignments will be due at 11:59 pm on the Monday prior to the synchronous session, with a no-penalty grace period until 7:59 am on Tuesday morning. No extensions will be provided. No exceptions.
PREREQUISITES
· Prior completion of the Core, P6400, or equivalent
· Registered for the course on SAS On Demand
OPTIONAL
· This year I am exploring ways to integrate the R statistical program into this class using generative AI. Optional materials using R will be available. If interested, please download R and RStudio.
SKILLS TO LEARN
The primary objective of this course is to provide you with the tools necessary to import, clean, error-check, operationalize, analyze, and disseminate data from epidemiologic research studies. In this class we will be using the SAS statistical software package, but the logical processes this course develops will be useful across any statistical package.
Generative AI tools such as ChatGPT are excellent resources for helping you write statistical programming commands for epidemiologic analyses, and we will explore how best to use these tools in this class as well.
By the end of the course, you should be able to:
· Understand and implement the steps involved in data collection, management, data quality assurance, operationalization, descriptive analysis, and multivariable regression analyses using SAS
o Read raw data from a variety of formats into SAS
o Peruse, manipulate and clean data sets through printing, sorting, merging, and the use of conditional logical expressions
o Apply simple statistical and graphical procedures for the descriptive analysis of normally distributed data
o Conduct correlation, linear, and logistic regression in SAS and interpret SAS output for these analytical methods
· Understand the concepts of statistical model building
· Understand the difference between statistical model building and multivariable analysis for causal inference
· Understand the purpose of indicator variables (“dummy” variables), how to create these, and how to interpret output for these variables
· Conduct multivariable data analyses in SAS
· Understand the concept of confounding and how to use standard methods to remove confounding
· Develop skills in succinct summarization of findings from an epidemiological analysis through writing a scientific abstract.
ASSESSMENT AND GRADING POLICY
Assessment for this course is based on homework assignments, a mid-term exam, a final project, and weekly in-class laboratory assignments. The contribution of each grading assessment toward the final grade is as follows:
Assignments: 45%
In-class group exercises: 8%
In-class check-ins: 2%
Midterm exam: 25%
Final project: 20%
Letter grades will be assigned by the instructor based on the following general rubric.
No rounding up.
A+ 99-100% Highly Exceptional Achievement
A 94-98% Excellent. Outstanding Achievement
A- 90-93% Excellent, close to outstanding
B+ 88-89% Very good. Solid achievement expected of most graduate students
B 84-87% Good. Acceptable achievement
B- 80-83% Acceptable achievement, but below what is generally expected
C+ 78-79%
C 74-77%
C- 70-73%
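As a rough illustration of how the weights and rubric above combine, here is a minimal Python sketch. Python is not used in this course, and the function names and example scores are hypothetical. The sketch assumes each component score is already a percentage from 0 to 100, and it returns None below 70% because the rubric lists no letter grades there. It is not the official grade calculation.

```python
# Component weights as listed in the syllabus (they sum to 100%).
WEIGHTS = {
    "assignments": 0.45,
    "group_exercises": 0.08,
    "check_ins": 0.02,
    "midterm": 0.25,
    "final_project": 0.20,
}

# Lower bound of each letter grade in the rubric, highest first.
CUTOFFS = [
    (99, "A+"), (94, "A"), (90, "A-"),
    (88, "B+"), (84, "B"), (80, "B-"),
    (78, "C+"), (74, "C"), (70, "C-"),
]


def course_percentage(scores):
    """Weighted sum of component percentages (hypothetical helper)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Map a percentage to a letter with no rounding up (hypothetical helper)."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return None  # Assumption: the rubric lists nothing below 70%.


# Made-up example scores, for illustration only.
example = {
    "assignments": 95,
    "group_exercises": 100,
    "check_ins": 100,
    "midterm": 90,
    "final_project": 92,
}
total = course_percentage(example)  # about 93.65
grade = letter_grade(total)         # "A-", because 93.65 is not rounded up to 94
```

Because there is no rounding up, a total just under a cutoff stays in the lower band.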
Midterm Exam: The take-home midterm will take place at approximately the midpoint of the semester and will be similar in format to the weekly homework assignments. More detail will be provided during the semester.
Homework Assignments: There will be approximately 8 graded homework assignments. Assignments are to be completed and saved in SAS as enhanced editor files (.sas extension or .txt extension). Since most of you will not download SAS onto your computer, saving your SAS editor file with a .txt extension will allow you to view it prior to submission.
Your name and Columbia email UNI must be included in all assignment submissions.
All course assignments will be turned in electronically via CourseWorks. At the end of the semester, the top (n-1) grades will be used in compiling a student’s final grade; that is, your lowest homework grade will be dropped. Late homework submissions will not be accepted.
In-class check-ins
During weekly in-person classes, you will be given an in-person learning check at the beginning of each class. Grades will be based on completion. These will count for 2% of your grade. To receive full credit, you will need to be physically present in class for these. The two lowest scores on these will be excluded from your final grade.
In-class group exercises
The most important part of the Tuesday in-person sessions is the group exercises. You will work with your randomly assigned laboratory group on an assignment. These will count for 8% of your grade. These will be due after the class and graded for reasonable effort. All lab group members are expected to participate in order to receive credit for the lab assignment. During submission, you will be asked to attest to which members were (1) present in person, (2) present remotely, or (3) not present. Credit will be given to all members present in person or remotely.
Final Project: The objective of the final group project is to provide students with experience in analyzing data from a large-scale data set. Specific details will be provided later in the semester.
Late Assignment Policy:
Late assignments will not be accepted under any circumstances. There may be instances throughout the semester where the SAS Studio server becomes unavailable for short periods of time. This is out of our control. Thus, please start your assignments promptly, and save your working SAS editor files externally in case your work needs to be completed on a computer using PC SAS.
HONOR CODE
In fairness to your colleagues, it is assumed that all students will adhere to the University Honor Code. A copy of the code is available at the link below. Note that the honor code requires reporting of all breaches. If there are any rules with which you do not agree, please let us know so we can discuss the issue.
https://www.mailman.columbia.edu/sites/default/files/pdf/community-standards-and-conduct.pdf
Use of Generative AI such as Chatbots
Chatbots such as ChatGPT are extremely useful in helping to write code for statistical programs. You are permitted and encouraged to use these chatbots as a tool for helping to write your statistical code. However, you always need to ensure that the code actually works.
Chatbots should not be used to write your final abstract. They still aren’t very good at this. However, they are useful in helping to reduce the word count of your final abstract and are permitted for this use.
DISABILITY-RELATED ACADEMIC ACCOMMODATIONS
In order to receive disability-related academic accommodations, students must first be registered with the Office of Disability Services (ODS). Students are invited to contact ODS for a confidential discussion at 212.854.2388 (V), 212.854.2378 (TTY), or by email at [email protected]. If you have already registered with ODS, please let me know to ensure that I have been notified of your recommended accommodations by Lillian Morales, Disability Services Liaison to the Mailman School of Public Health.
CONSULTATIONS
The goal of this course is to learn how (and when) to apply and interpret the techniques learned in the class, and to relate them to the causal theories under investigation. If you are having difficulty understanding the material, feel the course is not meeting your needs, or just want to chat about ideas related to the course, feel free to contact me. For short questions, email is most efficient. For longer discussions, let’s set up a meeting. If you don’t hear from me within 24 hours, please email again. All weekend emails will be answered by Monday morning.
Matthew Lamb
ENCOURAGING AN OPEN AND INCLUSIVE CLASSROOM ENVIRONMENT
The material in this class can be challenging. Full understanding and facility with applying these methods requires asking questions and interrogating details of the methods. It is therefore important that all students are included in the conversation and feel comfortable expressing themselves. An educational culture that encourages robust, open, and inclusive classroom environments is necessary for the achievement of that goal. The Department of Epidemiology is committed to helping foster this type of educational environment. This commitment is part of our collective enactment of the elements of the Public Health Oath in which we agree to "respect the rights, values, beliefs, and cultures of those individuals and communities with whom [we] work."
An inclusive classroom environment is undermined by microaggressions. Microaggressions are commonplace verbal, behavioral, and environmental indignities, frequently unintentional, that communicate hostile, derogatory, or negative sentiments about individuals on the basis of status characteristics such as race, ethnicity, gender, sexual orientation, religion, disability, etc. Those who commit a microaggression are usually unaware that they have demeaned another individual, but the consequences for those on the receiving end can be significant. Microaggressions harm individuals by making them feel invalidated, isolated, diminished, and marginalized. They harm the learning environment by making it less inclusive, open and productive.
Recognizing and addressing microaggressions can help mitigate these negative consequences and thereby maintain a robust classroom environment. We believe that responding to microaggressions in the classroom is a critical part of educational growth that leads to a better understanding of the sociocultural issues we seek to investigate as epidemiologists and public health professionals. If you have observed or been the target of a microaggression from a classmate, TA, or faculty member, you are encouraged to bring it to their attention when it happens. Faculty and TAs are willing and prepared to facilitate such engagement, even if they are responsible for the microaggression. If you are uncomfortable speaking up immediately, please contact the TA or instructor outside of class.
COURSE REQUIREMENTS
1. Students are expected to attend class and complete all assigned readings and course assessments. It is expected that students will dedicate ~9-12 hours per week to this course.
2. Students are required to have access to a laptop computer.
3. Students must register for SAS OnDemand prior to class, following the instructions sent out via email and available on CourseWorks.
a. I will be updating the content of this course with SAS data sets and SAS Editor Files as we proceed throughout the semester. For those of you opting to use a copy of SAS that you have purchased and installed on your computer, duplicate files of all course material will be available on CourseWorks under "Files".
b. For more information about SAS OnDemand for Academics, including step-by-step registration instructions, visit the following site: http://support.sas.com/ondemand.
4. Students are not required to have a licensed version of SAS installed on their laptop. However, this might be useful for you going forward. Licenses are available for purchase through CUIT. Please see the instructions for SAS purchase and installation on the course website:
a. https://secure.cumc.columbia.edu/cumcit/secure/policy/sas-spss.html
5. SAS is also installed on University computers in the Health Sciences library.
REQUIRED TEXTS
This course has two required textbooks, both available for free as soft copy (link available on CourseWorks).
1. Delwiche and Slaughter. The Little SAS Book: A Primer. 5th edition. 2012, SAS Institute. ISBN 978-1-61290-343-9. Available as .pdf through Columbia Library here:
a. https://clio.columbia.edu/catalog/10797775
2. DiMaggio. SAS for Epidemiologists. Application and Methods. 1st edition. 2012, Springer. ISBN 978-1-4614-4853-2. Available as .pdf through Columbia Library here:
a. https://clio.columbia.edu/catalog/10192533
As necessary, additional readings will be posted on CourseWorks, along with data sets, sample SAS code, and assignments.
Weekly Schedule
· THE COURSE SCHEDULE (BELOW) IS A DRAFT AND SUBJECT TO CHANGE. PLEASE VISIT THE COURSEWORKS WEBSITE FOR UP-TO-DATE SCHEDULES.
Week 1
· 1/21 5:30-6:50: Synchronous Session: Tuesday
o Course overview and expectation setting
o Comparing SAS and R
o Introducing SAS Studio and the SAS working environment
o Writing your first SAS program
o Saving a SAS Editor file to your personal computer as a .txt file.
o ChatGPT as a tool
· Asynchronous videos
o 1_1: the SAS environment
o 1_2: common basic PROC statements in SAS
o 1_3: The LIBNAME Statement
· Assigned reading
o DiMaggio
§ Chapter 1
§ Chapter 2.1-2.6, 2.9-2.10
§ Chapter 3.1-3.3
§ Chapter 4.1-4.2
§ Chapter 6.1-6.2
Week 2
· 1/28 5:30-6:50: Synchronous Session: Tuesday
o Review of LIBNAME and basic PROC statements
o Temporary vs Permanent SAS data files
§ Introducing the “WORK” libref
o The DATA statement
o Sorting, merging, and concatenating SAS data sets
o ChatGPT as a tool
· Asynchronous videos
o 2_1 The WORK libref and the utility and risk of temporary SAS data files
o 2_2 the DATA step, part 1
o 2_3 combining separate SAS data files: sort, merge, concatenate
o 2_4 Transposing SAS data files
o Optional 2_5: equivalent procedures in R
· Assigned reading
o Delwiche & Slaughter
§ Chapter 1.2-1.4, 1.10, 2.18-2.21
o DiMaggio
§ Chapter 2.7-2.8, Chapter 5
· Homework #1 assigned. Due February 3 at 11:59 pm
Week 3
· 2/4 5:30-6:50: Synchronous Session: Tuesday
o Review of the DATA step, sorting, merging, and concatenating
o Introducing logical statements: if, where, and by
o ChatGPT as a tool
· Asynchronous videos
o 3_1 creating a SAS data file in the editor panel
o 3_2 introduction to SAS INFORMAT and FORMAT statements
o 3_3 more practice with logical statements
· Assigned reading
o Delwiche & Slaughter
§ Chapter 3
o DiMaggio
§ Chapter 3.5, 3.7, Chapter 5
· Homework #2 assigned. Due Feb 10 at 11:59 pm
Week 4
· 2/11 5:30-6:50: Synchronous Session: Tuesday
o Review of SAS formats and informats
o A first look at creating variables with logical statements
o ChatGPT as a tool
· Asynchronous videos
o 4_1 the IF ... THEN ... ELSE IF ... THEN ... syntax for variable creation
o 4_2 Importing non-SAS data files into SAS: csv files and txt files
o 4_3 Importing non-SAS data files into SAS: Microsoft Excel files
o 4_4: Importing SAS Transport Files into SAS
o Optional 4_5: logical statements in R
· Assigned Reading
o Delwiche & Slaughter
§ Chapter 2
o DiMaggio
§ Chapter 3
· Homework #3 assigned. Due Feb 17 at 11:59 pm
Week 5
· 2/18 5:30-6:50: Synchronous Session: Tuesday
o Review of PROC IMPORT and other ways of importing
o More on creating and modifying variables in SAS
o ChatGPT as a tool
· Asynchronous videos
o 5_1 do loops for variable manipulation
o 5_2 arrays for variable manipulation
o 5_3 the peculiar case of SAS DATES and DATETIMES
· Assigned Reading
o Delwiche & Slaughter
§ Chapter 3.8, 3.11
o DiMaggio
§ Chapter 5.7
· Homework #4 assigned. Due Feb 24 at 11:59 pm
Week 6
· 2/25 5:30-6:50: Synchronous Session: Tuesday
o Review of SAS DATES, formats, and informats
o SAS Labels
o ChatGPT as a tool
· Asynchronous videos
o 6_1 review of basic descriptive analyses: proc FREQ
o 6_2 proc UNIVARIATE
o 6_3: Level up: replicating a Table 1 from a published analysis
o Optional 6_4: creating descriptive statistics in R
· Assigned reading
o Delwiche & Slaughter
§ Chapter 4.13-4.17
o DiMaggio
§ Chapter 6.3
· MIDTERM assigned. Due March 3 11:59 pm. 25% of final grade.
Week 7
· 3/4 5:30-6:50: Synchronous Session: Tuesday
o PROC UNIVARIATE
o PROC FREQ for stratified analysis
o Review of bivariate measures and stratified analysis
o ChatGPT as a tool
· Assigned reading
o DiMaggio Chapter 6-8
o Other reading TBD
· Asynchronous videos
o 8_1 Model building for regression, part 1
o 8_2 Model building for regression, part 2
o 8_3 Model building for regression, part 3
· Homework # 5 assigned. Due March 11 at 11:59 pm
Week 8
· 3/11 5:30-6:50: Synchronous Session: Tuesday
o Introducing linear regression in SAS
o ChatGPT as a tool
· Asynchronous videos
o 9_1 dummy variable coding
o 9_2 proc reg
o 9_3 proc glm
o Optional 9_4: regression in R
· Assigned Reading
o DiMaggio Chapter 13-14
· Homework # 6 assigned. Due March 24 at 11:59 pm
Week 9. Spring Break. No classes.
Week 10
· 3/25 5:30-6:50: Synchronous Session: Tuesday
o Review of linear regression
o ChatGPT as a tool
· Asynchronous videos
o 10_1 logistic regression, part 1
o 10_2 logistic regression, part 2
o 10_3 optional: relative risk regression
o 10_4 optional: logistic regression in R
· Assigned Reading
o DiMaggio Chapter 9
· Homework # 7 assigned. Due March 31 at 11:59 pm
Week 11
· IPE day. No class.
Week 12
· 4/8 5:30-6:50: Synchronous Session: Tuesday
o Review of logistic regression
o Power and sample size teaser
o ChatGPT as a tool
· Asynchronous videos
o 11_1 Power and sample size in SAS, dichotomous outcome
o 11_2 Power and sample size in SAS, continuous outcome
o 11_3 Power and sample size in SAS, further information
· Homework # 8 assigned. Due April 14 at 11:59 pm
· FINAL GROUP PROJECT ASSIGNED
o One paragraph idea due April 14
o DUE MAY 5 at 11:59 pm
Week 13
· 4/15 5:30-6:50: Synchronous Session: Tuesday
o Power and Sample size
o ChatGPT as a tool
· Asynchronous videos
o 12_1 causal language in epidemiologic studies
o 12_2 effect measure modification
· You should be working on your abstracts
Week 14
· 4/22 5:30-6:50: Synchronous Session: Tuesday
o Statistical tests for effect measure modification
o TBD, Catch-up as necessary
· Asynchronous videos
o 13_1 PROC TABULATE
o TBD other videos
Week 15
· 4/28 5:30-6:50: Synchronous Session: Tuesday
o Writing a scientific abstract
Week 16
· 5/5 5:30-6:50: Synchronous Session: Tuesday
o Help on final group project
o OPTIONAL!