PIA3067 ADVANCED RESEARCH METHODS (Semester B 2024/25)
Course Description
This course is the second in a two-semester sequence on quantitative research methods in the social sciences, with applications to political and policy analysis. Building on foundational concepts introduced in the first semester, this course emphasizes practical skills in designing research, managing and analyzing data, and drawing causal inferences using a statistical programming language. It aims to equip students with the ability to understand, explain, and perform social science research, focusing on data analysis, causal reasoning, and critical engagement with the methodologies found in academic literature. Students will develop the knowledge and practical skills needed to choose appropriate strategies and procedures for collecting, managing, analyzing, interpreting, and reporting quantitative data, ultimately enabling them to address key empirical puzzles in their independent research, including final-year capstone projects. By fostering data literacy and proficiency in sophisticated analytical techniques, the course not only prepares students for academic and professional research but also positions them to navigate the growing demands of a data-driven job market as critical consumers and producers of data-centered evidence.
Learning Outcomes
Upon successful completion of the course, students will be able to:
• Identify research questions and determine to what extent they can be answered with data and identify the kind of data that can shed light on these research questions.
• Identify, collect, process, and manage data to allow for analysis.
• Describe and visualize data using statistical software.
• Identify appropriate ways of analysing a given data set to answer a research question and conduct the analysis using statistical software.
• Critically assess assumptions and consider alternative explanations.
• Communicate insights from data analysis effectively and engagingly.
Course Logistics
Canvas page: https://canvas.cityu.edu.hk/courses/63596
We will be using Canvas to host the course webpage. Course materials will be posted on the site. Unless specified otherwise, all assignments, including the final project, must be submitted through Canvas.
Office hours and availability
• Booking page: https://calendly.com/baole/office-hours
• Tuesday 11:00 AM - 1:00 PM (In person);
• Thursday 2:00 - 4:00 PM (Zoom);
• Or by appointment
If you have questions about the course material, lectures, exercises, or other course-related issues, please do not hesitate to stop by my office hours or set up an appointment. Please see the booking page for detailed information. Dropping in during office hours is fine, but signing up on the booking page is highly recommended. If you have a quick or general question, you can also post it on Slack, which can be a faster way to get an answer. You can also always email me directly at [email protected].
Slack: https://pia3607advanc-naz6632.slack.com
We will use Slack to facilitate discussions outside the classroom. This is an ideal forum for posting questions and information regarding the course material and/or computing. The goal is for everyone to benefit from the discussion and collective knowledge. I encourage everyone to reply to each other’s questions when they know the answer. Respectful and constructive participation on Slack will count toward your class participation grade.
Computers and Notes in Class
Although the experiments are relatively small, longhand writing appears to be a superior strategy for taking notes under certain conditions. See https://bit.ly/takegnotes for a summary brief. At least, there is no evidence that note-taking via laptop is beneficial:
Pam A. Mueller and Daniel M. Oppenheimer. 2014. “The Pen is Mightier than the Keyboard: Advantages of Longhand over Laptop Note Taking.” Psychological Science 25 (6): 1159-1168.
Course Materials
Required Readings
The primary textbook for the course is
Kosuke Imai. 2018. Quantitative Social Science: An Introduction (QSS). Princeton University Press
The book can be purchased through CityU Textbook Service (https://linktr.ee/cityubookstore). Required readings are expected to be completed before each class meeting.
To supplement the primary textbook, additional readings will focus on more technical descriptions of statistical methods:
Matthew Blackwell. A User’s Guide to Statistical Inference and Regression (SIR). Available at: https://mattblackwell.github.io/gov2002-book/.
Andrew Gelman, Jennifer Hill, and Aki Vehtari. 2020. Regression and Other Stories (ROS). Cambridge University Press. A free online PDF is available at: https://users.aalto.fi/~ave/ROS.pdf.
Further readings on specific methods and examples will also be provided throughout the course. There will occasionally be quiz questions based on the assigned readings during class sessions.
The textbook for R programming is:
Hadley Wickham, Mine Cetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (R4DS). 2nd ed. O’Reilly Media. Available at: https://r4ds.hadley.nz/.
We will not be able to discuss all the specifics in this book during the class meetings, but you should use it as your reference for R programming. We will cover some important topics in the book during the lab sessions. You can also always bring up questions about the book.
Additional Readings
Richard D. De Veaux, Paul F. Velleman, and David E. Bock. 2021. Stats: Data and Models. 5th ed. Pearson.
David M. Diez, Christopher D. Barr, and Mine Cetinkaya-Rundel. 2022. OpenIntro Statistics. 4th ed. Available at: https://www.openintro.org/book/os/.
David Freedman, Robert Pisani, and Roger Purves. 2007. Statistics. 4th ed. W. W. Norton & Company.
Quan Li. 2018. Using R for Data Analysis in Social Sciences: A Research Project Oriented Approach. Oxford University Press.
Jeffrey M. Wooldridge. Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning.
Joshua D. Angrist and Jörn-Steffen Pischke. 2008. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
Joshua D. Angrist and Jörn-Steffen Pischke. 2015. Mastering Metrics: The Path from Cause to Effect. Princeton University Press.
Scott Cunningham. 2021. Causal Inference: The Mixtape. Yale University Press.
Jeff Gill and Michelle Torres. 2019. Generalized Linear Models: A Unified Approach. SAGE Publications.
Computing
We’ll use R and RStudio in this class to conduct data analysis. R is a free, open-source programming language available on all major platforms. RStudio (also free) is a graphical interface to R that is widely used to work with the R language. You can (and should) install R and RStudio on your own computer, but you can also use RStudio through RStudio Cloud (https://rstudio.cloud/) or the CityU Virtual Desktop Service (VDS) (more info: https://www.cityu.edu.hk/csc/deptweb/support/faq/vds_faq.htm).
The CityU library provides resources on using R (https://libguides.library.cityu.edu.hk/c.php?g=423971&p=7041652). You can also find a virtually endless set of resources for R on the Internet. For beginners, there are several web-based tutorials, including one from DataCamp (https://www.datacamp.com/), which should teach you the basic syntax of R. More R resources will be posted on Canvas and Slack. You’re also welcome to use other programming languages to complete the course assignments, but support for other languages in this course is limited.
Requirements and Evaluations
Students are required to do the weekly reading, attend class, complete all assignments, and contribute to course discussions. The student’s final course assessment includes the following components:
Assignment                            Weight   Important Date
Problem sets                          40%      Week 6 & Week 11
Midterm test                          20%      Week 8
Final project (incl. presentation)    40%      Week 13
Participation
Class participation is key to your success in this course, because all the assignments are based on what we discuss in class. Attendance is required at all class sessions and will be tracked. You should also aim to participate actively and thoughtfully in in-class discussions and Slack conversations whenever there are opportunities to do so. Completing the required readings ahead of class will greatly improve your ability to make the most of lectures and labs.
Lab
The best way to learn methods and programming is learning by doing. The last part of each class is a lab session. During the lab sessions, you will work with randomly selected partners on data analysis tasks. The tasks will reflect methods we’ve studied in class but will require applying them to new data and new situations. I will be available to answer questions, and occasionally I will bring up some important points to discuss together. You will submit your lab exercises by the end of the class. The lab exercises will not be strictly graded; they are for us to learn and practice things together. The answer keys will be posted after the class.
Problem sets
The two problem sets should be completed outside of class. They should be done in R Markdown and submitted as both a PDF and an Rmd file on Canvas before the deadline. The problem sets will contain questions similar to the lab exercises. Each problem set is an individual assignment; no formal teamwork or collaboration is allowed. You may discuss the problem sets with others, but every keystroke of your submission must be your own. You may not copy code or answers from others or from AI tools, but you may get all the help available (classmates, university centers, Internet resources, etc.). You are responsible for understanding and being able to explain every line of code you submit. All the problem sets are due on Tuesdays before the class at 8:30 PM.
Mid-term test
In Week 8, you will take an in-class mid-term test comprising a series of structured or open-ended questions designed to assess your ability to apply methodological concepts to specific issues in social research. Detailed information about the test format and assessment criteria will be provided closer to the date.
Final project
For the final project, you will engage in original social science research. You will define your own research question and use the skills learned through the course to best answer it. You will need to find a dataset of interest, pose an appropriate research question that the data can answer with quantitative methods, analyze the data, write a short data analysis report, and present your research. The report must engage with the existing literature and provide appropriate intellectual context for the question you pose. You are welcome to augment the data provided with any other appropriate data you need (this is optional, but this sort of bridging is typical in real research and often defines the most innovative social science work). While the dataset and topic can be freely chosen, the project must be your original research based on a suitable question and original analysis.
Progress towards the final paper will be made through multiple milestones. Each milestone is graded as “complete”, receives feedback, and is considered when grading the final project. The milestones should demonstrate that you have been making enough progress at each stage and continuing to improve the project. All the milestones are due on Thursdays at 8:30 PM.
• Research Proposal (due 6 March): A two-page research proposal including the research question, literature review, and research plan. It should discuss the research question, potential hypotheses, and how they are situated in the literature (with at least three citations).
• Draft Data Analysis (due 3 April): An R/R Markdown script containing data cleaning and preliminary data analyses. It should demonstrate that you have explored and familiarized yourself with the data. Some of the analyses and results may not end up in the final product.
• Final Paper (due 24 April): A combination of the pieces you have constructed throughout the previous milestones. You should connect the different parts of the paper smoothly and articulate and explain the results and implications.
Weekly Schedule
Below is the schedule for the semester. Please refer to the previous sections for specific requirements.
The plan is to cover one topic per week, but we will go as fast/slow as needed to make sure that
everyone understands the material. Make sure you check the Canvas page every week to know what
we will be covering in the upcoming class.
Week 1. 15 January
Introduction
□ Reading: Syllabus
□ Module: Introduction
□ Lab: Statistical computing
□ Homework: Problem Set 0
Week 2. 22 January
Causality
□ Due: PS0 is due at 8:30 PM on 21 January
□ Reading: QSS Chs. 1-2, R4DS Ch. 2
□ Module: Causality
29 January
Lunar New Year Break
□ No class
Week 3. 5 February
Measurement, Descriptive Statistics, & Data Visualization I
□ Reading: QSS Ch. 3, ROS Ch. 2, R4DS Ch. 1
Week 4. 12 February
Measurement, Descriptive Statistics, & Data Visualization II
□ Reading:
□ Homework: PS1 will be posted after the class.
Week 5. 19 February
Prediction, Correlation, & Regression
□ Reading: QSS Chs. 4.1-4.2, ROS Ch. 6
□ Homework: Research proposal
Week 6. 26 February
Regression: Estimation & Inference
□ Due: PS1 is due at 8:30 PM on 25 February.
□ Reading: QSS Chs. 4.3.1-4.3.3, 6-7, ROS Chs. 7, 10
Week 7. 5 March
Review Session
□ Due: Research proposal is due on 6 March.
□ Reading: QSS Ch. 4.3, ROS Chs. 11-12
Week 8. 12 March
Midterm Test
□ No readings and lectures
Week 9. 19 March
Probability & Inference
□ Reading: QSS Ch. 6
Week 10. 26 March
Regression: Hypothesis Testing and Uncertainty
□ Reading: QSS Ch. 7
□ Homework:
– PS2 will be posted after the class.
– Draft data analysis is due at 8:30 PM on 3 April.
Week 11. 2 April
Regression: Interactions, Non-linearities, and Transformation
□ Due: Draft analysis is due at 8:30 PM on 3 April.
□ Reading: QSS 4.3.3, ROS Chs. 12-13
Week 12. 9 April
Advanced Topic: Neural Networks and Large Language Models
□ Due: PS2 is due at 8:30 PM on 8 April.
□ Readings (optional):
– Breiman, Leo. 2001. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (3): 199-231.
– Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. 2021. “Machine Learning for Social Science: An Agnostic Approach.” Annual Review of Political Science 24 (1): 395-419.
Week 13. 16 April
Final Project Presentation
□ Due: Presentation slides are due at 11:00 PM on 15 April.
Policy
Late policy
Students are expected to meet the submission deadlines for all their assignments. If you hand in your work late, you will automatically incur late submission penalties. A deduction of 10% of the grade (i.e., one letter grade) shall be imposed for each subsequent working day after the deadline.
For example, where a deadline is 10:00 PM on a Thursday, a 10% penalty shall be deducted at 10:10 PM on Thursday. If you submit it, say at 10:01 PM on Friday (more than one day after the deadline), your total penalty is 20%. However, if the assignment is submitted more than five days after the deadline, it will be deemed a non-submission and be given a mark of zero.
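As an illustration only, the penalty arithmetic in the example above can be sketched in Python. This sketch rests on a simplifying assumption that is not part of the policy text: every started 24-hour period after the deadline counts as one late day, and the distinction between working days and calendar days is ignored. The function name `late_penalty` and the dates used are hypothetical.

```python
import math
from datetime import datetime


def late_penalty(deadline: datetime, submitted: datetime) -> float:
    """Return the fraction of the grade deducted (1.0 means a mark of zero).

    Assumption (not stated in the policy): each started 24-hour period
    after the deadline counts as one late day; weekends are not skipped.
    """
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0.0  # submitted on time, no penalty
    days_late = math.ceil(seconds_late / 86400)
    if days_late > 5:
        return 1.0  # more than five days late: treated as a non-submission
    return round(0.10 * days_late, 2)


# Hypothetical deadline: 10:00 PM on a Thursday.
deadline = datetime(2025, 3, 6, 22, 0)
print(late_penalty(deadline, datetime(2025, 3, 6, 22, 10)))  # 0.1
print(late_penalty(deadline, datetime(2025, 3, 7, 22, 1)))   # 0.2
```

Under these assumptions the two calls reproduce the 10% and 20% penalties from the example. Where the sketch and the written policy differ, for instance over weekends or public holidays, the written policy applies.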
There may be, of course, “mitigating” circumstances that affect your ability to complete your assignments on time, such as sudden short-term illness or a serious personal situation. If you are likely to miss a deadline, you should discuss your circumstances with your lecturer as early as possible. If, because of mitigating circumstances, you are unable to complete assignments with a total weighting of 20% or above, you need to inform the PIA department as soon as possible. You must report your case via AIMS within five working days of the assignment deadline. To understand what counts as mitigating circumstances and how to report them, please visit the Academic Regulations and Records Office website.
Intellectual property
Course content is the intellectual property of the instructor or student who created it, and may not be recorded or distributed without consent. Students are not permitted to make visual or audio recordings, including live streaming, of classroom lectures or any class-related content, using any type of recording device (e.g., smart phone, computer, digital recorder, etc.), unless prior permission from the instructor is obtained and there are no objections from any of the students in the class.
Academic integrity
Plagiarism is when you present someone else’s work as your own. Plagiarism is a form of deception and theft, as well as a serious academic offence. It may result in failure of an individual assignment or course, and, in serious cases, in expulsion from the University. A work is also considered to be plagiarised where it has been submitted a second time (or more) by the same student for a course or courses beyond the original course in which the item was first assessed. A work may further be considered to be plagiarised where it has been earlier submitted by another student.
If you are confused about the question of plagiarism, or if you need to check whether you have plagiarised materials, please talk with your tutor, your lecturer, or the leader of your programme. They are here to help and advise you. It is always better to check your work actively than to make mistakes that could have been easily rectified.
All work submitted for this course may be checked with plagiarism identification tools. If work is submitted that contains multiple plagiarised passages and/or more extensive evidence of direct, unattributed citation of sentences then that piece of work will be referred to a committee, which will conduct a formal investigation.
AI Usage
In general, along with the University, I encourage students to explore and use Generative AI (GenAI) tools to enhance their learning. However, it is essential that you first develop a foundational understanding of the course material before relying on AI assistance. Unless specifically permitted, you are discouraged from using AI tools to generate content (code, text, video, audio, images, etc.) that will end up in any student work (assignments, activities, responses, etc.) that is part of your evaluation in this course. If you do use AI tools, you must clearly indicate which portions of your work were generated with AI and explain why you adopted the AI-generated content.
AI-generated content should not exceed 25% of your submission. To promote transparency, critical thinking, and ethical use of AI, we will also discuss the mechanics of these tools during class to better understand their capabilities and limitations.
Remember that academic honesty is central to all coursework. Misuse of AI tools or failure to properly acknowledge their use may result in violations of academic integrity policies. If you are unsure about AI’s role in a particular assignment, please consult with me before submitting your work.